2015
DOI: 10.1145/2858788.2688513

A framework for practical parallel fast matrix multiplication

Abstract: Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and Strassen's fast algorithm on modest problem sizes and shapes. Furthermore, we show that the best choice of fast algorithm depends not only on the size of the matrices but also on their shape. We develop a code generation tool to automatically implement multiple sequential and shared-memory parallel…
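
For context, the "fast algorithms" in question are Strassen-like methods that trade extra block additions for fewer block multiplications. The following sketch (a plain NumPy illustration, not the paper's code generation tool) shows a single level of Strassen's recursion, which forms C = A·B from 7 block products instead of the classical 8; it assumes square matrices of even dimension.

import numpy as np

def strassen_step(A, B):
    # One level of Strassen's recursion: 7 block multiplications instead of 8.
    # Illustrative only; assumes square matrices with even dimension.
    n = A.shape[0]
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The seven Strassen products.
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    # Recombine the products into the four blocks of C = A @ B.
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

# Quick sanity check against the classical product.
A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
assert np.allclose(strassen_step(A, B), A @ B)

Applied recursively, this uses O(n^2.81) arithmetic instead of O(n^3); the paper's point is that many other fast algorithms of this kind exist and that their relative performance depends on both matrix size and shape.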

Cited by 50 publications (82 citation statements)
References 41 publications

“…We also use heuristics in order to encourage sparsity in the solutions, and a fortunate by-product of the sparsity is that the nonzero values often tend towards a discrete set of values from which an exact decomposition can be recognized. Our methods are based on techniques that have proved successful in discovering generic exact decompositions (those that have no noticeable symmetries) [1,22]; we summarize this approach in Section 3.1. Our search process can be divided into two phases.…”
Section: Discussion on the Numerical Methods Used
confidence: 99%
“…We discuss effective choices for the regularization parameters in Section 3.3. This method works for the cases of non-square matrix multiplication and has been used to discover exact rank decompositions for many small cases [1,22], but it does not encourage solutions to reflect any symmetries.…”
Section: Discussion on the Numerical Methods Used
confidence: 99%
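
The two statements above refer to a numerical search for low-rank decompositions of the matrix multiplication tensor, where a rank-R decomposition yields a fast algorithm using R multiplications. As a toy illustration of that idea only (it does not reproduce the cited authors' method, regularization schedule, or sparsity heuristics), the sketch below runs ridge-regularized alternating least squares on the 2x2x2 matrix multiplication tensor, whose rank of 7 corresponds to Strassen's algorithm; the rank, regularization weight, and iteration count are arbitrary, and whether the residual reaches zero depends on the random start.

import numpy as np

def matmul_tensor(n):
    # The <n,n,n> matrix multiplication tensor: T[(i,j),(k,l),(m,p)] = 1
    # exactly when j == k, m == i, and p == l.
    T = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for l in range(n):
                T[i * n + j, j * n + l, i * n + l] = 1.0
    return T

def khatri_rao(X, Y):
    # Column-wise Kronecker product of X and Y.
    return np.vstack([np.kron(X[:, r], Y[:, r]) for r in range(X.shape[1])]).T

def als_search(n=2, rank=7, lam=1e-3, iters=2000, seed=0):
    # Ridge-regularized alternating least squares for a rank-`rank` CP
    # decomposition T[i,j,k] ~= sum_r U[i,r] V[j,r] W[k,r].
    rng = np.random.default_rng(seed)
    T = matmul_tensor(n)
    N = n * n
    T1 = T.reshape(N, N * N)                     # mode-1 unfolding
    T2 = T.transpose(1, 0, 2).reshape(N, N * N)  # mode-2 unfolding
    T3 = T.transpose(2, 0, 1).reshape(N, N * N)  # mode-3 unfolding
    U, V, W = (rng.standard_normal((N, rank)) for _ in range(3))
    reg = lam * np.eye(rank)
    for _ in range(iters):
        # Each factor update is a ridge-regularized least-squares solve.
        M = khatri_rao(V, W)
        U = np.linalg.solve(M.T @ M + reg, M.T @ T1.T).T
        M = khatri_rao(U, W)
        V = np.linalg.solve(M.T @ M + reg, M.T @ T2.T).T
        M = khatri_rao(U, V)
        W = np.linalg.solve(M.T @ M + reg, M.T @ T3.T).T
    residual = np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', U, V, W))
    return U, V, W, residual

The quoted work additionally encourages sparsity so that near-discrete numerical solutions can be rounded and verified as exact decompositions; none of that is attempted here.
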
“…Ballard, Demmel, Holtz, Lipshitz, and Schwartz [15], Ballard, Demmel, Holtz, and Schwartz [16], and Lipshitz, Ballard, Demmel, and Schwartz [77]. Recent engineering work includes Benson and Ballard [23] and Huang, Rice, Matthews, and van de Geijn [65]. Our work differs from these works in that we seek a self-contained proof-of-concept demonstration requiring good-performance finite-field matrix multiplication on GPUs, but we do not necessarily seek the most optimized possible implementation; such optimizations are left for future work.…”
Section: Fast Matrix Multiplication
confidence: 99%
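
The citing work above needs matrix multiplication over a finite field rather than over the reals. Purely as a point of reference (it is not the GPU implementation the authors describe), the sketch below multiplies matrices over F_p in NumPy using 64-bit integer accumulation and a single reduction mod p per output entry.

import numpy as np

def matmul_mod_p(A, B, p=2):
    # Matrix product over the finite field F_p.
    # Delayed reduction: the int64 dot products cannot overflow as long as
    # the inner dimension times (p - 1)**2 stays below 2**63.
    A = np.asarray(A, dtype=np.int64) % p
    B = np.asarray(B, dtype=np.int64) % p
    return (A @ B) % p

# Over F_2, addition is XOR and multiplication is AND.
A = np.array([[1, 0], [1, 1]])
B = np.array([[1, 1], [0, 1]])
print(matmul_mod_p(A, B))  # [[1 1]
                           #  [1 0]]
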
“…When q = 2, we use F2 and {0, 1} interchangeably throughout the paper. In fact, error correcting codes, as well as constructing new codes out of existing codes by concatenations to be discussed shortly, can be defined more generally over an arbitrary set of q distinct elements called the alphabet of the code.…”
confidence: 99%