2022
DOI: 10.3389/fams.2022.830270
Accelerating Jackknife Resampling for the Canonical Polyadic Decomposition

Abstract: The Canonical Polyadic (CP) tensor decomposition is frequently used as a model in applications in a variety of fields. Using jackknife resampling to estimate parameter uncertainties is often desirable but increases the already high computational cost. Upon observing that the resampled tensors, though different, are nearly identical, we show that it is possible to extend the recently proposed Concurrent ALS (CALS) technique to a jackknife resampling scenario. This extension gives acc…
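To make the abstract concrete, here is a minimal sketch of plain leave-one-out jackknife resampling for an arbitrary estimator. This is not the paper's CALS extension; it only illustrates the baseline procedure whose replicates are "nearly identical", which is the property the CALS approach exploits by batching them.

```python
import numpy as np

def jackknife(data, estimator):
    """Leave-one-out jackknife: returns (full-sample estimate, standard error)."""
    n = len(data)
    theta_full = estimator(data)
    # Each replicate drops exactly one sample, so the n resampled data sets
    # differ only slightly from one another -- the observation that motivates
    # processing them concurrently in the paper's CALS-based approach.
    replicates = np.array([
        estimator(np.delete(data, i, axis=0)) for i in range(n)
    ])
    theta_dot = replicates.mean(axis=0)
    se = np.sqrt((n - 1) / n * np.sum((replicates - theta_dot) ** 2, axis=0))
    return theta_full, se

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)
est, se = jackknife(data, np.mean)
```

For the mean, the jackknife standard error reduces exactly to the usual s/√n; for a CP decomposition, `estimator` would instead fit factor matrices, which is why naive jackknifing multiplies the already high cost of the decomposition by the sample count.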

Cited by 1 publication (2 citation statements)
References 28 publications
“…The elements of the arrays used to perform the benchmark calculations were randomly generated. The larger standard deviation for smaller problem sizes and very short time measurements on the order of 10⁻² are often attributed to small changes in the workload of the operating system. Only 0.1% of the calculation time is spent on host-to-device transfer; therefore, we do not consider PCIe a bottleneck.…”
Section: Numerical Results
confidence: 99%
“…The larger standard deviation for smaller problem sizes and very short time measurements on the order of 10⁻² are often attributed to small changes in the workload of the operating system [73]. Only 0.1% of the calculation time is spent on host-to-device transfer; therefore, we do not consider PCIe a bottleneck. All in all, we observe an order of magnitude reduction in computing time for the 3 bottleneck tensor contractions encountered in the CCSD working equations when using the CuPy implementation on an NVIDIA Tesla V100S PCIe 32GB (rev 1a) compared to our NumPy implementation on 36 CPU cores.…”
Section: Numerical Results
confidence: 99%
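The citing work's CuPy-vs-NumPy comparison rests on CuPy being a near drop-in replacement for NumPy, so the same contraction code can run on CPU or GPU. The sketch below illustrates this pattern with an illustrative dense contraction; the index structure and the `contract` helper are assumptions for demonstration, not the CCSD working equations from the cited paper.

```python
import numpy as np
# import cupy as cp  # CuPy mirrors the NumPy API; cp.einsum takes the same arguments

def contract(t2, eris, xp=np):
    """One dense contraction of the kind that dominates CCSD-like workloads.

    Passing xp=np runs on the CPU; passing xp=cp (with CuPy installed)
    runs the identical code on the GPU. Indices here are illustrative.
    """
    # t2[i,j,a,b] * eris[a,b,c,d] -> out[i,j,c,d]
    return xp.einsum("ijab,abcd->ijcd", t2, eris)

rng = np.random.default_rng(1)
t2 = rng.standard_normal((4, 4, 6, 6))
eris = rng.standard_normal((6, 6, 6, 6))
out = contract(t2, eris)  # NumPy on CPU
# out_gpu = contract(cp.asarray(t2), cp.asarray(eris), xp=cp)  # CuPy on GPU
```

The `xp` parameter is the common array-module idiom for writing backend-agnostic code; the host-to-device copies (`cp.asarray`) correspond to the PCIe transfer that the citing authors measured at 0.1% of total runtime.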