This article describes approaches to computing second-order derivatives with automatic differentiation (AD) based on the forward mode and the propagation of univariate Taylor series. Performance results are given that show the speedup possible with these techniques relative to existing approaches. We also describe a new source transformation AD module for computing second-order derivatives of C and Fortran codes and the underlying infrastructure used to create a language-independent translation tool.
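To make the idea of propagating univariate Taylor series concrete, here is a minimal sketch, not the module described in the abstract. It carries the truncated series c0 + c1·t + c2·t² through each operation, so the second derivative comes out as 2·c2. The names `Taylor2`, `sin`, and `derivatives` are hypothetical and chosen for illustration only; the sketch handles a single scalar input and only addition, multiplication, and sine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Taylor2:
    """Truncated Taylor series c0 + c1*t + c2*t**2 (hypothetical helper type)."""
    c0: float
    c1: float
    c2: float

    def __add__(self, other):
        other = _lift(other)
        return Taylor2(self.c0 + other.c0, self.c1 + other.c1, self.c2 + other.c2)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Cauchy product of the two series, truncated after the t**2 term.
        return Taylor2(
            self.c0 * other.c0,
            self.c0 * other.c1 + self.c1 * other.c0,
            self.c0 * other.c2 + self.c1 * other.c1 + self.c2 * other.c0,
        )

    __rmul__ = __mul__


def _lift(v):
    """Treat plain numbers as constant series."""
    return v if isinstance(v, Taylor2) else Taylor2(float(v), 0.0, 0.0)


def sin(x):
    """Sine of a truncated series, expanded around x.c0 up to t**2."""
    s, c = math.sin(x.c0), math.cos(x.c0)
    return Taylor2(s, c * x.c1, c * x.c2 - 0.5 * s * x.c1 * x.c1)


def derivatives(f, x0):
    """Return (f(x0), f'(x0), f''(x0)) by seeding the series x0 + t."""
    y = f(Taylor2(x0, 1.0, 0.0))
    return y.c0, y.c1, 2.0 * y.c2


if __name__ == "__main__":
    value, first, second = derivatives(lambda x: x * x * x + sin(x), 2.0)
    print(value, first, second)
```

For a function of several variables, the same propagation can be repeated along different directions to recover Hessian entries; that step is left out of this sketch.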
Multithreaded or hybrid von Neumann/dataflow execution models have an advantage over the fine-grain dataflow model in that they significantly reduce the run time overhead incurred by matching. In this paper, we look at two issues related to the evaluation of a coarse-grain dataflow model of execution. The first issue concerns the compilation into a coarse-grain code from a fine-grain one. In this study, the concept of coarse-grain code is captured by clusters, which can be thought of as mini-dataflow graphs which execute strictly, deterministically and without blocking. We look at two bottom-up algorithms, the basic block and the dependence sets methods, to partition dataflow graphs into clusters. The second issue is the actual performance of the cluster-based execution as several architecture parameters are varied (e.g. number of processors, matching cost, network latency, etc.). From the extensive simulation data we evaluate (1) the potential speedup over the fine-grain execution and (2) the effects of the various architecture parameters on the coarse-grain execution time, allowing us to draw conclusions on their effectiveness. The results indicate that even with a simple bottom-up algorithm for generating clusters, cluster execution offers a good speedup over the fine-grain execution over a wide range of architectures. They also indicate that coarse-grain execution is scalable and tolerates network latency and high matching cost well; it can benefit from a higher output bandwidth of a processor; and finally, a simple superscalar processor with an issue rate of two is sufficient to exploit the internal parallelism of a cluster.