Abstract: Random forest classification is a well-known machine learning technique that generates classifiers in the form of an ensemble ("forest") of decision trees. The classification of an input sample is determined by the majority vote of the ensemble. Traditional random forest classifiers can be highly effective, but classification using a random forest is memory bound and not typically suitable for acceleration using FPGAs or GP-GPUs due to the need to traverse large, possibly irregular decision trees. Recent work at Lawrence Livermore National Laboratory has developed several variants of random forest classifiers, including the Compact Random Forest (CRF), that can generate decision trees more suitable for acceleration than traditional decision trees. Our paper compares and contrasts the effectiveness of FPGAs, GP-GPUs, and multi-core CPUs for accelerating classification using models generated by compact random forest machine learning classifiers. Taking advantage of training algorithms that can produce compact random forests composed of many small trees rather than fewer deep trees, we are able to regularize the forest such that the classification of any sample takes a deterministic amount of time. This optimization then allows us to execute the classifier in a pipelined or single-instruction, multiple-thread (SIMT) fashion. We show that FPGAs provide the highest-performance solution, but require a multi-chip / multi-board system to execute even modest-sized forests. GP-GPUs offer a more flexible solution with reasonably high performance that scales with forest size. Finally, multi-threading via OpenMP on a shared-memory system was the simplest solution and provided near-linear performance scaling with core count, but was still significantly slower than the GP-GPU and FPGA.
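The idea of a regularized forest with deterministic classification time can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes every tree is a complete binary tree of the same fixed depth, stored in heap order in flat arrays, and all names (`classify`, `feature`, `threshold`, `leaf_class`) are hypothetical. Because every tree takes exactly `depth` comparison steps, the loop has a fixed trip count and all trees advance in lockstep, which is the property that makes pipelined or SIMT execution natural.

```python
import numpy as np


def classify(sample, feature, threshold, leaf_class, depth, n_classes):
    """Classify one sample with a forest of complete, fixed-depth trees.

    Assumed layout (illustrative only):
      feature[t, i]    feature index tested at internal node i of tree t
      threshold[t, i]  split threshold at internal node i of tree t
      leaf_class[t, j] class label stored at leaf j of tree t
    Internal nodes are in heap order: children of i are 2*i+1 and 2*i+2.
    """
    n_trees = feature.shape[0]
    trees = np.arange(n_trees)
    node = np.zeros(n_trees, dtype=np.int64)  # every tree starts at its root

    # Fixed trip count: each tree performs exactly `depth` comparisons,
    # so classification time does not depend on the sample.
    for _ in range(depth):
        go_right = sample[feature[trees, node]] > threshold[trees, node]
        node = 2 * node + 1 + go_right  # left child if False, right if True

    leaf = node - (2**depth - 1)  # convert heap index to leaf index
    votes = np.bincount(leaf_class[trees, leaf], minlength=n_classes)
    return int(np.argmax(votes))  # majority vote of the ensemble


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth, n_trees, n_features, n_classes = 3, 16, 8, 4
    feature = rng.integers(0, n_features, size=(n_trees, 2**depth - 1))
    threshold = rng.random((n_trees, 2**depth - 1))
    leaf_class = rng.integers(0, n_classes, size=(n_trees, 2**depth))
    x = rng.random(n_features)
    print(classify(x, feature, threshold, leaf_class, depth, n_classes))
```

In this layout the per-level update is identical for every tree, so the inner step maps to one thread per tree on a GPU or one pipeline stage per level on an FPGA; irregular, deep trees would not allow this.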
The Hybrid Memory Cube (HMC) is an early commercial product embodying attributes of future stacked DRAM architectures, namely large capacity, high bandwidth, an on-package memory controller, and a high-speed serial interface. We study the performance and energy of a Gen2 HMC on data-centric workloads through a combination of emulation and execution on an HMC FPGA board. An in-house FPGA emulator has been used to obtain memory traces for a small collection of data-centric benchmarks. Our FPGA emulator is based on a 32-bit ARM processor and non-intrusively captures complete memory access traces at only a 20x slowdown from real time. We have developed tools to run combined trace fragments from multiple benchmarks on the HMC board, giving a unique capability to characterize HMC performance and power usage under a data-parallel workload. We find that the HMC's separate read and write channels are not well exploited by read-dominated data-centric workloads. Our benchmarks achieve between 66% and 80% of peak bandwidth (80 GB/s for 32-byte packets with a 50-50 read/write mix) on the HMC, suggesting that combined read/write channels might show higher utilization on these access patterns. Bandwidth scales linearly up to saturation with increased demand on highly concurrent application workloads with many independent memory requests. There is a corresponding increase in latency, ranging from 80 ns under an extremely light load to 130 ns at high bandwidth.
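As a back-of-the-envelope reading of the figures quoted above, the sketch below converts the utilization range into absolute bandwidth and applies Little's law (requests in flight = throughput × latency) to estimate how much memory-level parallelism such a load implies. The function names are hypothetical, and pairing the 130 ns latency with the upper bandwidth figure is an assumption made here for illustration, not a measurement reported in the abstract.

```python
PEAK_GBPS = 80.0    # peak bandwidth quoted for 32-byte packets, 50-50 mix
PACKET_BYTES = 32   # packet size quoted alongside the peak figure


def achieved_bandwidth(fraction, peak_gbps=PEAK_GBPS):
    """Absolute bandwidth in GB/s for a given fraction of peak."""
    return fraction * peak_gbps


def outstanding_requests(bandwidth_gbps, latency_ns, packet_bytes=PACKET_BYTES):
    """Little's law estimate of requests in flight.

    GB/s times ns gives bytes (1e9 * 1e-9 = 1), so no unit factor is needed.
    """
    return bandwidth_gbps * latency_ns / packet_bytes


if __name__ == "__main__":
    low = achieved_bandwidth(0.66)
    high = achieved_bandwidth(0.80)
    print(f"achieved bandwidth: {low:.1f} to {high:.1f} GB/s")
    # Assumption: the 130 ns latency coincides with the upper bandwidth figure.
    print(f"requests in flight: {outstanding_requests(high, 130.0):.0f}")
```

Under these assumptions, sustaining the upper end of the range requires on the order of a few hundred independent 32-byte requests in flight, which is consistent with the abstract's remark that bandwidth scales with demand only for highly concurrent workloads.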
Crystal x-ray imaging is frequently used in inertial confinement fusion and laser-plasma interaction applications, as it has advantages compared to pinhole imaging, such as higher signal throughput, better achievable spatial resolution, and chromatic selection. However, currently used x-ray detectors are only able to obtain a single time-resolved image per crystal. The dilation aided single-line-of-sight x-ray camera described here, designed for the National Ignition Facility (NIF), combines two recent diagnostic developments: the pulse dilation principle used in the dilation x-ray imager (DIXI) and a ns-scale multi-frame camera that uses a hold-and-readout circuit for each pixel (hCMOS). This enables multiple images to be taken from a single line of sight with high spatial and temporal resolution. At the moment, the instrument can record two single-line-of-sight images with spatial and temporal resolution of 35 µm and down to 35 ps, respectively, with a planned upgrade doubling the number of images to four. Here we present the dilation aided single-line-of-sight camera for the NIF, including the x-ray characterization measurements obtained at the COMET laser and the results from the initial timing shot on the NIF.
A major upgrade has been implemented for the ns-gated laser entrance hole imager on the National Ignition Facility (NIF) to obtain high-quality data for hohlraum physics studies. In this upgrade, the single “Furi” hCMOS sensor (1024 × 448 pixel array with two-frame capability) is replaced with dual “Icarus” sensors (1024 × 512 pixel arrays with four-frame capability). Both types of sensors were developed by Sandia National Laboratories for high energy density physics experiments. With the new Icarus sensors, the upgraded diagnostic provides twice the detection area with improved uniformity, wider temporal coverage, flexible timing setup, and greater sensitivity to soft x rays (<2 keV). These features, together with the fact that the diagnostic is radiation hardened and can be operated on the NIF for high neutron yield deuterium–tritium experiments, enable a significantly greater return of data per experiment.