Short floating-point representation for convolutional neural network inference

Kang, Hyeong-Ju

doi:10.1587/elex.15.20180909

Cited by 12 publications

(10 citation statements)

References 15 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…HOBFLOPS generates efficient software emulation parallel FP arithmetic units optimized using hardware synthesis tools. HOBFLOPS investigates reduced complexity FP, [19,17] that is more efficient than fixed-point [21]. HOBFLOPS considers alternative FP formats [23] and register packing with bit-sliced arithmetic [24].…”

Section: Approachmentioning

confidence: 99%

HOBFLOPS for CNNs: Hardware Optimized Bitslice-Parallel Floating-Point Operations for Convolutional Neural Networks

Garland

Gregg

2021

Preprint

View full text Add to dashboard Cite

Low-precision floating-point (FP) can be highly effective for convolutional neural network (CNN) inference. Custom low-precision FP can be implemented in field programmable gate array (FPGA) and application-specific integrated circuit (ASIC) accelerators, but existing microprocessors do not generally support fast, custom precision FP. We propose hardware optimized bitslice-parallel floating-point operators (HOBFLOPS), a generator of efficient custom precision emulated bitslice-parallel software(C/C++) FP arithmetic. We generate custom-precision FP routines, optimized using a hardware synthesis design flow, to create circuits. We provide standard cell libraries matching the bitwise operations on the target microprocessor architecture and a code generator to translate the hardware circuits to bitslice software equivalents. We exploit bitslice parallelism to create a novel, very wide (32—512 element) vectorized CNN convolution for inference. On Arm and Intel processors, the multiply-accumulate (MAC) performance in CNN convolution of HOBFLOPS, Flexfloat, and Berkeley’s SoftFP are compared. HOBFLOPS outperforms Flexfloat by up to 10× on Intel AVX512. HOBFLOPS offers arbitrary-precision FP with custom range and precision, e . g ., HOBFLOPS9, which outperforms Flexfloat 9-bit on Arm Neon by 7×. HOBFLOPS allows researchers to prototype different levels of custom FP precision in the arithmetic of software CNN ac celerators. Furthermore, HOBFLOPS fast custom-precision FP CNNs may be valuable in cases where memory bandwidth is limited.

show abstract

Section: Approachmentioning

confidence: 99%

HOBFLOPS for CNNs: Hardware Optimized Bitslice-Parallel Floating-Point Operations for Convolutional Neural Networks

Garland

Gregg

2021

Preprint

View full text Add to dashboard Cite

show abstract

“…Due to their excellent performance, deep learning methods have gained great interests and are successfully introduced in many fields [20][21][22][23][24]. Stacked autoencoders (SAE) [25] and deep belief network (DBN) [26] are employed to learn features directly from original signals for fault diagnosis of analog circuits.…”

Section: Introductionmentioning

confidence: 99%

Incipient fault diagnosis of analog circuits based on wavelet transform and improved deep convolutional neural network

Yang

Wang

Nie

et al. 2021

IEICE Electron. Express

View full text Add to dashboard Cite

To enhance the reliability of analog circuits in electrical systems, this letter proposes a novel incipient fault diagnosis method by integrating wavelet transform(WT) and improved convolutional neural network. Different from traditional methods, where feature extraction and classification are separately designed and performed, this letter aims to automatically learn fault features and classify the type of faults simultaneously. An improved convolutional neural network named multi-channel compactness convolutional neural network (MC-CNN) is proposed,which can obtain complementary and rich diagnosis information from multi-scale components extracted by wavelet transform. Moreover, we adopt center loss as an auxiliary loss function to maximize the interclass separability and intraclass compactness of samples. The proposed method is fully evaluated with the Sallen-Key bandpass filter circuit and the fouropamp biquad high-pass filter circuit. The experimental results demonstrate that the proposed method is very effective in feature extraction for fault diagnosis, and has higher diagnosis accuracy than other typical fault diagnosis methods.

show abstract

“…The RUL estimation based on the physical failure model is ineffective when dealing with large and complex nonlinear multi-operation equipment systems, while the data-driven method is the leading research direction of RUL estimation in recent years [4]. These data-driven methods involve support vector machines [5,6,7], neural networks [8,9,10] and particle filters(PF) [11,12,13,14,15]. When dealing with nonlinear non-Gaussian noise systems, the PF technique has excellent performance [16].…”

Section: Introductionmentioning

confidence: 99%

A remaining useful life estimation method for solenoid valve based on mmWave radar and auxiliary particle filter technique

Liu

et al. 2021

IEICE Electron. Express

View full text Add to dashboard Cite

Solenoid valves(SVs) as a kernel component are widely used in various control systems. Through accurately predicting the remaining useful life (RUL) of the SV, predictable maintenance its can be implemented to reduce system maintenance costs and system reliability improves. This paper proposes a frequency modulated continuous wave (FMCW) based on millimeter-wave radar and the auxiliary particle filter (APF) technique to estimate the RUL of SVs. Firstly, a 77GHz FMCW millimeter-wave radar with a single input and multiple outputs(SIMO) is used to obtain the core displacements of two SVs with the same frequencies. A degradation experiment is designed, and the distortion degree of the core displacement is taken as the degradation indicator to construct a degradation data set. Secondly, the APF technique is used to estimate the degradation state and RUL of the SV. Finally, an experiment platform is built using TI's millimeter-wave radar to verify the method proposed in this paper. The experimental results show that the RUL estimation results are satisfactory.

show abstract

Short floating-point representation for convolutional neural network inference

Cited by 12 publications

References 15 publications

HOBFLOPS for CNNs: Hardware Optimized Bitslice-Parallel Floating-Point Operations for Convolutional Neural Networks

HOBFLOPS for CNNs: Hardware Optimized Bitslice-Parallel Floating-Point Operations for Convolutional Neural Networks

Incipient fault diagnosis of analog circuits based on wavelet transform and improved deep convolutional neural network

A remaining useful life estimation method for solenoid valve based on mmWave radar and auxiliary particle filter technique

Contact Info

Product

Resources

About