Zhangcheng Zheng scite author profile

Quantization is one of the key techniques used to make Neural Networks (NNs) faster and more energy efficient. However, current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values. This hidden cost limits the latency improvement realized by quantizing NNs. To address this, we present HAWQ-V3, a novel dyadic quantization framework. The contributions of HAWQ-V3 are the following. (i) The entire inference process consists of only integer multiplication, addition, and bit shifting in INT4/8 mixed precision, without any floating point operations/casting or even integer division. (ii) We pose the mixed-precision quantization as an integer linear programming problem, where the bit precision setting is computed to minimize model perturbation, while observing application-specific constraints on memory footprint, latency, and BOPS. (iii) To verify our approach, we develop the first open-source 4-bit mixed-precision quantization in TVM, and we directly deploy the quantized models to T4 GPUs using only the Turing Tensor Cores. We observe an average speedup of 1.45× for uniform 4-bit, as compared to uniform 8-bit, precision for ResNet50. (iv) We extensively test the proposed dyadic quantization approach on multiple different NNs, including ResNet18/50 and InceptionV3, for various model compression levels with/without mixed precision. For instance, we achieve an accuracy of 78.50% with dyadic INT8 quantization, which is more than 4% higher than prior integer-only work for InceptionV3. Furthermore, we show that mixed-precision INT4/8 quantization can be used to achieve higher speedups, as compared to INT8 inference, with minimal impact on accuracy. For example, for ResNet50 we can reduce INT8 latency by 23% with mixed precision and still achieve 76.73% accuracy. Our framework and the TVM implementation have been open sourced [1].
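The integer-only rescaling mentioned in contribution (i) can be illustrated with a small sketch. It assumes the usual meaning of a dyadic number, a rational of the form b / 2^c with integer b and c, so that multiplying by a real-valued scale becomes one integer multiply and one right shift. The function names `to_dyadic` and `requantize` and the default of 16 shift bits are hypothetical choices for illustration and are not taken from the HAWQ-V3 code base.

```python
from typing import Tuple


def to_dyadic(scale: float, shift_bits: int = 16) -> Tuple[int, int]:
    """Approximate a positive real scale as b / 2**c with integers b and c.

    This conversion happens once, offline; inference then uses only b and c.
    """
    b = round(scale * (1 << shift_bits))
    return b, shift_bits


def requantize(acc: int, b: int, c: int) -> int:
    """Rescale an integer accumulator with multiply, add, and shift only.

    Adding 2**(c - 1) before the shift rounds to nearest instead of
    always rounding toward negative infinity.
    """
    return (acc * b + (1 << (c - 1))) >> c


# Example: a scale of 0.25 is exactly representable as 16384 / 2**16.
b, c = to_dyadic(0.25)
result = requantize(100, b, c)  # integer stand-in for 100 * 0.25
```

Because the shift replaces a division, no floating point value or integer divide appears at inference time; the only approximation is the rounding of the scale to the nearest multiple of 2^-c.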

Distance minimizing based data‐driven computational method for the finite deformation of hyperelastic materials

Zheng, Zhang, et al. 2023

Numerical Meth Engineering

In this work, distance-minimizing data-driven solvers are developed for the finite deformation analysis of three-dimensional (3D) compressible and nearly incompressible hyperelastic materials. The data-driven solvers bypass the construction of a constitutive equation for the hyperelastic materials by working directly with a dataset of Green-Lagrange strain and second Piola-Kirchhoff stress pairs. They recast the boundary-value problems as distance-minimization problems subject to basic kinematical and mechanical constraints. Moreover, the deviatoric/volumetric split of stress and an additional incompressibility constraint are introduced into the solver for the nearly incompressible hyperelastic materials. Several representative three-dimensional examples are presented, and the results demonstrate the capability and robustness of the proposed data-driven solvers.
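The core step of a distance-minimizing data-driven solver, finding the dataset point closest to a trial strain-stress state, can be sketched for a scalar (one-dimensional) case. The abstract does not state the distance measure; the energy-like form below, with a weighting stiffness `C`, is the one commonly used in the data-driven computational mechanics literature and is an assumption here. The function name `nearest_data_point` is hypothetical, and a full solver would alternate this search with a projection onto the states that satisfy compatibility and equilibrium, which is not shown.

```python
def nearest_data_point(strain, stress, dataset, C=1.0):
    """Return (index, squared distance) of the closest (strain, stress) pair.

    Assumed metric: d^2 = 0.5 * C * (d_strain)^2 + 0.5 * (d_stress)^2 / C,
    where C is a positive numerical weighting parameter, not a material law.
    """
    best_index, best_dist2 = -1, float("inf")
    for i, (eps_i, sig_i) in enumerate(dataset):
        dist2 = 0.5 * C * (eps_i - strain) ** 2 + 0.5 * (sig_i - stress) ** 2 / C
        if dist2 < best_dist2:
            best_index, best_dist2 = i, dist2
    return best_index, best_dist2


# Example: three material data points and one trial state at an integration point.
data = [(0.0, 0.0), (0.1, 1.0), (0.2, 2.0)]
index, dist2 = nearest_data_point(0.11, 0.9, data, C=10.0)
```

In the three-dimensional setting of the paper, the scalars would be replaced by strain and stress tensors, and the brute-force loop would typically give way to a faster nearest-neighbor search.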

MULTI-LEVEL K-d TREE-BASED DATA-DRIVEN COMPUTATIONAL METHOD FOR THE DYNAMIC ANALYSIS OF MULTI-MATERIAL STRUCTURES

Zheng, Zhang, et al. 2020

Int J Mult Comp Eng

Distance Minimizing-Based Data-Driven Computational Plasticity Method with Fixed Dataset

Zheng, Zhang, et al. 2022

Int. J. Appl. Mechanics

A data-driven computational plasticity method based on the distance minimizing framework is proposed in this paper. In this method, the internal variables in conventional plasticity are abandoned and a fixed dataset considering path-dependent behaviors of materials is constructed. With the fixed dataset, a stress correspondence method is developed to compute the plastic strain of every integration point at each load step, and a data-driven classification model for yielding is constructed to rapidly determine the yield status of each point in the method. Moreover, a symmetric mapping method is developed to accurately determine the stress–strain state of the integration point under unloading or inverse loading conditions. Several representative examples are presented to show the capability of the proposed method. Numerical results of two- and three-dimensional truss structures and three-dimensional continuum bodies demonstrate the high efficiency and accuracy of the proposed data-driven computational plasticity method.
