This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks, and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of the IEEE 754 single-precision format (FP32), and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence; e.g., IEEE 754 compliant half-precision floating point (FP16) requires hyper-parameter tuning. In this paper, we discuss the flow of tensors and various key operations in mixed precision training, and delve into details of operations, such as the rounding modes for converting FP32 tensors to BFLOAT16. We have implemented a method to emulate BFLOAT16 operations in TensorFlow, Caffe2, IntelCaffe, and Neon for our experiments. Our results show that deep learning training using BFLOAT16 tensors achieves the same state-of-the-art (SOTA) results across domains as FP32 tensors in the same number of iterations and with no changes to hyper-parameters.
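Because a BFLOAT16 value is simply the upper 16 bits of an FP32 value (sign, the full 8-bit exponent, and the top 7 mantissa bits), conversion can be sketched as a bit-level truncation with rounding. The snippet below is a minimal illustration of the round-to-nearest-even mode mentioned above, not the paper's implementation; it ignores NaN/Inf special cases and keeps the result stored in FP32 for simplicity.

```python
import numpy as np

def fp32_to_bf16_rne(x):
    """Round FP32 values to the nearest BFLOAT16 value (round-to-nearest-even),
    returning the result re-expanded to FP32 for easy inspection."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    # Add 0x7FFF plus the LSB of the surviving mantissa: dropped bits below
    # half round down, above half round up, and exact ties round to even.
    lsb = (bits >> 16) & np.uint32(1)
    rounded = bits + np.uint32(0x7FFF) + lsb
    # Keep only the top 16 bits (sign + 8-bit exponent + 7-bit mantissa).
    return (rounded & np.uint32(0xFFFF0000)).view(np.float32)

# 1 + 2^-7 is exactly representable in BFLOAT16 and survives unchanged;
# 1 + 2^-8 is a tie and rounds down to even, i.e. to 1.0.
print(fp32_to_bf16_rne([1.0078125, 1.00390625]))
```

Truncation (simply masking with `0xFFFF0000`) is the other rounding mode one might consider; round-to-nearest-even halves the worst-case error and is what the abstract's discussion of rounding modes refers to.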
Virtualization is one of the key techniques used in cloud computing environments to achieve scalability and flexibility. To cope with the large number of virtual machines (VMs) involved in the cloud, several solutions have been proposed to automatically monitor and deploy VMs in resource pools. Most cloud management systems, such as Amazon EC2, are proprietary and are not generally available for research. Consequently, several open-source VM-based cloud platforms have been released for general research use. Existing work on VM-based cloud management platforms has mainly focused on discussion of architecture, feature sets, and performance analysis. However, other important aspects, such as formal analysis, modeling, and verification, are usually ignored. In this paper, we provide a formal analysis, modeling, and verification of three open-source state-of-the-art VM-based cloud management platforms: 1) Eucalyptus, 2) OpenNebula, and 3) Nimbus. We have used high-level Petri nets (HLPN) to model and analyze the structural and behavioral properties of the systems. Moreover, to verify the models, we have used the Satisfiability Modulo Theories Library (SMT-Lib) and the Z3 solver. We modeled approximately 100 VMs to verify the correctness and feasibility of our models. The results reveal that the models function correctly. Moreover, increasing the number of VMs does not affect the working of the models, which indicates the practicability of the models in a highly scalable and flexible environment.
We show how to automatically verify that a complex XScale-like pipelined machine model is a WEB-refinement of an instruction set architecture model, which implies that the machines satisfy the same safety and liveness properties. Automation is achieved by reducing the WEB-refinement proof obligation to a formula in the logic of Counter arithmetic with Lambda expressions and Uninterpreted functions (CLU). We use UCLID to transform the resulting CLU formula into a CNF formula, which is then checked with a SAT solver. We define several XScale-like models with out-of-order completion, including models with precise exceptions, branch prediction, and interrupts. We use two types of refinement maps. In one, flushing is used to map pipelined machine states to instruction set architecture states; in the other, we use the commitment approach, which is the dual of flushing, since partially completed instructions are invalidated. We present experimental results for all the machines modeled, including verification times. For our application, we found that the SAT solver Siege provides superior performance over Chaff, and that the amount of time spent proving liveness when using the commitment approach is less than 1% of the overall verification time, whereas when flushing is employed, the liveness proof accounts for about 10% of the verification time.
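The final step of the pipeline above hands a propositional CNF formula to a SAT solver: the refinement obligation holds exactly when the formula encoding its negation is unsatisfiable. As a toy stand-in for Siege or Chaff, the following sketch checks satisfiability of a DIMACS-style CNF by brute force; it is an illustration of what "checked with a SAT solver" means, not a practical solver.

```python
from itertools import product

def sat(cnf, n_vars):
    """Brute-force satisfiability check for a CNF formula.

    `cnf` is a list of clauses; each clause is a list of non-zero ints
    in DIMACS style: k means variable k is true, -k means it is false.
    Returns True iff some assignment satisfies every clause.
    """
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in cnf):
            return True
    return False

# (x1 or x2) and (not x1) and (not x2) is unsatisfiable;
# (x1 or not x2) and (x2) is satisfied by x1 = x2 = True.
print(sat([[1, 2], [-1], [-2]], 2))   # False
print(sat([[1, -2], [2]], 2))         # True
```

In the verification flow described above, an unsatisfiable formula corresponds to a successful proof, while a satisfying assignment decodes to a concrete counterexample trace of the pipelined machine.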