Various other features in the coprocessor, such as the debug features required to validate and debug the hardware, will not be covered in this book.
Two key hardware features that dictate the performance of technical computing applications on Intel Xeon Phi are the vector processing unit and the instruction set implemented in this architecture. The vector processing unit (VPU) in Xeon Phi provides data parallelism at a very fine grain, operating on 512 bits at a time: 16 single-precision floats or 16 32-bit integers. The VPU implements a novel instruction set architecture (ISA), with 218 new instructions compared with those implemented in the Xeon family of SIMD instruction sets.
Xeon Phi Vector Microarchitecture
Physically, the VPU is an extension to the P54C core and communicates with the core to execute the VPU ISA implemented in Xeon Phi. The VPU receives its instructions from the core arithmetic logic unit (ALU) and receives its data from the L1 cache over a dedicated 512-bit bus. The VPU has its own dependency logic and signals the core to stall when necessary. The VPU is fully pipelined and can execute most instructions with four-cycle latency and single-cycle throughput. It can read or write one vector per cycle from or to the vector register file or data cache. Each vector can contain 16 single-precision floating-point or 32-bit integer elements, or eight double-precision floating-point or 64-bit integer elements. The VPU can do one load and one operation in the same cycle.
VPU instructions are ternary-operand instructions with two sources and a destination (which can also act as a source for fused multiply-and-add instructions). This configuration provides approximately a 20-percent gain in performance over traditional binary-operand SIMD instructions. Owing to the simplified design, VPU instructions cannot generate exceptions; instead, they set MXCSR flags to indicate exception conditions such as overflow or underflow. A VPU instruction is considered retired when the core sends it to the VPU.
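The lane counts and the ternary-operand form described above can be illustrated with a small scalar emulation. This is only a sketch: the names `lanes` and `fma_inplace` are hypothetical, the operand order is chosen for illustration rather than taken from the ISA, and plain Python rounds after the multiply and again after the add, whereas a hardware fused multiply-add rounds once.

```python
VECTOR_BITS = 512  # vector width stated in the text


def lanes(element_bits):
    """Number of elements of the given width that fit in one 512-bit vector."""
    return VECTOR_BITS // element_bits


def fma_inplace(dst, src1, src2):
    """Emulate a ternary-operand multiply-add, lane by lane.

    The destination also acts as a source: dst[i] = src1[i] * src2[i] + dst[i].
    Illustrative only; this is not fused (two roundings instead of one).
    """
    for i in range(len(dst)):
        dst[i] = src1[i] * src2[i] + dst[i]
    return dst


sp_lanes = lanes(32)  # single-precision floats or 32-bit integers
dp_lanes = lanes(64)  # double-precision floats or 64-bit integers

# One emulated vector instruction over all single-precision lanes.
acc = [1.0] * sp_lanes
fma_inplace(acc, [2.0] * sp_lanes, [3.0] * sp_lanes)
```

Because the destination doubles as the accumulator, no separate register copy is needed before the multiply-add, which is the kind of saving the ternary-operand form is meant to provide.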
Underneath, each VPU consists of eight master ALUs, each containing two single-precision (SP) ALUs and one double-precision (DP) ALU with independent pipelines, thus allowing sixteen SP and eight DP vector operations. Each master ALU has access to a read-only memory (ROM) containing a lookup table for transcendental functions, constants that the master ALU needs, and so forth. Each VPU has 128 512-bit vector registers divided among the threads, providing 32 registers per thread. These are hard-partitioned. There are eight 16-bit mask registers per thread, which are part of the vector register file. The mask registers act as a per-element filter and thus allow you to control which of the 16 32-bit elements are active during a computation. For double precision, only the bottom eight mask bits are used. Most VPU instructions are issued from the core through the U-pipe. Some instructions can be issued from the V-pipe and paired to execute at the same time as VPU instructions in the U-pipe.
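The mask and register-partitioning arithmetic above can be sketched in a few lines. The function names are hypothetical, and the choice to leave masked-off lanes of the destination unchanged is an assumption made for this illustration; the text only says that the mask selects which elements are active.

```python
TOTAL_VECTOR_REGS = 128    # vector registers per VPU, from the text
REGS_PER_THREAD = 32       # hard-partitioned share per thread, from the text
threads_per_core = TOTAL_VECTOR_REGS // REGS_PER_THREAD


def effective_mask(mask, num_lanes):
    """Keep only the low num_lanes bits: 16 for 32-bit elements, 8 for 64-bit."""
    return mask & ((1 << num_lanes) - 1)


def masked_add(dst, a, b, mask):
    """Add a and b lane by lane where the mask bit is set.

    Assumption for this sketch: inactive lanes keep their old dst value.
    """
    return [
        a[i] + b[i] if (mask >> i) & 1 else dst[i]
        for i in range(len(dst))
    ]


mask = 0xA005  # bits 0, 2, 13 and 15 set

# Single precision: all 16 mask bits apply.
result_sp = masked_add([0.0] * 16, [1.0] * 16, [2.0] * 16,
                       effective_mask(mask, 16))

# Double precision: only the bottom eight bits apply, so lanes 0 and 2.
dp_mask = effective_mask(mask, 8)
result_dp = masked_add([0.0] * 8, [1.0] * 8, [2.0] * 8, dp_mask)
```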
Developing an accurate and reliable time-averaged beach profile evolution model under normal and storm conditions is a challenging task. Over the last few decades, a number of beach deformation models have been developed under limited experimental conditions and uncertainties, and some of them required a long computation time. A large amount of wave, current, sediment and beach profile data is available today. The present study develops a simple two-dimensional beach profile evolution model with on-offshore sand bar formation under non-storm and storm conditions, based on the time-averaged suspended sediment concentration models of Jayaratne & Shibayama [2007] and Jayaratne et al. [2011]. These models were formulated for computing sediment concentration inside and outside the surf zone under three different mechanisms: 1) suspension due to turbulent motion over sand ripples, 2) suspension from the sheet flow layer, and 3) suspension due to turbulent motion under breaking waves. The suspended load is calculated as the product of the time-averaged sediment concentration and the undertow velocity from the edge of the wave boundary layer to the wave trough, and the mass transport velocity from the wave trough to the crest (bore-like wave region). Sediment transport in the wave boundary layer is computed from the modified Watanabe [1982] model. The Rattanapitikon and Shibayama [1998] wave model is used to calculate the average rate of energy dissipation due to wave breaking. The beach deformation is calculated from the conservation of sediment mass, while the avalanching concept of Larson and Kraus [1989] is used to redistribute the sediment mass among neighbouring grid cells for a steady solution.
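The last two steps, the bed update from sediment mass conservation and the avalanching redistribution, can be sketched as follows. This is a minimal illustration and not the authors' formulation: it assumes the common form dz/dt = -(1/(1-p)) dq/dx with a porosity p, an explicit time step, and a simplified slope limiter that moves half of the excess height between neighbours, which stands in for, but is not, the Larson and Kraus [1989] scheme. All names and parameter values are hypothetical.

```python
def update_bed(z, q, dx, dt, porosity=0.4):
    """One explicit step of sediment mass conservation.

    z: bed elevation per cell (m); q: transport rate at the len(z)+1 cell
    faces (m^2/s, positive in +x). Assumed form: dz/dt = -(1/(1-p)) dq/dx.
    """
    factor = dt / ((1.0 - porosity) * dx)
    return [z[i] - factor * (q[i + 1] - q[i]) for i in range(len(z))]


def avalanche(z, dx, max_slope, max_iter=1000):
    """Redistribute sediment until no neighbouring slope exceeds max_slope.

    Whatever leaves the higher cell enters the lower one, so the total
    sediment volume is conserved.
    """
    z = list(z)
    for _ in range(max_iter):
        changed = False
        for i in range(len(z) - 1):
            diff = z[i] - z[i + 1]
            excess = abs(diff) - max_slope * dx
            if excess > 1e-12:
                move = 0.5 * excess
                if diff > 0:
                    z[i] -= move
                    z[i + 1] += move
                else:
                    z[i] += move
                    z[i + 1] -= move
                changed = True
        if not changed:
            break
    return z


# Converging transport with closed boundaries moves sand from cell 0 to cell 1.
stepped = update_bed([1.0, 1.0], [0.0, 0.001, 0.0], dx=1.0, dt=10.0)

# An over-steep spike relaxes to the limiting slope without losing mass.
relaxed = avalanche([0.0, 0.0, 1.0, 0.0, 0.0], dx=1.0, max_slope=0.3)
```

With zero transport at both boundary faces, the bed update only moves sediment between cells, so the summed elevation is unchanged; the avalanching pass has the same property by construction.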
Published field-scale experimental and natural beach profiles from five high-quality data sources spanning 1983-2009 [Kajima et al., 1983; Kraus and Larson, 1988; Port and Airport Research Institute, Japan, 2005, 2009; Hasan & Takewaka, 2007, 2009; Ruessink et al., 2007] are used to verify the performance of the proposed numerical model. The key feature of this process-based model is that it takes only a couple of minutes on a standard personal computer to simulate the beach profiles of a 2-3 day storm qualitatively at a fairly satisfactory level. The present numerical predictions are found to be no better than the null hypothesis, as the model is still under development. It is believed that the final model will often be of more value to a practical coastal engineer than a very detailed study of hydrodynamics and sediment transport; however, incorporation of swash dynamics, more precise evaluation of offshore sand bar formation, and extension to a longer time scale with precise beach deformation are recommended as the next stage of the model.
Key words: Two-dimensional beach profile evolution model; time-averaged suspended sediment concentration; on-offshore sand bar formation; average rate of energy dissipation; conservation of sediment mass; avalanching concept; field-scale experimental and natural bea...