New network services such as the Internet of Things and edge computing are accelerating the growth in traffic volume, the number of connected devices, and the diversity of communication. Next-generation carrier network infrastructure must therefore be far more scalable and adaptive to rapidly increasing and diversifying network demand, at much lower cost. A more virtualization-aware, flexible, and inexpensive system based on general-purpose hardware is necessary to transform the traditional carrier network into a more adaptive, next-generation network. In this paper, we propose an architecture for carrier-scale packet processing based on interleaved three-dimensional (3D) stacked dynamic random access memory (DRAM) devices. The proposed architecture enhances memory access concurrency by leveraging the vault-level parallelism and bank interleaving of 3D-stacked DRAM. It distributes memory requests via a hash function across vault-and-bank sets, each of which holds a portion of the full carrier-scale tables. We introduce an analytical model of the proposed architecture for two traffic patterns: one with random memory request arrivals and one with bursty arrivals. Using the model, we calculate the performance of a typical Internet Protocol routing application as a benchmark of carrier-scale packet processing in which main memory accesses are unavoidable. The evaluation shows that the proposed architecture achieves around 80 Gbps for carrier-scale packet processing under both random and bursty request arrivals.
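The following is a minimal sketch of the hash-based request distribution the abstract describes. The vault/bank counts, the choice of CRC32 as the hash, and the key format are illustrative assumptions, not the paper's actual parameters; the point is only that a uniform hash spreads lookups over independent vault-and-bank pairs so they can be served concurrently.

```python
# Sketch: hash-based distribution of memory requests over 3D-stacked DRAM
# vaults and banks. All parameters below are assumptions for illustration.
import zlib

NUM_VAULTS = 32       # assumed vault count (HMC-like device)
BANKS_PER_VAULT = 8   # assumed banks per vault

def vault_bank_for_key(key: bytes) -> tuple[int, int]:
    """Map a lookup key (e.g. an IP prefix) to a (vault, bank) pair.

    Because each (vault, bank) pair stores only a slice of the full
    carrier-scale table, independent lookups that hash to different
    pairs can proceed in parallel (vault-level parallelism plus bank
    interleaving).
    """
    h = zlib.crc32(key)  # stand-in for the paper's hash function
    index = h % (NUM_VAULTS * BANKS_PER_VAULT)
    return index // BANKS_PER_VAULT, index % BANKS_PER_VAULT

# Example: two lookups landing on distinct (vault, bank) pairs
# can be served concurrently.
print(vault_bank_for_key(b"192.0.2.0/24"))
print(vault_bank_for_key(b"198.51.100.0/24"))
```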
The first stage of the ATLAS Fast TracKer (FTK) is an ATCA-based input interface system, in which hits from the entire silicon tracker are clustered and organized into overlapping η-φ trigger towers before being sent to the tracking engines. First, the FTK Input Mezzanine cards receive hit data and perform clustering to reduce the data volume. The ATCA-based Data Formatter system then organizes the trigger tower data, sharing data among boards over full-mesh backplanes and optical fibers. The board- and system-level design concepts and implementation details, as well as operational experience from FTK full-chain testing, are presented.
Summary
• After the 2013-2014 shutdown, the LHC is expected to deliver an increased instantaneous luminosity, making efficient online selection of rare events more difficult because of increased pile-up (60 or more proton-proton collisions overlapping in the same bunch crossing).
• Real-time tracking information from the Fast TracKer (FTK) system makes possible new trigger selections that are robust against pile-up.
• The first processing step of the FTK is the clustering of strip and pixel data, which reduces the amount of data to be processed by downstream algorithms and provides an accurate estimate of the cluster centroid.
• Clustering means identifying groups of contiguous hits from the inner tracking detector.

FTK Input Mezzanine (FTK IM)
• The FTK IM is the most upstream board of the FTK system. It was developed to receive all ATLAS Inner Detector data read out at an event rate of 100 kHz and to perform clustering.
• In total, 128 FTK IMs receive ~400 S-LINKs (maximum 2.0 Gbps; 32-bit words at 40 MHz).
• FPGAs perform the clustering of the Inner Detector hit information.
• The clustered 32-bit data words are sent to the FMC connector at 50 MHz over a 200 MHz DDR 8-bit bus.
• FTK is an approved ATLAS trigger upgrade project. It provides full track information (pT > 1 GeV) for every event accepted by the Level-1 trigger; this track information is used by the High Level Trigger (HLT) for efficient triggering.

Clustering implementation (see the sketch after this summary)
• The clustering implementation is organized into separate processing modules: the hit decoder, the grid-clustering engine, and the centroid calculation module. Multiple clustering engines can work in parallel.
• Hit decoder: decodes the ATLAS format into a format suitable for clustering and realigns incoming hits into pixel-column sequence.
• Grid clustering: uses a "moving window" technique to minimize the computational time per cluster identification as well as the required FPGA resources.
• Centroid calculation: computes the centroid, including a correction based on Time-over-Threshold information, which is correlated with the charge collected by each pixel.

Test setup at CERN
[Figure: FTK IM with S-LINK inputs and readout system.]
• Several prototypes of the FTK IM have been produced and tested.
• The boards work well with fiber input at 2.0 Gbps per line and 200 MHz DDR output.
• Data communication shows no errors after 10^16 bits transferred, which meets the requirements of the ATLAS experiment.

Pixel clustering algorithm
• Single- and multi-engine clustering algorithms were tested in firmware simulation with expected data.
• The single-engine 2D clustering algorithm works correctly with real S-LINK inputs, as tested at CERN.
• An implementation with 16 parallel engines was prepared; it occupies 40% of the FPGA resources. A 4-engine implementation is sufficient for Pixel inputs, and studies for IBL inputs are in progress.
• The parallelized 2D clustering implementation has enough processing power for a pile-up of 80, corresponding to the maximum LHC luminosity planned until 2022.
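Below is a simplified software analogue of the pixel clustering chain summarized above: group contiguous hits (the grid-clustering step) and compute a Time-over-Threshold-weighted centroid. The hit format, the 8-connectivity rule, and the in-memory hit list are illustrative assumptions; the actual firmware operates on streamed hits with a moving window rather than on a stored list.

```python
# Simplified sketch of pixel clustering + ToT-weighted centroid.
# Hit format (col, row, tot) and 8-connectivity are assumptions.

def cluster_hits(hits):
    """Group hits (col, row, tot) into clusters of contiguous pixels."""
    remaining = {(c, r) for c, r, _ in hits}
    tot_map = {(c, r): t for c, r, t in hits}
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = [seed], [seed]
        while frontier:              # flood-fill over the 8 neighbors
            c, r = frontier.pop()
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    n = (c + dc, r + dr)
                    if n in remaining:
                        remaining.remove(n)
                        cluster.append(n)
                        frontier.append(n)
        clusters.append([(c, r, tot_map[(c, r)]) for c, r in cluster])
    return clusters

def centroid(cluster):
    """ToT-weighted centroid: ToT is correlated with the collected
    charge, so weighting by it sharpens the position estimate."""
    total = sum(t for _, _, t in cluster)
    col = sum(c * t for c, _, t in cluster) / total
    row = sum(r * t for _, r, t in cluster) / total
    return col, row

# Example: two adjacent pixels form one cluster; the centroid leans
# toward the pixel with the larger ToT.
hits = [(10, 5, 3), (11, 5, 9), (40, 20, 4)]
for cl in cluster_hits(hits):
    print(cl, "->", centroid(cl))
```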