New chips for machine learning applications keep appearing. They are tuned for a specific topology and achieve efficiency through highly parallel designs at the cost of high power consumption or large, complex devices. However, the computational demands of deep neural networks require flexible and efficient hardware architectures that fit different applications, neural network types, and numbers of inputs, outputs, layers, and units per layer, making the migration from software to hardware easy. This paper describes a novel hardware architecture that implements any feedforward neural network (FFNN): multilayer perceptron, autoencoder, and logistic regression. The architecture admits an arbitrary number of inputs, outputs, layers, and units per layer. The hardware combines matrix algebra concepts with serial-parallel computation. It is based on a systolic ring of neural processing elements (NPEs) and requires only as many NPEs as there are neuron units in the largest layer, regardless of the number of layers. Resource usage grows linearly with the number of NPEs. This versatile architecture serves as an accelerator in real-time applications, and its size does not affect the system clock frequency. Unlike most approaches, it requires a single activation function block (AFB) for the whole FFNN. Performance, resource usage, and accuracy are evaluated for several network topologies and activation functions. The architecture reaches a 550 MHz clock speed in a Virtex7 FPGA. The proposed implementation uses 18-bit fixed-point arithmetic and achieves classification performance similar to a floating-point approach. A reduced weight bit size does not affect the accuracy, allowing more weights in the same memory. Different FFNNs were evaluated on the Iris and MNIST datasets and, in a real-time application of abnormal cardiac detection, a ×256 acceleration was achieved. The proposed architecture can perform up to 1980 giga operations per second (GOPS), implementing multilayer FFNNs of up to 3600 neurons per layer in a single chip. 
The architecture can be extended to larger-capacity devices or to multi-chip systems by simply extending the NPE ring.
INDEX TERMS Feedforward neural networks (FFNN), systolic hardware architecture, FPGA implementation, neural network acceleration, deep neural networks.
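The serial-parallel scheme described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's design: the 12-bit fractional part, the saturation behaviour, the sigmoid activation, the exclusion of the input vector from the NPE count, and all function names (`to_fixed`, `run_ffnn`, `npes_needed`, and so on) are hypothetical choices made for illustration. Only the 18-bit word length, the one-NPE-per-neuron-of-the-largest-layer rule, and the single shared AFB come from the abstract.

```python
import math

TOTAL_BITS = 18   # word length stated in the abstract
FRAC_BITS = 12    # assumed integer/fraction split; not given in the abstract
Q_MIN = -(1 << (TOTAL_BITS - 1))
Q_MAX = (1 << (TOTAL_BITS - 1)) - 1


def saturate(q):
    """Clamp an integer to the signed 18-bit range (assumed overflow policy)."""
    return max(Q_MIN, min(Q_MAX, q))


def to_fixed(x):
    """Quantize a float to a signed fixed-point integer."""
    return saturate(round(x * (1 << FRAC_BITS)))


def to_float(q):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def sigmoid_afb(q):
    """Single shared activation block: fixed-point in, fixed-point out."""
    return to_fixed(1.0 / (1.0 + math.exp(-to_float(q))))


def npes_needed(layer_sizes):
    """NPE count = neurons in the largest layer.

    layer_sizes[0] is the input vector length, excluded here by assumption.
    """
    return max(layer_sizes[1:])


def run_ffnn(layers, inputs, afb=sigmoid_afb):
    """Evaluate an FFNN layer by layer, reusing the same set of NPEs.

    layers: list of (weights, biases); weights[i][j] links input j to neuron i.
    """
    x = [to_fixed(v) for v in inputs]
    for weights, biases in layers:
        # Each active NPE starts from its bias, aligned to the product format.
        acc = [to_fixed(b) << FRAC_BITS for b in biases]
        # Inputs arrive serially; every NPE performs one MAC per step.
        for j, xj in enumerate(x):
            for i in range(len(acc)):
                acc[i] += to_fixed(weights[i][j]) * xj
        # Accumulators are rescaled and sent one by one through the single AFB.
        x = [afb(saturate(a >> FRAC_BITS)) for a in acc]
    return [to_float(v) for v in x]
```

Under these assumptions, a 4-10-3 network needs `npes_needed([4, 10, 3])` NPEs, that is, 10, and the outputs of `run_ffnn` can be compared against a floating-point reference to gauge the quantization error.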
Sensors provide data that must be processed after acquisition to remove noise and extract relevant information. When the sensor is a network node and the acquired data are to be transmitted to other nodes (e.g., through Ethernet), the data generated by multiple nodes can overload the communication channel. Reducing the generated data allows for lower hardware requirements and lower power consumption in the hardware devices. This work proposes a filtering algorithm (LDSI, Less Data Same Information) that reduces the data generated by event-based sensors without loss of relevant information. It is a bioinspired filter, i.e., event data are processed using a structure resembling biological neuronal information processing. The filter is fully configurable, from a “transparent mode” to a very restrictive mode. Based on an analysis of the configuration parameters, three main configurations are given: weak, medium, and restrictive. Using data from a DVS event camera, results for a similarity detection algorithm show that event data can be reduced by up to 30% while maintaining the same similarity index as unfiltered data. Data reduction can reach 85% with a penalty of 15% in similarity index compared to the original data. An object tracking algorithm was also used to compare the proposed filter with another existing filter. The LDSI filter yields a lower error (4.86 ± 1.87) than the background activity filter (5.01 ± 1.93). The algorithm was tested on a PC using pre-recorded datasets, and an FPGA implementation was also carried out. A Xilinx Virtex6 FPGA received data from a 128 × 128 DVS camera, applied the LDSI algorithm, created an AER dataflow, and sent the data to the PC for analysis and visualization. The FPGA ran at a 177 MHz clock speed with low resource usage (671 LUTs and 40 Block RAMs for the whole system), showing real-time operation capabilities. 
The results show that, with adequate tuning of the filter parameters, the relevant information from the scene is kept while fewer events (i.e., less data) are generated.
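For context on the baseline mentioned above, the following is a minimal sketch of a background activity filter in its commonly described form: an event is kept only if one of its eight neighbouring pixels fired recently. It is not the LDSI algorithm, whose internals are not given in the abstract. The time window `dt_us`, the event tuple layout `(t_us, x, y, polarity)`, and the helper `data_reduction` are assumptions introduced for illustration; only the 128 × 128 sensor size comes from the text.

```python
def background_activity_filter(events, width=128, height=128, dt_us=10_000):
    """Keep events supported by a recent event in the 8-pixel neighbourhood.

    events: iterable of (t_us, x, y, polarity) tuples sorted by timestamp.
    dt_us:  assumed support window in microseconds (hypothetical default).
    """
    last_ts = [[None] * width for _ in range(height)]  # last event time per pixel
    kept = []
    for t, x, y, p in events:
        supported = False
        for ny in range(max(0, y - 1), min(height, y + 2)):
            for nx in range(max(0, x - 1), min(width, x + 2)):
                if (nx, ny) == (x, y):
                    continue  # a pixel does not support itself
                ts = last_ts[ny][nx]
                if ts is not None and t - ts <= dt_us:
                    supported = True
        if supported:
            kept.append((t, x, y, p))
        last_ts[y][x] = t  # record the event even if it was filtered out
    return kept


def data_reduction(n_in, n_out):
    """Fraction of events removed by a filter (0.0 means nothing removed)."""
    return 1.0 - n_out / n_in
```

With this definition, a filter that passes 15 of every 100 events has a data reduction of 0.85, which is how the percentages quoted above can be read.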
Event-based cameras are not common in industrial applications, even though they offer multiple advantages for applications with moving objects. In comparison with frame-based cameras, the amount of generated data is very low while the main information in the scene is kept. In an industrial environment with interconnected systems, data reduction becomes very important to avoid network congestion and provide faster response times. However, new sensors such as event-based cameras are rarely used because they do not usually provide connectivity to industrial buses. This work develops a network node based on a Field Programmable Gate Array (FPGA), including data acquisition and position tracking for an event-based camera. It also includes spurious-event reduction and filtering algorithms that keep the main features of the scene. The FPGA node also includes the network protocol stack to provide standard communication with other nodes. The Powerlink IEEE 61158 industrial network is used to connect the FPGA to a controller, which in turn is connected to a self-developed two-axis servo-controlled robot. The inverse kinematics model for the robot is included in the controller. To complete the system and provide a comparison, a traditional frame-based camera is also connected to the controller. Response time and robustness to lighting conditions are tested. Results show that, using the event-based camera, the robot can follow the object using fast image recognition, achieving up to 85% data reduction, position detection that is 99 ms faster on average, and less dispersion in position detection (4.96 mm vs. 17.74 mm in the Y-axis position, and 2.18 mm vs. 8.26 mm in the X-axis position) than the frame-based camera, showing that event-based cameras are more stable under light changes. Additionally, event-based cameras offer intrinsic advantages due to the low computational complexity required: small size, low power, reduced data, and low cost. 
Thus, it is demonstrated how the development of new equipment and algorithms can be efficiently integrated into an industrial system, merging commercial industrial equipment with new devices.