Synaptic plasticity is a long-standing core hypothesis of brain learning that suggests local adaptation between two connecting neurons and forms the foundation of machine learning. The main complexity of synaptic plasticity is that synapses and dendrites connect neurons in series, and existing experiments cannot pinpoint the location of the significant imprinted adaptation. We showed efficient backpropagation and Hebbian learning on dendritic trees, inspired by experimental evidence for sub-dendritic adaptation and its nonlinear amplification. This approach achieves success rates approaching unity for handwritten digit recognition, indicating that deep learning can be realized even by a single dendrite or neuron. Additionally, dendritic amplification effectively generates a number of input crosses (higher-order interactions) that grows exponentially with the number of inputs, which enhances success rates. However, directly implementing such a large number of cross weights and exhaustively manipulating each of them independently is beyond existing and anticipated computational power. Hence, a new type of nonlinear adaptive dendritic hardware must be built to imitate dendritic learning and to estimate the computational capability of the brain.
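To give a feel for why exhaustive manipulation of the cross weights is impractical, the following minimal sketch counts the possible input crosses. It assumes, for illustration only, that an input cross is any product of two or more distinct inputs, which gives 2^n - n - 1 terms for n inputs; the abstract does not state this definition, and the function name `count_input_crosses` is hypothetical.

```python
from math import comb

def count_input_crosses(n_inputs: int) -> int:
    """Count products of two or more distinct inputs.

    Assumption (not from the source): a "cross" is any subset of
    at least two inputs multiplied together.
    """
    return sum(comb(n_inputs, k) for k in range(2, n_inputs + 1))

# The sum over subset sizes matches the closed form 2**n - n - 1.
for n in (3, 10, 20):
    assert count_input_crosses(n) == 2**n - n - 1

# For 784 inputs (an assumed 28 x 28 pixel image) the count has over 200 digits.
print(len(str(count_input_crosses(784))))
```

Under this assumed definition, even a modest input size yields far more cross weights than could be stored or updated one by one.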
Power-law scaling, a central concept in critical phenomena, is found to be useful in deep learning, where optimized test errors on handwritten digit examples converge to zero as a power law with database size. For rapid decision making with one training epoch, in which each example is presented only once to the trained network, the power-law exponent increased with the number of hidden layers. For the largest dataset, the obtained test error was estimated to be close to that of state-of-the-art algorithms trained for a large number of epochs. Power-law scaling assists with key challenges found in current artificial intelligence applications and facilitates an a priori estimation of the dataset size needed to achieve a desired test accuracy. It establishes a benchmark for measuring training complexity and a quantitative hierarchy of machine learning tasks and algorithms.
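The a priori dataset size estimation mentioned above can be sketched as follows. The example assumes a test error of the form error ≈ a · D^(-rho) for dataset size D, fits a and rho by linear regression in log-log space, and inverts the fit for a target error. The function names and the numbers are hypothetical and synthetic; they are not results from the paper.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit errors ≈ a * sizes**(-rho) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
    return float(np.exp(intercept)), float(-slope)

def required_size(a, rho, target_error):
    """Invert error = a * D**(-rho) to get the dataset size D for a target error."""
    return (a / target_error) ** (1.0 / rho)

# Synthetic, noise-free data generated from a = 2.0, rho = 0.5 (illustrative only).
sizes = np.array([100.0, 1_000.0, 10_000.0, 60_000.0])
errors = 2.0 * sizes ** -0.5

a, rho = fit_power_law(sizes, errors)
print(f"a = {a:.3f}, rho = {rho:.3f}")
print(f"examples needed for 2% error: {required_size(a, rho, 0.02):.0f}")
```

With real measurements the fit would be noisy, so an estimate extrapolated far beyond the measured sizes should be treated with caution.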
Attempting to imitate the brain's functionalities, researchers have bridged neuroscience and artificial intelligence for decades; however, experimental neuroscience has not directly advanced the field of machine learning (ML). Here, using neuronal cultures, we demonstrate that increased training frequency accelerates the neuronal adaptation processes. This mechanism was implemented on artificial neural networks, where a local learning step-size increases for coherent consecutive learning steps, and tested on a simple dataset of handwritten digits, MNIST. Based on our on-line learning results with a few handwriting examples, success rates for brain-inspired algorithms substantially outperform the commonly used ML algorithms. We speculate this emerging bridge from slow brain function to ML will promote ultrafast decision making under limited examples, which is the reality in many aspects of human activity, robotic control, and network optimization.
Introduction
Machine learning is based on Donald Hebb's pioneering work; seventy years ago, he suggested that learning occurs in the brain through synaptic (link) strength modifications (1). A synaptic strength modification typically lasts tens of minutes (2) while the clock speed of a neuron (node) ranges around one second (3). Although the brain is comparatively slow, its computational capabilities outperform typical state-of-the-art artificial intelligence algorithms.
Following this speed/capability paradox, we experimentally derive accelerated learning mechanisms based on small datasets, where their utilization on gigahertz processors is expected to lead to ultrafast decision making.
Unlike modern computers, a well-defined global clock does not govern brain dynamics; instead, they are a function of relative event timing (e.g., stimulations and evoked spikes) (4).
According to the neuronal computational scheme, using decaying input summation via its ramified dendritic trees, each neuron sums the asynchronous incoming electrical signals and generates a short electrical pulse (spike) when its threshold is reached. For each neuron, synaptic strength is slowly modified based on the relative timing of inputs from other synapses; if a signal is induced from a synapse without generating a spike, its associated strength is modified based on the relative timing to adjacent spikes from other synapses on the same neuron (5).
Recently it was experimentally demonstrated that each neuron functions as a collection of independent threshold units (6). Each threshold unit is activated after signals arrive via one of the dendritic trees. Additionally, a new type of adaptive rule was experimentally observed based on dendritic signal arrival timing (7), which is similar to the slow adaptation mechanism currently attributed to synapses (links). This dendritic adaptation occurs on a faster timescale: it requires approximately five minutes, while synaptic modification requires tens of minutes or more.
Results
In this study, dendritic adaptation was experimentally examined at a higher stimulation fr...
The realization of complex classification tasks requires training of deep learning (DL) architectures consisting of tens or even hundreds of convolutional and fully connected hidden layers, which is far from the reality of the human brain. According to the DL rationale, the first convolutional layer reveals localized patterns in the input and large-scale patterns in the following layers, until it reliably characterizes a class of inputs. Here, we demonstrate that with a fixed ratio between the depths of the first and second convolutional layers, the error rates of the generalized shallow LeNet architecture, consisting of only five layers, decay as a power law with the number of filters in the first convolutional layer. The extrapolation of this power law indicates that the generalized LeNet can achieve small error rates that were previously obtained for the CIFAR-10 database using DL architectures. A power law with a similar exponent also characterizes the generalized VGG-16 architecture. However, this results in a significantly increased number of operations required to achieve a given error rate with respect to LeNet. This power law phenomenon governs various generalized LeNet and VGG-16 architectures, hinting at its universal behavior and suggesting a quantitative hierarchical time–space complexity among machine learning architectures. Additionally, the conservation law along the convolutional layers, which is the square-root of their size times their depth, is found to asymptotically minimize error rates. The efficient shallow learning that is demonstrated in this study calls for further quantitative examination using various databases and architectures and its accelerated implementation using future dedicated hardware developments.
Real-time sequence identification is a core use-case of artificial neural networks (ANNs), ranging from recognizing temporal events to identifying verification codes. Existing methods apply recurrent neural networks, which suffer from training difficulties; however, performing this function without feedback loops remains a challenge. Here, we present an experimental neuronal long-term plasticity mechanism for high-precision feedforward sequence identification networks (ID-nets) without feedback loops, wherein input objects have a given order and timing. This mechanism temporarily silences neurons following their recent spiking activity. Therefore, transitory objects act on different dynamically created feedforward sub-networks. ID-nets are demonstrated to reliably identify 10 handwritten digit sequences, and are generalized to deep convolutional ANNs with continuous activation nodes trained on image sequences. Counterintuitively, their classification performance, even with a limited number of training examples, is high for sequences but low for individual objects. ID-nets are also implemented for writer-dependent recognition, and suggested as a cryptographic tool for encrypted authentication. The presented mechanism opens new horizons for advanced ANN algorithms.
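The silencing mechanism described above can be illustrated with a minimal sketch. It assumes a single layer of threshold units in which any unit that spikes is masked out for a fixed number of subsequent time steps, so that later inputs act on a different subset of units. The function name `run_sequence`, the threshold, and the silencing duration are hypothetical choices for illustration, not the ID-net implementation.

```python
import numpy as np

def run_sequence(inputs, weights, threshold=1.0, silence_steps=2):
    """Feed a sequence through one layer of threshold units.

    Units that spike are silenced for the next `silence_steps` steps
    (an assumed, simplified form of the silencing mechanism).
    """
    n_units = weights.shape[0]
    silent_for = np.zeros(n_units, dtype=int)  # remaining silenced steps per unit
    spikes = []
    for x in inputs:
        active = silent_for == 0
        fired = (weights @ x >= threshold) & active
        spikes.append(fired)
        silent_for = np.maximum(silent_for - 1, 0)  # count down existing silences
        silent_for[fired] = silence_steps           # silence units that just fired
    return np.array(spikes)

# Toy example: two units, the same input repeated four times.
weights = np.eye(2)
inputs = [np.array([1.0, 0.0])] * 4
spikes = run_sequence(inputs, weights)
print(spikes[:, 0].tolist())  # spiking pattern of unit 0 over time
```

In this toy run the same input drives unit 0 at every step, yet the unit only responds when it is not silenced, which is how identical objects at different positions in a sequence can be routed through different sub-networks.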