A Survey of Brain-Inspired Learning in Artificial Neural Networks


Source: 算法进阶
This article is approximately 15,000 words; a reading time of 20+ minutes is suggested. It provides a comprehensive review of brain-inspired learning representations in current artificial neural networks.



Brain-inspired learning in artificial neural networks: a review


Artificial neural networks (ANNs) have emerged as an important tool in machine learning, achieving notable success in diverse domains, including image and speech generation, gaming, and robotics. However, there are fundamental differences between the operating mechanisms of artificial neural networks and those of biological brains, especially with regard to the learning process. This paper presents a comprehensive survey of brain-inspired learned representations in current artificial neural networks. We investigate the integration of more biologically meaningful mechanisms, such as synaptic plasticity, to enhance the capabilities of these networks. Furthermore, we delve into the potential advantages and challenges posed by this approach. Finally, we point to promising avenues for future research in this rapidly evolving field that will bring us closer to understanding the nature of intelligence.

Correspondence: [email protected]


Introduction

The dynamic interrelationship between memory and learning is a fundamental hallmark of intelligent biological systems. It enables organisms not only to assimilate new knowledge but also to continually refine existing capabilities, allowing them to respond expertly to changing environmental conditions. This adaptability operates across multiple timescales, from long-term learning to rapid short-term learning mediated by short-term plasticity, highlighting the complexity and flexibility of biological nervous systems. The development of artificial systems that draw high-level inspiration from the brain is a scientific pursuit spanning decades. While earlier attempts met with limited success, the latest generation of artificial intelligence (AI) algorithms has achieved major breakthroughs in many challenging tasks. These tasks include, but are not limited to, the control of complex robotic systems, the generation of images and text from human-provided prompts, the mastery of strategy games such as chess and Go, and the multimodal fusion of these capabilities.

Although artificial neural networks have made remarkable progress in various fields, they remain significantly limited in their ability to continually learn and adapt the way a biological brain does. Unlike current models of machine intelligence, animals can learn throughout their lifespan, which is critical for stable adaptation to changing environments. This ability, known as lifelong learning, remains a major challenge for AI, which primarily optimizes problems consisting of fixed, labeled datasets, making it difficult to generalize to new tasks or to retain information across repeated learning iterations. Addressing this challenge is an active area of research, and developing AI with lifelong learning capabilities could have profound implications across multiple fields.

In this article, we provide a unique review aimed at identifying the brain mechanisms that inspire current AI algorithms. To better understand the biological processes underlying natural intelligence, the first part explores the low-level components of learning, from synaptic plasticity to the neuromodulatory mechanisms that shape the local and global dynamics of neural activity. We relate this to artificial neural networks in the third part, where we compare and contrast artificial neural networks with biological nervous systems. This gives us a basis for arguing that the brain has more to offer AI than what current artificial models have already inherited from it. Next, we delve into artificial learning algorithms that simulate these processes to improve the capabilities of AI systems. Finally, we discuss various real-world applications of these AI techniques, emphasizing their potential impact on areas such as robotics, lifelong learning, and neuromorphic computing. In doing so, we aim to provide a comprehensive understanding of the interplay between biological brains and AI learning mechanisms, highlighting the potential benefits this synergistic relationship may bring. We hope our findings will encourage a new generation of brain-inspired learning algorithms.

Processes that support learning in the brain

A major effort in neuroscience aims to identify the deep processes of learning in the brain. Several mechanisms have been proposed to explain the biological basis of learning at different levels of granularity—from synaptic to population-level activity.

Figure 1. Schematic representation of long-term potentiation (LTP) and long-term depression (LTD) at the synapses of biological neurons. (A) Synaptically connected presynaptic and postsynaptic neurons. (B) The synaptic terminal, the connection point between neurons. (C) Synaptic strengthening (LTP) and synaptic weakening (LTD). (D) Top: membrane potential dynamics at the neuronal axon hillock. Bottom: presynaptic and postsynaptic spikes. (E) Experimentally recorded spike-timing-dependent plasticity curves for LTP and LTD.

However, the vast majority of biologically plausible modes of learning are characterized by plasticity that arises from interactions between local and global events. Below, we describe in more detail the various forms of plasticity and how these processes interact.

Synaptic Plasticity Plasticity in the brain refers to the capacity of experience to modify the function of neural circuits. Synaptic plasticity refers specifically to activity-dependent modification of the strength of synaptic transmission and is currently the most widely investigated mechanism by which the brain adapts to new information. There are two broad categories of synaptic plasticity: short-term and long-term plasticity. Short-term plasticity operates on timescales of tens of milliseconds to minutes and plays an important role in short-term adaptation to sensory stimuli and in short-term memory formation. Long-term plasticity operates on timescales of minutes and longer and is thought to be one of the main processes underlying long-term behavioral change and memory storage.

Neuromodulation In addition to synaptic plasticity, another important mechanism by which the brain adapts to new information is neuromodulation. Neuromodulation refers to the regulation of neural activity by chemical signaling molecules, often called neuromodulators or hormones. These signaling molecules can alter the excitability of neural circuits and the strength of synapses, and they have both short- and long-term effects on neural function. Different neuromodulators have been identified, including acetylcholine, dopamine, and serotonin, which are involved in functions such as attention, learning, and emotion. Neuromodulation has been suggested to play a role in various forms of plasticity, including both short- and long-term plasticity.

Metaplasticity The ability of a neuron to modify its function and structure in response to activity is a hallmark of synaptic plasticity. The changes that occur at synapses must be tightly regulated so that they occur in the right amount at the right time. The regulation of this plasticity is referred to as metaplasticity, the "plasticity of synaptic plasticity," and it plays a crucial role in protecting the changing brain from saturation. Essentially, metaplasticity alters the ability of synapses to generate plasticity by inducing changes in the physiological state of neurons or synapses. Metaplasticity is thought to be a fundamental mechanism underlying memory stability, learning, and the regulation of neural excitability. Although similar, metaplasticity can be distinguished from neuromodulation, even though metaplastic and neuromodulatory events often overlap in time during the modification of a synapse.

Neurogenesis The process by which newly formed neurons are integrated into existing neural circuits is called neurogenesis. Neurogenesis is most active during embryonic development but is also known to continue throughout adulthood, particularly in the subventricular zone of the lateral ventricles and in the dentate gyrus of the hippocampal formation. In adult mice, neurogenesis has been shown to increase when animals live in enriched environments compared with standard laboratory conditions. In addition, many environmental factors, such as exercise and stress, have been shown to alter neurogenesis in the rodent hippocampus. Overall, although the role of neurogenesis in learning is not fully understood, it is thought to play an important part in supporting learning in the brain.

Glial cells Glial cells, or glia, play a crucial role in supporting learning and memory by modulating neurotransmitter signaling at synapses, the small gaps across which neurons release and receive neurotransmitters. Astrocytes are glial cells that release and take up neurotransmitters and also metabolize and detoxify them. This helps regulate the balance and availability of neurotransmitters in the brain, which is critical for normal brain function and learning. Microglia, another type of glial cell, also regulate neurotransmitter signaling and participate in the repair and regeneration of damaged tissue, which is important for learning and memory. In addition to repair and regulation, the structural changes underlying modifications of synaptic strength require the involvement of several types of glial cells, not least astrocytes. However, despite their critical involvement, we do not yet fully understand the role of glial cells, and the mechanisms by which they support synaptic learning remain an important area of ongoing research.

Deep Neural Networks and Plasticity

Artificial Neural Networks and Spiking Neural Networks  Artificial neural networks have played a vital role in machine learning over the past few decades. These networks have catalyzed tremendous progress in solving various challenging problems. Many of the most impressive achievements in artificial intelligence have been obtained using large artificial neural networks trained on vast amounts of data. While there have been many technical advances, much of this success can also be explained by innovations in computing technology, such as large-scale GPU accelerators and the accessibility of data. Although the application of large-scale artificial neural networks has driven significant innovation, many challenges remain. Some of the most pressing practical limitations are that these networks are inefficient in terms of power consumption and that they handle dynamic and noisy data poorly. Furthermore, ANNs generally cannot learn from data beyond their training phase (e.g., during deployment), and the data they are trained on are presented in an independent and identically distributed (IID) form without temporal structure, which does not reflect the physical reality that information is highly correlated in time and space. These drawbacks mean that large-scale applications consume very large amounts of energy (ref. 38) and pose challenges for integration into edge-computing devices such as robots and wearables (ref. 39).

Looking to neuroscience for solutions, researchers have been exploring spiking neural networks (SNNs) as an alternative to artificial neural networks. SNNs are a class of artificial neural networks whose design is closer to the behavior of biological neurons. The main difference between ANNs and SNNs is that SNNs incorporate the notion of time into their communication. Spiking neurons accumulate information from connected (presynaptic) neurons (or from sensory input) in the form of a membrane potential. Once a neuron's membrane potential exceeds a threshold, it sends a binary "spike" to all of its outgoing (postsynaptic) connections. Although spikes are binary and temporally sparse, spike trains have been shown theoretically to carry more information than rate-based representations such as those used in artificial neural networks. Furthermore, modeling studies have shown advantages of SNNs such as better energy efficiency, the ability to handle noisy and dynamic data, and the potential for more robust and fault-tolerant computation. These benefits are attributable not only to their increased biological plausibility but also to unique properties of spiking neural networks that differentiate them from traditional artificial neural networks. A simple working model of a leaky integrate-and-fire neuron is described below:

$$\tau_{m}\,\frac{dV(t)}{dt} = -\bigl(V(t) - V_{\text{rest}}\bigr) + R\,I(t); \qquad \text{if } V(t) \ge V_{\text{th}}: \text{ emit a spike and set } V(t) \leftarrow V_{\text{reset}}$$
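To make these dynamics concrete, here is a minimal Python/NumPy sketch of discrete-time leaky integrate-and-fire behavior. It is an illustration rather than the exact formulation used in the reviewed paper; the parameter names and values (tau, v_threshold, the toy input current) are assumptions chosen for readability.

```python
import numpy as np

def simulate_lif(input_current, tau=20.0, v_rest=0.0, v_reset=0.0,
                 v_threshold=1.0, dt=1.0):
    """Simulate a single leaky integrate-and-fire neuron.

    input_current: 1-D array with the input drive at each time step.
    Returns the membrane potential trace and a binary spike train.
    """
    v = v_rest
    potentials, spikes = [], []
    for i_t in input_current:
        # Leaky integration: decay toward rest plus accumulated input.
        v = v + (dt / tau) * (v_rest - v) + i_t
        if v >= v_threshold:
            spikes.append(1)          # emit a binary spike
            v = v_reset               # reset the membrane potential
        else:
            spikes.append(0)
        potentials.append(v)
    return np.array(potentials), np.array(spikes)

# Example: a noisy, near-constant drive produces sparse, discrete spikes.
rng = np.random.default_rng(0)
current = 0.06 + 0.02 * rng.standard_normal(200)
v_trace, spike_train = simulate_lif(current)
print("number of spikes:", int(spike_train.sum()))
```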

Despite these potential advantages, SNNs are still at an early stage of development, and several challenges need to be addressed before they can be used more widely. One of the most pressing is how to optimize the synaptic weights of these models, since traditional backpropagation-based approaches from ANNs fail because of the discrete, non-differentiable nature of spikes. Regardless of these challenges, some works are indeed pushing the limits of modern spiking networks, such as SpikeGPT, a large spike-based language model. Spiking models are important to this review, as they form the basis of many brain-inspired learning algorithms.

Hebbian and spike-timing-dependent plasticity Hebbian learning and STDP (spike-timing-dependent plasticity) are two important models of synaptic plasticity that play key roles in the formation of neural circuits and behaviors. The Hebbian learning rule, first proposed by Donald Hebb in 1949, postulates that synapses between neurons are strengthened when the neurons interact such that the activation of one neuron contributes to the activation of the other. STDP, on the other hand, is a more recently proposed model of synaptic plasticity that takes into account the precise timing of pre- and postsynaptic spikes to determine whether a synapse is strengthened or weakened. It is widely accepted that STDP plays a key role in the formation and refinement of neural circuits during development and in the continuous adaptation of circuits to experience. In the following subsections, we provide an overview of the rationale behind Hebbian learning and STDP.

Hebbian learning Hebbian learning is based on the idea that if two neurons are active at the same time, the synaptic strength between them should increase, and vice versa. According to Hebb, this increase occurs (causally) when one cell "repeatedly or persistently takes part in firing" another cell. However, the principle is often stated in purely correlational terms, as in the famous adage "cells that fire together wire together" (variously attributed to Siegrid Löwel or Carla Shatz).

Hebbian learning is often used as an unsupervised learning algorithm, where the goal is to identify patterns in input data without explicit feedback. An example of this process is the Hopfield network, in which large binary patterns can be stored in a fully connected recurrent network simply by applying the Hebbian rule to the (symmetric) weights. It can also be adapted for supervised learning, where the rule is modified to take into account the desired output of the network. In this case, the Hebbian learning rule is combined with a teaching signal indicating the correct output for a given input.

A simple Hebbian learning rule can be described mathematically by the following equation:        

$$\Delta w_{ij} = \eta \, x_i \, x_j$$

where Δwij is the change in the weight between neuron i and neuron j, η is the learning rate, and xi is the "activity" of neuron i, usually taken to be its firing rate. This rule states that if two neurons fire at the same time, the connection between them should be strengthened.

A potential drawback of the basic Hebbian rule is its instability. For example, if xi and xj are initially weakly positively correlated, the rule will increase the weight between the two, which in turn strengthens the correlation, leading to an even larger weight increase, and so on. Some form of stabilization is therefore required. This can be achieved simply by bounding the weights, or by more complex rules that take into account additional factors such as the history of pre- and postsynaptic activity or the influence of other neurons in the network (see the references for a practical review of many such rules).
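To make the rule and its instability concrete, the following NumPy sketch contrasts the plain Hebbian update Δwij = η·xi·xj with one well-known stabilized variant, Oja's rule, which adds a decay term proportional to the squared postsynaptic activity. This is an illustrative example under assumed parameters, not code from the reviewed work.

```python
import numpy as np

def hebbian_step(w, x_pre, eta=0.01):
    """Plain Hebbian update: weights grow whenever pre- and postsynaptic
    activity are correlated, with nothing to stop runaway growth."""
    y = w @ x_pre                     # postsynaptic activity
    return w + eta * np.outer(y, x_pre)

def oja_step(w, x_pre, eta=0.01):
    """Oja's rule: the extra decay term -y^2 * w keeps the weight vector
    bounded (it converges toward the leading principal component)."""
    y = w @ x_pre
    return w + eta * (np.outer(y, x_pre) - (y ** 2)[:, None] * w)

rng = np.random.default_rng(0)
w_hebb = w_oja = rng.normal(scale=0.1, size=(1, 5))
for _ in range(500):
    x = rng.normal(size=5)
    x[1] = x[0]                       # two correlated input channels
    w_hebb = hebbian_step(w_hebb, x)
    w_oja = oja_step(w_oja, x)

print("plain Hebbian weight norm:", np.linalg.norm(w_hebb))  # keeps growing
print("Oja-stabilized weight norm:", np.linalg.norm(w_oja))  # stays bounded
```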

Three-factor rules: Hebbian reinforcement learning  Hebbian learning can also be used for reinforcement learning by incorporating information about rewards. A seemingly plausible idea is simply to multiply the Hebbian update directly by the reward, as follows:

$$\Delta w_{ij} = \eta \, x_i \, x_j \, R$$

where R is the reward (for the current time step or for the whole episode). Unfortunately, this idea does not produce reliable reinforcement learning. The problem can be seen intuitively by noting that even if wij is already at its optimal value, the rule above still produces a non-zero net change, driving wij away from the optimum.

More formally, as noted by Frémaux et al., in order to properly track the actual covariance between inputs, outputs, and rewards, at least one of the terms in the xi·xj·R product must be centered, that is, replaced by zero-mean fluctuations around its expected value. One possible solution is to center the reward by subtracting from R a baseline, typically equal to the expected value of R for the current trial. While helpful, in practice this solution is often insufficient.

A more effective solution is to remove the mean from the outputs. This can be done easily by subjecting the neural activations xj to occasional random perturbations Δxj (drawn from a suitable zero-centered distribution) and then using the perturbation Δxj, rather than the raw postsynaptic activation xj, in the three-factor product:

b635622f4e6248a0921c87c380c0ef41.png

This is the so-called "node perturbation" rule proposed by Fiete and Seung. Intuitively, the effect of the xi·Δxj increment is to push the future response xj (to the same input xi) in the direction of the perturbation: larger if the perturbation was positive, smaller if it was negative. Multiplying this increment by R pushes future responses toward the perturbation if R is positive, and away from it if R is negative. Even if R is not zero-mean, the net effect (in expectation) will still drive wij toward higher R, albeit with greater variance.

This rule implements the REINFORCE algorithm (Williams' original paper actually proposes an algorithm that is exactly node perturbation for stochastic spiking neurons) and thus estimates the theoretical gradient of R with respect to wij. It can also be implemented in a biologically plausible manner, allowing recurrent networks to learn non-trivial cognitive or motor tasks from sparse, delayed rewards.
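A minimal sketch of node perturbation on a toy task is given below. The task (imitating a random linear target mapping), the running reward baseline, and all parameter values are illustrative assumptions; the update itself is the three-factor product of presynaptic activity, perturbation, and baselined reward described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 5, 3
w = np.zeros((n_out, n_in))
w_target = rng.normal(size=(n_out, n_in))        # toy "correct" mapping
eta, sigma, baseline = 0.1, 0.1, 0.0

print("initial error:", np.linalg.norm(w - w_target))
for step in range(5000):
    x = rng.normal(size=n_in)                    # presynaptic activity
    perturbation = sigma * rng.normal(size=n_out)
    y = w @ x + perturbation                     # perturbed postsynaptic activity
    reward = -np.sum((y - w_target @ x) ** 2)    # higher reward is better
    # Slowly updated baseline, so that only reward fluctuations drive learning.
    baseline += 0.01 * (reward - baseline)
    # Three-factor update: presynaptic activity x perturbation x (reward - baseline).
    w += eta * (reward - baseline) * np.outer(perturbation, x)
print("final error:  ", np.linalg.norm(w - w_target))
```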

Spike-timing-dependent plasticity (STDP) is a theoretical model of synaptic plasticity in which the strength of the connections between neurons is modified according to the relative timing of their spikes. Unlike the Hebbian learning rule, which relies on the simultaneous activation of pre- and postsynaptic neurons, STDP takes into account the precise timing of pre- and postsynaptic spikes. Specifically, STDP posits that if a presynaptic neuron fires shortly before a postsynaptic neuron, the connection between them should be strengthened. Conversely, if the postsynaptic neuron fires shortly before the presynaptic neuron, the connection should be weakened.

STDP has been observed in a variety of biological systems, including the neocortex, hippocampus, and cerebellum. This rule has been shown to play a crucial role in the development and plasticity of neural circuits, including learning and memory processes. STDP has also been used as a basis for developing artificial neural networks, which are designed to mimic the structure and function of the brain.

The mathematical formulation of STDP is more complex than the Hebbian learning rule and can vary depending on the specific implementation. However, a common formulation is:

$$\Delta w_{ij} = \begin{cases} A_{+}\, e^{-\Delta t/\tau_{+}}, & \Delta t > 0 \\ -A_{-}\, e^{\Delta t/\tau_{-}}, & \Delta t < 0 \end{cases}$$

where Δwij is the weight change between neuron i and neuron j, Δt is the time difference between the pre- and postsynaptic spikes, A+ and A− are the amplitudes of potentiation and depression, respectively, and τ+ and τ− are the corresponding time constants. The rule states that the strength of the connection between two neurons increases or decreases depending on the relative timing of their spikes. (As a point of historical perspective, Brown and colleagues quote William James: "When two elementary brain-processes have been active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement into the other.")

Figure 2. There are strong similarities between artificial and brain-like learning algorithms. Left, top: graphical depiction of a rodent and a population of interconnected neurons. Middle: a rodent performing the Morris water maze task, used to test learning ability. Bottom: biological presynaptic and postsynaptic pyramidal neurons. Right, top: a rodent musculoskeletal physical model with an artificial neural network policy and critic head regulating learning and control (see references). Middle: a virtual maze environment for benchmarking learning algorithms (ref.). Bottom: an artificial presynaptic and postsynaptic neuron with its forward-propagation equation.
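For illustration, the snippet below implements a standard pair-based exponential STDP window using the quantities defined above (A+, A−, τ+, τ−). The exponential form, parameter values, and spike times are assumptions made for the example.

```python
import numpy as np

def stdp_delta_w(delta_t, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window. delta_t = t_post - t_pre in milliseconds:
    pre-before-post (delta_t > 0) potentiates, post-before-pre depresses."""
    return np.where(delta_t > 0,
                    a_plus * np.exp(-delta_t / tau_plus),
                    -a_minus * np.exp(delta_t / tau_minus))

# Example: accumulate the rule over every pre/post spike pair of one synapse.
pre_spikes = np.array([10.0, 50.0, 90.0])    # spike times in ms
post_spikes = np.array([12.0, 45.0, 95.0])
dw = sum(stdp_delta_w(t_post - t_pre)
         for t_pre in pre_spikes for t_post in post_spikes)
print("net weight change:", float(dw))
```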

Processes that support learning in artificial neural networks

There are two main approaches to weight optimization in artificial neural networks: error-driven global learning and brain-inspired local learning. In the first approach, the network weights are modified by driving a global error signal to its minimum; this requires assigning a share of the error to every weight and modifying the weights in a synchronized fashion. In contrast, brain-inspired local learning algorithms aim to learn in a more biologically plausible way, modifying the weights through dynamical equations that use only locally available information. Both optimization approaches have unique advantages and disadvantages. In the following sections we discuss the most commonly used form of error-driven global learning, backpropagation, before diving into brain-inspired local algorithms. It is worth noting that the two approaches are not mutually exclusive and are often combined to exploit their respective strengths.

Backpropagation. Backpropagation is a powerful error-driven global learning method that changes the weights of the connections between neurons in a neural network so as to produce a desired target behavior. This is achieved using a quantitative metric (an objective function) that describes the quality of the behavior given sensory information (e.g., visual input, written text, robot joint positions). The backpropagation algorithm consists of two phases: a forward pass and a backward pass. In the forward pass, the input is propagated through the network and the output is computed. In the backward pass, the error between the predicted output and the "true" output is computed, and the gradient of the loss function with respect to the network weights is obtained by propagating this error backward through the network. The gradients are then used to update the weights with an optimization algorithm such as stochastic gradient descent. This process is repeated over many iterations until the weights converge to a set of values that minimize the loss function.

Let's look at a simple mathematical explanation of backpropagation. First, we define an expected loss function, which is a function of the network output and the true value:

$$L = \tfrac{1}{2}\,(y - y')^{2}$$

where y is the true output and y′ is the network's output. Here we minimize the squared error, but any smooth, differentiable loss function can be used. Next, we use the chain rule to compute the gradient of the loss with respect to the network weights. Let w^l_ij be the weight between neuron i in layer l and neuron j in layer l+1, and let a^l_i be the activation of neuron i in layer l. The gradient of the loss with respect to the weights, and the corresponding weight update, are then given by:

$$\frac{\partial L}{\partial w^{l}_{ij}} = \frac{\partial L}{\partial a^{l+1}_{j}}\,\frac{\partial a^{l+1}_{j}}{\partial w^{l}_{ij}}, \qquad w^{l}_{ij} \leftarrow w^{l}_{ij} - \alpha\,\frac{\partial L}{\partial w^{l}_{ij}}$$

where α is the learning rate. By repeatedly computing gradients and updating weights, the network gradually learns to minimize the loss function and make more accurate predictions. In practice, gradient descent is often combined with methods that introduce momentum into the gradient estimate, which has been shown to significantly improve generalization.
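The self-contained NumPy sketch below shows the full forward pass, backward pass, and weight update for a one-hidden-layer network on a toy regression problem. The architecture, activation function, and hyperparameters are illustrative choices rather than those of any particular study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(x) from random inputs.
x = rng.uniform(-3.0, 3.0, size=(256, 1))
y = np.sin(x)

# One hidden layer with a tanh activation.
w1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
w2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)
alpha = 0.1                                    # learning rate

for epoch in range(5000):
    # Forward pass.
    h = np.tanh(x @ w1 + b1)
    y_hat = h @ w2 + b2
    loss = 0.5 * np.mean((y_hat - y) ** 2)

    # Backward pass: apply the chain rule layer by layer.
    grad_y_hat = (y_hat - y) / len(x)          # dL/dy_hat
    grad_w2 = h.T @ grad_y_hat
    grad_b2 = grad_y_hat.sum(axis=0)
    grad_h = grad_y_hat @ w2.T                 # error sent back to the hidden layer
    grad_h_pre = grad_h * (1.0 - h ** 2)       # tanh derivative
    grad_w1 = x.T @ grad_h_pre
    grad_b1 = grad_h_pre.sum(axis=0)

    # Gradient-descent update of all weights.
    w2 -= alpha * grad_w2; b2 -= alpha * grad_b2
    w1 -= alpha * grad_w1; b1 -= alpha * grad_b1

print("final loss:", loss)
```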

The impressive achievements of backpropagation have prompted neuroscientists to ask whether it can help us better understand learning in the brain. Although it is still debated whether some variant of backpropagation occurs in the brain, it is clear that backpropagation in its current form is not biologically plausible. Alternative theories suggest that complex feedback circuits, or the interaction of local activity with top-down signals (a "third factor"), could support a backpropagation-like form of learning.

Despite its impressive performance, there are fundamental algorithmic challenges that arise from repeatedly applying backpropagation to network weights. One such challenge is a phenomenon known as catastrophic forgetting, in which a neural network forgets previously learned information when trained on new data. This occurs when the network is fine-tuned on new data or trained on a sequence of tasks without retaining the knowledge learned from earlier tasks. Catastrophic forgetting is a significant obstacle to developing neural networks capable of continuous learning in diverse and changing environments. Another challenge is that backpropagation requires propagating information backward through all layers of the network, which is computationally expensive and time-consuming, especially for very deep networks. This can limit the scalability of deep learning algorithms and make it difficult to train large models with limited computing resources. Nonetheless, backpropagation remains the most widely used and most successful algorithm for applications involving artificial neural networks.

Evolutionary and Genetic Algorithms  Another class of global learning algorithms that has received much attention in recent years is evolutionary and genetic algorithms. Inspired by the process of natural selection, these algorithms, in the context of artificial neural networks, aim to optimize the network weights by simulating an evolutionary process. In a genetic algorithm, a population of neural networks is initialized with random weights, and each network is evaluated on a specific task or problem. The networks that perform better on the task are selected for reproduction, producing offspring with small random changes (mutations) in their weights, and the process is repeated over many generations. Evolutionary strategies are similar but instead approximate a stochastic gradient: the weights are perturbed, and the parameter update combines the perturbations weighted by the network's performance on the objective function. This yields a more global exploration of the parameter space than local, gradient-based search methods such as backpropagation, which can help in finding good solutions.
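A compact sketch of this perturbation-based, gradient-free search (in the spirit of evolution strategies) is shown below on a toy objective. The fitness function and hyperparameters are assumptions made for the example; note that the objective is only ever evaluated, never differentiated.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(params):
    """Any black-box score works here; its gradient is never used."""
    return -np.sum((params - 3.0) ** 2)         # maximum at params == 3

theta = np.zeros(10)                             # flattened "network weights"
pop_size, sigma, lr = 50, 0.1, 0.05

for generation in range(300):
    noise = rng.normal(size=(pop_size, theta.size))
    scores = np.array([fitness(theta + sigma * n) for n in noise])
    # Subtract the mean score (a baseline) and move theta toward
    # perturbations that scored better than average.
    scores -= scores.mean()
    theta += lr / (pop_size * sigma) * noise.T @ scores

print("distance from optimum:", np.linalg.norm(theta - 3.0))
```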

One advantage of these algorithms is that they can efficiently search huge parameter spaces, making them suitable for problems with very many parameters or complex search landscapes. Furthermore, they do not require a differentiable objective function, which is useful when the objective is difficult to define or to differentiate (e.g., for spiking neural networks). However, these algorithms also have drawbacks. A major limitation is the high computational cost of evaluating and evolving a large population of networks. Another challenge is that the algorithms can get stuck in local optima or converge prematurely, leading to suboptimal solutions. In addition, the use of random mutations can introduce instability and unpredictability into the learning process.

Regardless, evolutionary and genetic algorithms have shown promising results in various applications, especially when optimizing non-differentiable and non-trivial parameter spaces. Ongoing research is focused on improving the efficiency and scalability of these algorithms, as well as discovering when and where it makes sense to use these methods instead of gradient descent.

Brain-inspired representations for learning in artificial neural networks

Local Learning Algorithms  Unlike global learning algorithms (such as backpropagation), which require information to propagate through the entire network, local learning algorithms focus on updating synaptic weights based on local information from nearby or synaptically connected neurons. These approaches are often strongly inspired by biological synaptic plasticity. As we will see, by utilizing local learning algorithms, artificial neural networks can learn more efficiently and adapt to changing input distributions, making them more suitable for real-world applications. In this section, we review recent advances in brain-inspired local learning algorithms and their potential to improve the performance and robustness of artificial neural networks.

Backpropagation-derived local learning  Backpropagation-derived local learning algorithms are a class of algorithms that attempt to emulate the mathematical properties of backpropagation. Unlike the traditional backpropagation algorithm, which propagates error signals backward through the entire network, backpropagation-derived local learning algorithms update synaptic weights using locally available signals that approximate the error gradients backpropagation would compute. This approach is computationally efficient and allows online learning, making it suitable for applications where training data arrive continuously.

A prominent example of a backpropagation-derived local learning algorithm is Feedback Alignment (FA). FA replaces the transpose of the forward weight matrix used in the backward pass of backpropagation with a fixed random matrix, so that the error signal can be delivered to earlier layers without requiring symmetric feedback connections. A simple mathematical description of feedback alignment is as follows: let Wout be the weight matrix connecting the last layer of the network to the output, and Win the weight matrix connecting the input to the first layer. In feedback alignment, the error signal is projected from the output back toward the input using a fixed random matrix B rather than the transpose of Wout. The weight update is then computed from the product of the input and this projected error signal, ΔWin = η·x·z, where x is the input, η is the learning rate, and z is the error signal carried backward through the network, analogous to traditional backpropagation.

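As an illustration of the mechanism, the sketch below trains the same kind of toy network as the backpropagation example above, but in the backward pass the transpose of the forward weights is replaced by a fixed random feedback matrix B. All names, the task, and the hyperparameters are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, size=(256, 1))
y = np.sin(x)

w1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
w2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)
B = rng.normal(scale=0.5, size=(1, 16))    # fixed random feedback matrix (replaces w2.T)
alpha = 0.1

for epoch in range(5000):
    h = np.tanh(x @ w1 + b1)               # forward pass
    y_hat = h @ w2 + b2
    err = (y_hat - y) / len(x)             # output error

    grad_w2 = h.T @ err                    # the output layer still uses its local gradient
    grad_b2 = err.sum(axis=0)
    # Feedback alignment: project the error backward with the fixed random
    # matrix B instead of the transpose of the forward weights w2.
    grad_h_pre = (err @ B) * (1.0 - h ** 2)
    grad_w1 = x.T @ grad_h_pre
    grad_b1 = grad_h_pre.sum(axis=0)

    w1 -= alpha * grad_w1; b1 -= alpha * grad_b1
    w2 -= alpha * grad_w2; b2 -= alpha * grad_b2

print("final loss:", 0.5 * np.mean((np.tanh(x @ w1 + b1) @ w2 + b2 - y) ** 2))
```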

Direct Feedback Alignment (DFA) simplifies the feedback pathway compared with FA by projecting the output-layer error directly to each hidden layer. The sign-symmetry (SS) algorithm is similar to FA except that the feedback weights share the signs of the forward weights. While FA has shown impressive results on small datasets such as MNIST and CIFAR, its performance on large datasets such as ImageNet is often suboptimal. Recent studies have shown, however, that the SS algorithm can achieve performance comparable to backpropagation even on large-scale datasets.

Eligibility propagation (e-prop) extends the idea of feedback alignment to spiking neural networks, combining the advantages of traditional error backpropagation with biologically plausible learning rules such as spike-timing-dependent plasticity (STDP). For each synapse, the e-prop algorithm computes and maintains an eligibility trace that captures the synapse's recent contribution to the activity of its postsynaptic neuron. This local trace is combined with a top-down learning signal derived from the error of the output neurons, obtained either through symmetric feedback weights or through fixed random feedback weights, as in feedback alignment. A possible disadvantage of e-prop is that it requires a real-time error signal Lt at every point in time, since it only considers past events and is blind to future errors. In particular, it cannot learn from delayed error signals that extend beyond the time horizon of individual neurons (including their short-term adaptation), in contrast to REINFORCE and node-perturbation methods.
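The sketch below conveys the flavor of this idea in a deliberately simplified, non-spiking setting: each presynaptic input keeps a leaky eligibility trace of its recent activity, and a per-neuron, top-down learning signal converts that trace into a weight change. It is not the full e-prop derivation; the toy task and all parameters are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 8, 4
w = np.zeros((n_post, n_pre))
w_target = rng.normal(size=(n_post, n_pre))      # toy task: imitate a target mapping
trace = np.zeros(n_pre)                          # eligibility trace per presynaptic input
eta, trace_decay = 0.01, 0.7

for t in range(3000):
    x = (rng.random(n_pre) < 0.3).astype(float)  # presynaptic spike pattern
    # Local eligibility: a leaky accumulation of recent presynaptic activity.
    trace = trace_decay * trace + x
    y = w @ trace                                # postsynaptic activity from filtered input
    learning_signal = y - w_target @ trace       # per-neuron, top-down error signal
    # Weight change = local eligibility trace combined with the broadcast error.
    w -= eta * np.outer(learning_signal, trace)

print("remaining error:", np.linalg.norm(w - w_target))
```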

Building on this line of work, one study examined the structure of neuronal signaling through a normative theory of synaptic learning informed by recent genetic findings. The authors propose that neurons communicate their contribution to the learning outcome to nearby neurons via cell-type-specific local neuromodulation, and that neuron-type diversity and neuron-type-specific local neuromodulation may be key pieces of the biological credit-assignment puzzle. They develop a simplified computational model based on eligibility propagation to explore this theory and show that their model, which includes dopamine-like temporal-difference signals and neuropeptide-like local modulatory signals, improves on previous methods such as e-prop and feedback alignment.

Generalization properties  The deep learning community has made tremendous progress in understanding the generalization properties of its learning algorithms. A particularly useful finding is that flat minima tend to lead to better generalization: for a given perturbation ϵ in parameter space (the synaptic weight values), a sharper minimum shows a more pronounced drop in performance. Learning algorithms that find flatter minima in parameter space therefore tend to generalize better.

Recent work has explored the generalization properties exhibited by local learning rules derived by (brain-inspired) backpropagation. Compared with backpropagation through time, the local learning rules derived by backpropagation exhibit poorer and more variable generalization, which cannot be improved by scaling the step size since the gradient approximation is less consistent with the true gradient. While it may not be surprising that local approximations of optimization procedures have poorer generalization properties than their full counterparts, this work opens the door to asking new questions about what is the best way to design brain-inspired learning algorithms. This also raises the question whether backpropagation-derived local learning rules are worth exploring, since they fundamentally would exhibit subpar generalization.

In conclusion, while backpropagation-derived local learning rules are a promising approach for designing brain-inspired learning algorithms, they have limitations that must be addressed. The poorer generalization of these algorithms highlights the need for further research to improve their performance and to explore alternative brain-inspired learning rules.

Meta-optimized plasticity rules  Meta-optimized plasticity rules provide an effective balance between error-driven global learning and brain-inspired local learning. Meta-learning can be defined as the automated search for the learning algorithm itself, rather than relying on human design. A parameterization of the learning algorithm is defined, and a search process is employed to find the algorithm that performs best. The idea of meta-learning extends naturally to brain-inspired learning algorithms: the brain-inspired learning mechanism itself can be optimized, allowing more efficient learning to be discovered without hand-tuning the rules. In the following sections, we discuss aspects of this research, starting with synaptic plasticity rules optimized by differentiation.

Differentiable Plasticity   One example of this principle in the literature is differentiable plasticity, a framework that focuses on optimizing synaptic plasticity rules in neural networks via gradient descent. In this framework, the plasticity rules are expressed so that the parameters governing their dynamics are differentiable, allowing backpropagation to be used to meta-optimize the plasticity parameters (e.g., the η term in the simple Hebbian rule or the A+ term in the STDP rule). This allows the plastic weights to adapt dynamically during execution, so that the network can solve tasks that require the weights to be optimized at run time, a setting often referred to as within-lifetime learning.
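A minimal PyTorch sketch of this idea is shown below: the effective weight of each connection is a fixed component plus a fast Hebbian trace scaled by a learned plasticity coefficient, and backpropagation through the inner-loop dynamics meta-optimizes the plasticity parameters themselves. The pattern-recall task, architecture, and hyperparameters are illustrative assumptions in the spirit of the differentiable-plasticity literature, not a reproduction of any specific experiment.

```python
import torch

torch.manual_seed(0)
n = 10                                            # pattern size

# Meta-parameters optimized by backpropagation: baseline weights w, per-synapse
# plasticity coefficients alpha, and the Hebbian trace learning rate eta.
w = torch.zeros(n, n, requires_grad=True)
alpha = torch.full((n, n), 0.1, requires_grad=True)
eta = torch.tensor(0.5, requires_grad=True)
optimizer = torch.optim.Adam([w, alpha, eta], lr=0.01)

for episode in range(500):
    pattern = torch.sign(torch.randn(n))          # pattern to memorize this episode
    hebb = torch.zeros(n, n)                      # fast Hebbian trace, reset each episode

    # Inner loop ("lifetime"): present the pattern a few times; the Hebbian
    # trace changes at every step, so the effective weights are plastic.
    for step in range(5):
        y = torch.tanh(pattern @ (w + alpha * hebb))
        hebb = (1 - eta) * hebb + eta * torch.outer(pattern, y)

    # Query with a degraded cue and ask the plastic network for the full pattern.
    cue = pattern.clone()
    cue[: n // 2] = 0.0
    recall = torch.tanh(cue @ (w + alpha * hebb))
    loss = torch.mean((recall - pattern) ** 2)

    # Outer loop: backpropagate through the plastic dynamics to update the
    # plasticity parameters themselves (meta-optimization).
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        eta.clamp_(0.0, 1.0)                      # keep the trace decay well-behaved

print("final recall loss:", loss.item())
```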

Differentiable plasticity rules also enable differentiable optimization of neuromodulatory dynamics. This framework includes two main variants of neuromodulation: global neuromodulation, in which the direction and magnitude of weight changes are controlled by global parameters related to the network output, and retrospective neuromodulation, in which the effects of past activity are modulated by dopamine-like signals over short time windows. This was achieved using eligibility traces, which are used to track which synapses contributed to recent activity, and dopamine signaling mediates the conversion of these traces to actual plastic changes.

Methods involving differentiable plasticity have shown improvements across a wide range of problems, including sequential associative tasks, familiarity detection, and robotic noise adaptation. This approach has also been used to optimize short-term plasticity rules, which show improved performance in reinforcement learning and temporally extended supervised learning problems. While these methods show great promise, differentiable plasticity approaches require a large amount of memory, because backpropagation is used to optimize multiple parameters per synapse across time. Practical progress in these methods will likely require parameter sharing or a more memory-efficient form of backpropagation.

Plasticity in spiking neurons   Recent advances in backpropagating through the non-differentiable part of spiking neurons using surrogate gradients have made it possible to use differentiable plasticity to optimize plasticity rules in spiking neural networks. The power of this optimization paradigm has been demonstrated by using differentiable spike-timing-dependent plasticity rules to "learn how to learn" on online one-shot continual learning problems and online one-shot image class recognition problems. A similar approach has been used to optimize the third-factor signal using the gradient approximation of e-prop as the plasticity rule, yielding a meta-optimized form of e-prop. Evolutionary adjustment of recurrent neural network parameters can also be used to meta-optimize learning rules. Evolvable Neural Units (ENUs) introduce gating structures that control how inputs are processed, how they are stored, and how dynamic parameters are updated. This work demonstrates the evolution of models of individual neuronal somas and synaptic compartments and shows that a network of such units can learn to solve a T-maze environment task, independently discovering spiking dynamics and reinforcement-learning-style rules.

Plasticity in RNNs and Transformers   Independently of studies that aim to learn plasticity in the form of explicit update rules, Transformers have recently been shown to be good within-lifetime (in-context) learners. In-context learning does not occur through updates to synaptic weights, but purely within the network activations. As with Transformers, this process can also occur in recurrent neural networks. Although in-context learning appears to be a mechanism distinct from synaptic plasticity, the two processes have been shown to exhibit strong parallels. One exciting connection discussed in the literature is the observation that parameter sharing in meta-learners often leads to activations being interpretable as weights; this suggests that although such models have fixed weights, they can exhibit the same learning capabilities as models with plastic weights. Another connection is that self-attention in the Transformer involves outer and inner products that can be cast as learned weight updates, or even as implementing gradient descent.

Evolutionary and genetic meta-optimization   Much as with differentiable plasticity, evolutionary and genetic algorithms have been used to optimize the parameters of plasticity rules for various applications, including adaptation to limb damage in robotic systems. Recent work has also proposed an automated method for discovering biologically plausible plasticity rules tailored to the specific task being solved, optimizing both the plasticity coefficients and the form of the plasticity rule equations using Cartesian genetic programming. In these methods, the genetic or evolutionary optimization plays the same role as in the differentiable case: it optimizes the plasticity parameters in an outer-loop process, while the plasticity rule optimizes the reward in an inner-loop process. These methods are attractive because they have a much lower memory footprint than differentiable approaches, since they do not need to backpropagate errors over time. However, while memory-efficient, they often require large amounts of data to reach performance comparable to gradient-based methods.

Self-referential meta-learning   Synaptic plasticity approaches involve two levels of learning: the meta-learner and the discovered learning rule. Self-referential meta-learning extends this hierarchy. In plasticity methods, only a subset of the network parameters (such as the synaptic weights) are updated, while the meta-learned update rules remain fixed after meta-optimization.


Figure 3. Feed-forward neural networks compute their output for a given input by propagating the input information downstream. The precise value of the output is determined by the values of the synaptic weights, and to improve the output for a task, the synaptic weights are modified. Synaptic plasticity algorithms are computational models that mimic the brain's ability to strengthen or weaken synapses (connections between neurons) based on neural activity, thereby facilitating learning and memory formation. Three-factor plasticity refers to models in which changes in connection strength are determined by three factors: presynaptic activity, postsynaptic activity, and a modulatory signal, enabling a more nuanced and adaptive learning process. The feedback alignment algorithm is a learning technique in which artificial neural networks are trained using random, fixed feedback connections instead of symmetric weight matrices, demonstrating that successful learning can occur without precise backpropagation of errors. Backpropagation is a fundamental algorithm in machine learning and artificial intelligence, used to train neural networks by computing the gradient of the loss function with respect to the network weights.

The self-referential architecture enables the neural network to modify all its parameters recursively. Therefore, the learner can also modify the meta-learner. This allows in principle arbitrary levels of learning, meta-learning, meta-meta-learning, etc. Some methods meta-learn parameter initialization for such systems. Finding this initialization still requires a hardwired meta-learner. In other works, networks modify themselves in ways that even eliminate such meta-learners. Sometimes the learning rules to be discovered have a structural search space restriction, which simplifies self-improvement where a gradient-based optimizer can discover itself or an evolutionary algorithm can optimize itself. Despite their differences, both synaptic plasticity and self-referential approaches aim at self-improvement and adaptation of neural networks.

Generalization of meta-optimized learning rules  The extent to which discovered learning rules generalize to a wide range of tasks is an important open question; in particular, when should they replace hand-designed, general-purpose learning rules such as backpropagation? A challenge for these methods is that generalization becomes harder as the search space grows and constraints on the learning mechanism are relaxed. To counteract this, recent work meta-learns flexible learning rules parameterized by a parameter-sharing recurrent neural network whose units locally exchange information, implementing a learning algorithm that generalizes to classification problems not seen during meta-optimization. Similar results have been reported for discovered reinforcement learning algorithms.

Brain Inspired Learning Applications

Neuromorphic computing  Neuromorphic computing represents a paradigm shift in computing system design, with the goal of creating hardware that mimics the functional architecture of a biological brain. This approach aims to develop artificial neural networks that not only replicate the brain's ability to learn, but also its energy efficiency and inherent parallelism. Neuromorphic computer systems often incorporate specialized hardware, such as neuromorphic chips or memristive devices, to enable efficient execution of brain-inspired learning algorithms. These systems have the potential to greatly improve the performance of machine learning applications, especially in edge computing and real-time processing scenarios.

A key aspect of neuromorphic computing lies in the development of specialized hardware architectures that facilitate the implementation of spiking neural networks, which more closely resemble the information processing mechanisms of biological neurons. Neuromorphic systems operate on brain-inspired local learning principles, which enable them to achieve energy-efficient, low-latency processing, and robustness against noise, which are critical for real-world applications. The integration of brain-inspired learning technology with neuromorphic hardware is critical for the successful application of this technology.

In recent years, advances in neuromorphic computing have led to the development of various platforms, such as Intel's Loihi, IBM's TrueNorth, and SpiNNaker, which provide specialized hardware architectures for implementing SNNs and brain-inspired learning algorithms. These platforms provide the basis for further exploration of neural computing systems, enabling researchers to design, simulate and evaluate new neural network structures and learning rules. As neuromorphic computing continues to advance, it is expected to play a key role in the future of artificial intelligence, driving innovation and enabling the development of more efficient, versatile, and biologically plausible learning systems.

Robot learning   Brain-inspired learning in neural networks has the potential to overcome many of the challenges currently facing robotics by enabling robots to learn about and adapt to their environment in a more flexible manner. Traditional robotic systems rely on preprogrammed behaviors that are limited in their ability to adapt to changing conditions. In contrast, as we have shown in this review, neural networks can be trained to adapt to new situations by adjusting their internal parameters in response to the data they receive.

Because of their natural relationship to embodied systems, brain-inspired learning algorithms have a long history in robotics. Synaptic plasticity rules have been introduced to adapt robot behavior to changes in its domain, such as adjusting locomotor gaits to rough terrain, as well as for obstacle avoidance and robotic arm control. Brain-inspired learning rules have also been used to explore how learning occurs in insect brains, using robotic systems as embodied vehicles.

Deep reinforcement learning (DRL) represents a major success of brain-inspired learning algorithms, combining the strengths of neural networks with the theory of reinforcement learning in the brain to create autonomous agents capable of learning complex behaviors by interacting with their environments. By exploiting a reward-driven learning process that mimics the activity of dopamine neurons as opposed to minimizing e.g. classification or regression errors, the DRL algorithm guides robots to learn optimal strategies to achieve their goals, even in highly dynamic and uncertain environments. This powerful approach has been demonstrated in various robotics applications, including dexterous manipulation, robotic locomotion, and multi-agent coordination.

Lifelong learning and online learning   Lifelong learning and online learning are important applications of brain-inspired learning in artificial intelligence because they enable systems to adapt to changing environments and continuously acquire new skills and knowledge. In contrast, traditional machine learning methods are usually trained on fixed datasets and lack the ability to adapt to new information or changing environments. The mature brain is an incredible medium for lifelong learning because it learns continuously throughout life while remaining relatively constant in size. As this review demonstrates, similar to the brain, neural networks endowed with brain-inspired learning mechanisms can be trained to continuously learn and adapt, improving their performance over time.

The development of brain-inspired learning algorithms that give artificial systems this ability has the potential to significantly enhance their performance and capabilities, with broad implications for a variety of applications. Such capabilities are particularly useful in situations where data are scarce or expensive to collect, as in robotics or autonomous systems, because they allow a system to learn and adapt in real time rather than having to collect and process large amounts of data before learning can occur.

A major goal in the field of lifelong learning is to alleviate a major problem associated with the continuous application of backpropagation in artificial neural networks, a phenomenon known as catastrophic forgetting. Catastrophic forgetting refers to the tendency of artificial neural networks to suddenly forget previously learned information when learning new data. This happens because weights in networks originally optimized for earlier tasks are overhauled to accommodate new learning, erasing or overwriting previous information. This is because the backpropagation algorithm, while facilitating new learning, does not inherently take into account the need to preserve previously acquired information. Solving this problem has been a major hurdle in the field of artificial intelligence for decades. We hypothesize that by using brain-inspired learning algorithms that mimic the dynamic learning mechanisms of the brain, we may be able to exploit the skilled problem-solving strategies inherent in biological organisms.

Understanding the Brain   The worlds of artificial intelligence and neuroscience have benefited greatly from each other. Deep neural networks tailored to certain tasks show striking similarities to the human brain in the way they process spatial and visual information. This overlap hints at the potential of artificial neural networks (ANNs) as useful models for helping us understand the brain's complex mechanisms. An emerging movement known as the neuroconnectionist research programme embodies this combined approach, using artificial neural networks as a computational language for forming and testing ideas about how the brain computes. This perspective brings together disparate research efforts, providing a general computational framework and the tools to test specific theories about the brain.

While this review highlights a range of algorithms that mimic brain function, we still have a lot of work to do to fully grasp how learning actually happens in the brain. Training large neural networks using backpropagation and backpropagation-like local learning rules can provide a good starting point for modeling brain function. A great deal of fruitful research has been done to understand which processes in the brain operate similarly to backpropagation, leading to new ideas and theories in neuroscience. Although backpropagation in its current form may not occur in the brain, the idea that the brain might develop an internal representation similar to that of an artificial neural network despite such different learning mechanisms is an exciting open problem that could lead to a deeper understanding of the brain and artificial intelligence.

Exploration is now expanding beyond static network dynamics to uncovering temporal functions of networks, like the brain. As we further develop algorithms in continual and lifelong learning, it may become clear that our models need to more closely mirror learning mechanisms observed in nature. This shift in focus requires the integration of local learning rules—those that reflect the brain's own methods—into artificial neural networks.

We are convinced that adopting more biologically realistic learning rules in artificial neural networks will not only yield the benefits described above but also point neuroscience researchers in useful directions. In other words, this is a strategy with a double payoff: it promises to inspire innovations in engineering, and it brings us closer to unraveling the intricate processes at work in the brain. Armed with more realistic models, we can probe the computational complexity of the brain more deeply from a new, AI-informed perspective.

Conclusion

In this review, we investigate the integration of more biologically plausible learning mechanisms into artificial neural networks. This further integration is an important step for both neuroscience and artificial intelligence. This is especially relevant to the tremendous advances in AI for large language models and embedded systems, which urgently require more energy-efficient learning and execution methods. Furthermore, while artificial neural networks have made great progress in these applications, their ability to adapt like biological brains remains highly limited, which we believe is a major application of brain-inspired learning mechanisms.

As we strategize for future collaborations between neuroscience and AI on more detailed brain-inspired learning algorithms, it is important to acknowledge that neuroscience's past influence on AI has rarely been about the direct application of off-the-shelf solutions to machines. More commonly, neuroscience stimulates AI researchers by asking interesting, algorithmic-level questions about aspects of animal learning and intelligence. It provides initial guidance on important mechanisms that support learning. Our point is that by leveraging insights from neuroscience, we can greatly accelerate the progress of learning mechanisms used in artificial neural networks. Likewise, experiments using brain-inspired learning algorithms in artificial intelligence could accelerate our understanding of neuroscience.

Acknowledgments

We thank the OpenBioML collaborative workspace, through which several authors of this work are connected. This material is based upon work supported by a National Science Foundation Graduate Research Fellowship (Grant No. DGE2139757).

References

1. Newell, K. M., Liu, Y.-T. & Mayer-Kress, G. Time scales in motor learning and development. Psychological Review 108, 57 (2001).

2. Stokes, M. G. 'Activity-silent' working memory in prefrontal cortex: a dynamic coding framework. Trends in Cognitive Sciences 19, 394–405 (2015).

3. Gerstner, W., Lehmann, M., Liakoni, V., Corneil, D. & Brea, J. Eligibility traces and plasticity on behavioral time scales: experimental support for neoHebbian three-factor learning rules. Frontiers in Neural Circuits 12, 53 (2018).

4. Beltagy, I., Lo, K. & Cohan, A. SciBERT: a pretrained language model for scientific text. arXiv preprint arXiv:1903.10676 (2019).

5. Brown, T. B. et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020).

Due to limited space, references 6-149 are omitted here

Original paper link:

https://arxiv.org/pdf/2305.11252.pdf

Editor: Huang Jiyan

Proofreading: Wang Yuqing



Origin blog.csdn.net/tMb8Z9Vdm66wH68VX1/article/details/131862242