Artificial neuron
An artificial neuron is a mathematical function conceived as a model of a biological neuron in a neural network. The artificial neuron is the elementary unit of an artificial neural network.
The design of the artificial neuron was inspired by biological neural circuitry. Its inputs are analogous to excitatory postsynaptic potentials and inhibitory postsynaptic potentials at neural dendrites, its weights are analogous to synaptic weights, and its output is analogous to a neuron's action potential, which is transmitted along its axon.
Usually, each input is separately weighted, and the weighted sum is often added to a term known as a bias before being passed through a nonlinear function known as an activation function. Depending on the task, these functions could have a sigmoid shape, but they may also take the form of other nonlinear functions, piecewise linear functions, or step functions. They are also often monotonically increasing, continuous, differentiable, and bounded. Non-monotonic, unbounded, and oscillating activation functions with multiple zeros that outperform sigmoidal and ReLU-like activation functions on many tasks have also been recently explored. The threshold function has inspired the building of logic gates referred to as threshold logic, applicable to building logic circuits resembling brain processing. For example, new devices such as memristors have been extensively used to develop such logic.
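As a rough, library-free sketch of the activation-function shapes just mentioned (the function names and sample inputs are arbitrary illustrative choices, not taken from any particular source):

```python
import math

def step(x, threshold=0.0):
    """Heaviside step: outputs 1 once the input reaches the threshold, else 0."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Smooth, bounded, monotonically increasing S-shaped function."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Piecewise linear and unbounded: 0 for negative inputs, identity otherwise."""
    return max(0.0, x)

for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), relu(z))
```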
The artificial neuron activation function should not be confused with a linear system's transfer function.
An artificial neuron may be referred to as a semi-linear unit, Nv neuron, binary neuron, linear threshold function, or McCulloch–Pitts 'neuron', depending on the structure used.
Simple artificial neurons, such as the McCulloch–Pitts model, are sometimes described as "caricature models", since they are intended to reflect one or more neurophysiological observations, but without regard to realism. Artificial neurons can also refer to artificial cells in neuromorphic engineering that are similar to natural physical neurons.
Basic structure
For a given artificial neuron $k$, let there be $m + 1$ inputs with signals $x_0$ through $x_m$ and weights $w_{k0}$ through $w_{km}$. Usually, the $x_0$ input is assigned the value $+1$, which makes it a bias input with $w_{k0} = b_k$. This leaves only $m$ actual inputs to the neuron: $x_1$ to $x_m$.

The output of the $k$-th neuron is:

$$y_k = \varphi\left(\sum_{j=0}^{m} w_{kj} x_j\right)$$

where $\varphi$ is the activation function.
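A minimal sketch of this formula in Python (the input values, weights, and threshold activation below are hypothetical, chosen only to show the computation):

```python
def neuron_output(inputs, weights, activation):
    """y_k = phi(sum_j w_kj * x_j), where inputs[0] is the +1 bias input."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return activation(total)

# Hypothetical numbers: two real inputs plus the bias input x_0 = +1.
x = [1.0, 0.5, -0.3]   # x_0 (bias), x_1, x_2
w = [0.1, 0.8, 0.4]    # w_k0 (= b_k), w_k1, w_k2
y = neuron_output(x, w, lambda s: 1.0 if s >= 0 else 0.0)  # threshold activation
print(y)  # 1.0, since 0.1 + 0.4 - 0.12 = 0.38 >= 0
```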
The output is analogous to the axon of a biological neuron, and its value propagates to the input of the next layer, through a synapse. It may also exit the system, possibly as part of an output vector.
Such a neuron has no learning process as such: its weights are calculated in advance, and its threshold value is predetermined.
McCulloch–Pitts (MCP) neuron
An MCP neuron is a kind of restricted artificial neuron which operates in discrete time steps. Each has zero or more inputs, written as $x_1, x_2, \dots, x_n$. It has one output, written as $y$. Each input can be either excitatory or inhibitory. The output can either be quiet ($0$) or firing ($1$). An MCP neuron also has a threshold $b$.

In an MCP neural network, all the neurons operate in synchronous discrete time steps $t = 0, 1, 2, 3, \dots$. At time $t + 1$, the output of the neuron is $y(t+1) = 1$ if the number of excitatory inputs firing at time $t$ is at least equal to the threshold $b$ and no inhibitory inputs are firing; otherwise $y(t+1) = 0$.
Each output can be the input to an arbitrary number of neurons, including itself. However, an output cannot connect more than once with a single neuron. Self-loops do not cause contradictions, since the network operates in synchronous discrete time-steps.
As a simple example, consider a single neuron with threshold 0, and a single inhibitory self-loop. Its output would oscillate between 0 and 1 at every step, acting as a "clock".
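A small, self-contained simulation of these update rules, using the "clock" neuron above as the usage example (the function and variable names are illustrative, not from the original paper):

```python
def mcp_output(excitatory, inhibitory, threshold):
    """One MCP update: fire (1) iff no inhibitory input fires and
    the number of firing excitatory inputs is at least the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# "Clock": a single neuron with threshold 0 whose own output feeds back
# as its only (inhibitory) input.
y = 0
trace = []
for t in range(6):
    y = mcp_output(excitatory=[], inhibitory=[y], threshold=0)
    trace.append(y)
print(trace)  # alternates every step: [1, 0, 1, 0, 1, 0]
```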
Any finite state machine can be simulated by an MCP neural network. Furnished with an infinite tape, MCP neural networks can simulate any Turing machine.
Biological models
Artificial neurons are designed to mimic aspects of their biological counterparts. However, a significant performance gap exists between biological and artificial neural networks. In particular, single biological neurons in the human brain with oscillating activation functions capable of learning the XOR function have been discovered.
- Dendrites – in biological neurons, dendrites act as the input vector. These dendrites allow the cell to receive signals from a large number of neighboring neurons. As in the above mathematical treatment, each dendrite is able to perform "multiplication" by that dendrite's "weight value." The multiplication is accomplished by increasing or decreasing the ratio of synaptic neurotransmitters to signal chemicals introduced into the dendrite in response to the synaptic neurotransmitter. A negative multiplication effect can be achieved by transmitting signal inhibitors along the dendrite in response to the reception of synaptic neurotransmitters.
- Soma – in biological neurons, the soma acts as the summation function, seen in the above mathematical description. As positive and negative signals arrive in the soma from the dendrites, the positive and negative ions are effectively added in summation, by simple virtue of being mixed together in the solution inside the cell's body.
- Axon – the axon gets its signal from the summation behavior which occurs inside the soma. The opening to the axon essentially samples the electrical potential of the solution inside the soma. Once the soma reaches a certain potential, the axon transmits an all-or-nothing signal pulse down its length. In this regard, the axon serves as the connection from our artificial neuron to other artificial neurons.
Encoding
Research has shown that unary coding is used in the neural circuits responsible for birdsong production. The use of unary coding in biological networks is presumably due to the inherent simplicity of the coding. Another contributing factor could be that unary coding provides a certain degree of error correction.

Physical artificial cells
There is research and development into physical artificial neurons – organic and inorganic. For example, some artificial neurons can receive and release dopamine and communicate with natural rat muscle and brain cells, with potential for use in BCIs/prosthetics.
Low-power biocompatible memristors may enable construction of artificial neurons which function at voltages of biological action potentials and could be used to directly process biosensing signals, for neuromorphic computing and/or direct communication with biological neurons.
Organic neuromorphic circuits made out of polymers, coated with an ion-rich gel to enable a material to carry an electric charge like real neurons, have been built into a robot, enabling it to learn sensorimotorically within the real world, rather than via simulations or virtually. Moreover, artificial spiking neurons made of soft matter can operate in biologically relevant environments and enable the synergetic communication between the artificial and biological domains.
History
The first artificial neuron was the Threshold Logic Unit, or Linear Threshold Unit, first proposed by Warren McCulloch and Walter Pitts in 1943 in A logical calculus of the ideas immanent in nervous activity. The model was specifically targeted as a computational model of the "nerve net" in the brain. As an activation function, it employed a threshold, equivalent to using the Heaviside step function. Initially, only a simple model was considered, with binary inputs and outputs, some restrictions on the possible weights, and a more flexible threshold value. From the beginning it was already noticed that any Boolean function could be implemented by networks of such devices, which is easily seen from the fact that one can implement the AND and OR functions and combine them in the disjunctive or the conjunctive normal form.

Researchers also soon realized that cyclic networks, with feedback through neurons, could define dynamical systems with memory, but most of the research concentrated on strictly feed-forward networks because of the smaller difficulty they present.
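To make the Boolean-function claim above concrete, here is a sketch of threshold logic units computing AND, OR, and NOT on binary inputs; the particular weights and thresholds are one illustrative choice among several that work:

```python
def tlu(inputs, weights, threshold):
    """Threshold logic unit: fires (1) iff the weighted sum reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def AND(a, b):
    return tlu([a, b], [1, 1], threshold=2)

def OR(a, b):
    return tlu([a, b], [1, 1], threshold=1)

def NOT(a):
    return tlu([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), NOT(a))
```

Combining such gates in disjunctive or conjunctive normal form then yields any Boolean function.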
One important and pioneering artificial neural network that used the linear threshold function was the perceptron, developed by Frank Rosenblatt. This model already considered more flexible weight values in the neurons, and was used in machines with adaptive capabilities. The representation of the threshold values as a bias term was introduced by Bernard Widrow in 1960 – see ADALINE.
A further development was the Hebbian Learning Rule, proposed by Donald O. Hebb, which provided a fundamental rule for adjusting the weights in neural networks. The principle of Hebbian learning posits that the connection between two neurons strengthens if they activate simultaneously and weakens if they activate separately. A refinement of Hebbian learning, known as spike-timing-dependent plasticity, was developed to account for the precise timing of neuron spikes. This form of learning has been implemented in spiking neural networks, which are believed to be more energy-efficient than traditional ANNs and require less energy for transmission since they process data based on the occurrence of events rather than continuous computation.
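A minimal sketch of the basic Hebbian update, assuming the common formulation Δw_j = η·x_j·y (the learning rate and toy values below are arbitrary):

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Hebb's rule: dw_j = eta * x_j * y, strengthening connections
    whose input is active while the neuron's output is active."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]

w = [0.0, 0.0]
x = [1.0, 0.0]   # only the first input is active
y = 1.0          # the neuron fires
for _ in range(3):
    w = hebbian_update(w, x, y)
print(w)         # roughly [0.3, 0.0]: only the co-active connection grows
```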
In the late 1980s, when research on neural networks regained strength, neurons with more continuous shapes started to be considered. The possibility of differentiating the activation function allows the direct use of gradient descent and other optimization algorithms for the adjustment of the weights. Neural networks also started to be used as a general function approximation model. The best-known training algorithm, called backpropagation, has been rediscovered several times, but its first development goes back to the work of Paul Werbos.
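To illustrate why a differentiable activation enables gradient-based weight adjustment, here is a toy single-neuron example: gradient descent on squared error for a sigmoid unit learning an OR-like mapping. This is not full multi-layer backpropagation, and the data, learning rate, and epoch count are arbitrary choices for the sketch:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Each sample is ([bias=1, x1, x2], target); targets follow the OR function.
data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
w = [0.0, 0.0, 0.0]
eta = 0.5

for epoch in range(2000):
    for x, target in data:
        z = sum(wi * xi for wi, xi in zip(w, x))
        y = sigmoid(z)
        # dE/dw_j for E = (y - target)^2 / 2, using sigmoid'(z) = y * (1 - y)
        grad = [(y - target) * y * (1 - y) * xi for xi in x]
        w = [wi - eta * g for wi, g in zip(w, grad)]

# After training, the outputs approach the 0/1 targets.
print([round(sigmoid(sum(wi * xi for wi, xi in zip(w, x))), 2) for x, _ in data])
```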