The logic of thinking. Part 3: Perceptron, convolutional network





The first part described the properties of neurons; the second talked about the basic properties related to their training. In the next part we will move on to describing how the real brain works. But before that we need to make one final effort and take in a bit more theory. Right now it will most likely seem not particularly interesting. I think I would have downvoted such a tutorial post myself. But all this "alphabet" will greatly help us make sense of things later.

Perceptron
Machine learning distinguishes two basic approaches: supervised learning and unsupervised learning. The methods described in the previous parts, which isolate the principal components of the input, are unsupervised learning: the neural network receives no explanation of what is fed to its input, it simply extracts whatever statistical regularities are present in the input data stream. Supervised learning, in contrast, assumes that for some of the input images, called the training set, we know what output result we want to obtain. The task, accordingly, is to configure the neural network so that it captures the patterns connecting the input and output data.

In 1958, Frank Rosenblatt described a structure he called the perceptron (Rosenblatt, 1958), which is capable of supervised learning (see the lead image).

Rosenblatt's perceptron consists of three layers of neurons. The first layer is made up of sensory elements, which define what we have at the input. The second layer consists of associative elements. Their connections to the sensory layer are hard-wired and define the transition from the sensory picture to a more general, associative description of it.

The perceptron is trained by changing the weights of the neurons of the third, reacting layer. The goal of training is to make the perceptron correctly classify the images presented to it.

The neurons of the third layer act as threshold adders. Accordingly, the weights of each of them define a certain hyperplane. If the input signals are linearly separable, the output neurons can act as their classifiers.
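
In symbols (the notation here is ours, not from the original): a threshold adder computes

$$y = \begin{cases} 1, & \sum_i w_i x_i \ge T \\ 0, & \sum_i w_i x_i < T \end{cases}$$

and the boundary $\sum_i w_i x_i = T$ is precisely the hyperplane that separates the two classes.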

If $\mathbf{y}$ is the vector of the perceptron's actual output and $\mathbf{d}$ is the vector we expect to receive, then the quality of the network's performance is described by the error vector:

$$\mathbf{e} = \mathbf{d} - \mathbf{y}$$

If we require that the mean squared error be minimized, we can derive the so-called delta rule for modifying the weights:

$$w_i(t+1) = w_i(t) + \eta\,(d - y)\,x_i$$

where $\eta$ is the learning rate, $x_i$ is the $i$-th component of the input, $d$ is the desired output, and $y$ is the actual output.

In this case, zero weights can be taken as the initial approximation.
This rule is nothing more than the Hebb rule applied to the case of the perceptron.
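
As an illustration, here is a minimal sketch of the delta rule in Python. The logical AND task, the learning rate, and all variable names are illustrative choices, not part of the original description.

```python
import numpy as np

# A toy threshold adder trained with the delta rule on logical AND
# (a linearly separable problem).

def step(s):
    """Threshold activation: fires (1) if the weighted sum reaches the threshold."""
    return 1 if s >= 0 else 0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # input vectors
d = np.array([0, 0, 0, 1])                      # desired outputs (AND)

w = np.zeros(2)  # initial approximation: zero weights
b = 0.0          # bias term (moves the separating hyperplane)
eta = 0.1        # learning rate

for epoch in range(20):
    for x, target in zip(X, d):
        y = step(w @ x + b)
        e = target - y      # error signal e = d - y
        w += eta * e * x    # delta rule: dw_i = eta * e * x_i
        b += eta * e

print(w, b)  # weights of the learned separating hyperplane
```
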
If instead of a single output layer we take several reacting layers, and abandon the associative layer, which Rosenblatt introduced more for biological plausibility than out of computational necessity, we obtain the multilayer perceptron shown in the figure below.



A multilayer perceptron with two hidden layers (Haykin, 2006)

If the neurons of the reacting layers were simple linear adders, there would be little sense in such a complication: the output, regardless of the number of hidden layers, would still remain a linear combination of the input signals. But since threshold adders are used in the hidden layers, each new layer breaks this linearity and can carry its own interesting description.

For a long time it was not clear how a multilayer perceptron could be trained. The basic method, backpropagation, was described only in 1974 by A. I. Galushkin and, independently and simultaneously, by Paul J. Werbos. It was then rediscovered and became widely known in 1986 (David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams, 1986).

The method consists of two passes: forward and backward. In the forward pass, a training signal is fed in and the activity of all network nodes is computed, including the activity of the output layer. Subtracting the obtained activity from the required one yields an error signal. In the backward pass, the error signal propagates in the opposite direction, from the output to the input, and the synaptic weights are adjusted so as to minimize the error. A detailed description of the method can be found in a variety of sources (for example, Haykin, 2006).
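
Here is a from-scratch sketch of the two passes on the XOR problem, which a single-layer perceptron cannot solve. The 2-4-1 architecture, sigmoid activation, learning rate, and all names are illustrative choices for the sketch, not a prescribed setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)    # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)    # hidden -> output
eta = 0.5

for epoch in range(10000):
    # Forward pass: compute the activity of all nodes
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Error signal at the output
    e = d - y

    # Backward pass: propagate the error from output towards input
    g_out = e * y * (1 - y)                  # sigmoid derivative applied
    g_hid = (g_out @ W2.T) * h * (1 - h)

    # Adjust the synaptic weights to reduce the mean squared error
    W2 += eta * h.T @ g_out
    b2 += eta * g_out.sum(axis=0)
    W1 += eta * X.T @ g_hid
    b1 += eta * g_hid.sum(axis=0)

print(y.round(2).ravel())  # approaches [0, 1, 1, 0]
```
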

It is important to note that in a multilayer perceptron information is processed level by level, with each layer extracting its own set of features inherent in the input signal. This creates a certain analogy with the way information is transformed between the zones of the cerebral cortex.

Convolutional network. Neocognitron
Any comparison of the multilayer perceptron with the real brain is very conditional. What they have in common is that, rising from zone to zone in the cortex or from layer to layer in the perceptron, information takes on an ever more generalized description. However, the structure of a patch of cortex is far more complicated than the organization of neurons in a perceptron layer. The studies of the visual system by D. Hubel and T. Wiesel gave a better understanding of the structure of the visual cortex and prompted the use of this knowledge in neural networks. The main ideas borrowed were the locality of perception and the division of neurons by function within a single layer.

Locality of perception is already familiar to us: it means that a neuron receiving information watches not the entire space of input signals but only part of it. Earlier we said that this tracked area is called the neuron's receptive field.

The concept of a receptive field requires separate clarification. Traditionally, a neuron's receptive field is the space of receptors that affect the neuron's operation, where receptors are the neurons that directly perceive external signals. Imagine a network of two layers, the first being a layer of receptors and the second a layer of neurons connected to those receptors. For each neuron of the second layer, the receptors that have contact with it constitute its receptive field.

Now take a complex multilayer network. The further we move from the input, the harder it is to say which receptors affect, and how, the activity of a deeply located neuron. At some point it may even turn out that for some neuron all the existing receptors can be called its receptive field. In such a situation one would like to call a neuron's receptive field only the neurons with which it has direct synaptic contact. To separate these concepts, we will call the space of input receptors the original receptive field, and the space of neurons interacting directly with a given neuron its local receptive field, or simply its receptive field, without further qualification.

The division of neurons by function is connected with the discovery of two main types of neurons in the primary visual cortex. Simple neurons respond to a stimulus located in a particular place of their original receptive field. Complex neurons are activated by a stimulus regardless of its position.

For example, the figure below shows how the sensitivity maps of the original receptive fields of simple cells might look. Positive areas activate the neuron, negative ones suppress it. Each simple neuron has a stimulus that suits it best and accordingly causes maximal activity. But, importantly, that stimulus is rigidly tied to a position in the original receptive field; the same stimulus shifted to the side will not trigger the simple neuron.



Original receptive fields of simple cells (J. Nicholls, R. Martin, B. Wallace, P. Fuchs)

Complex neurons also have their preferred stimulus, but they are able to recognize it regardless of its position in the original receptive field.

From these two ideas, corresponding neural network models were born. The first such network was created by Kunihiko Fukushima and named the cognitron. Later he created a more advanced network, the neocognitron (Fukushima, 1980). The neocognitron is a construction of several layers, each consisting of simple (S) and complex (C) neurons.

A simple neuron watches its receptive field and recognizes the image it has been trained on. Simple neurons are gathered into groups, or planes. Within one plane the simple neurons are tuned to the same stimulus, but each neuron watches its own fragment of the receptive field; together they run through all possible positions of this image (see the figure below). All the simple neurons of one plane have the same weights but different receptive fields. One can picture this differently: it is as if a single neuron were able to try its image against every position of the original picture at once. All this allows the same image to be recognized regardless of its position.



Receptive fields of simple cells tuned to search for the chosen pattern in different positions (Fukushima K., 2013)

Each complex neuron watches its own plane of simple neurons and fires if at least one of the simple neurons in its plane is activated (see the figure below). The activity of a simple neuron says that it has recognized its characteristic stimulus in the particular place it watches, its receptive field; the activity of a complex neuron means that the same image has been encountered at all within the layer its simple neurons cover.



The planes of the neocognitron
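
In code, such a complex neuron amounts to a logical OR (equivalently a maximum) over its plane of simple cells. A minimal sketch, with a made-up plane of activities:

```python
import numpy as np

# Hypothetical activities of one plane of simple cells: 1 means the cell
# detected the preferred stimulus inside its own receptive field.
plane = np.array([[0, 0, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 0]])

# The complex neuron fires if at least one simple cell of its plane fired:
# a logical OR over the plane, equivalently a max.
complex_active = plane.max() > 0
print(complex_active)  # True: the stimulus occurred somewhere in the layer
```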

Each layer receives as its input the picture formed by the complex neurons of the preceding layer. From layer to layer the information becomes ever more generalized, which results in the recognition of specific images regardless of their location in the original picture and with tolerance to a certain amount of transformation.

Applied to image analysis, this means that the first level detects lines at particular angles passing through small receptive fields, and it is able to detect all possible directions anywhere in the image. The next level detects possible combinations of these elementary features, defining more complex shapes, and so on, up to the point where the desired image can be determined as a whole (see the figure below).



The recognition process in the neocognitron

When used for handwriting recognition, this design is resistant to variations in the manner of writing. Recognition is not affected by shifts across the surface, rotation, or deformation (stretching or compression).

The most significant difference between the neocognitron and a fully connected multilayer perceptron is the much smaller number of weights used for the same number of neurons. This comes from the very "trick" that allows the neocognitron to recognize images regardless of their position. A plane of simple cells is essentially a single neuron whose weights define a convolution kernel. This kernel is applied to the previous layer, running over it in all possible positions; the neurons of each plane, through their connections, define the coordinates of these positions. As a result, all the neurons of one plane of simple cells watch whether the image corresponding to the kernel appears anywhere in their receptive fields. That is, if such an image appears anywhere in the input signal of that layer, it will be detected by the activity of at least one simple neuron, which will activate the corresponding complex neuron. This trick makes it possible to find a characteristic image wherever it appears. But we must remember that this is a trick, and it does not really correspond to how the real cortex works.
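
The trick can be written out directly: one shared kernel slid over all positions of the input gives the whole plane of simple-cell responses, and the complex cell takes the maximum over that plane. A minimal sketch; the kernel, the image size, and the vertical-line stimulus are all made-up illustrations:

```python
import numpy as np

def correlate2d(image, kernel):
    """Apply one shared kernel at every position (valid mode, no padding).
    Each output element is the response of one simple cell of the plane."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Kernel tuned to a short vertical line (the plane's preferred stimulus)
kernel = np.array([[-1.0, 2.0, -1.0],
                   [-1.0, 2.0, -1.0],
                   [-1.0, 2.0, -1.0]])

image = np.zeros((8, 8))
image[2:7, 5] = 1.0  # a vertical line, placed anywhere in the image

plane = correlate2d(image, kernel)  # activities of all simple cells
print(plane.max() > 0)              # complex cell: True wherever the line is
```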

The neocognitron is trained without a teacher. This corresponds to the previously described procedure of isolating a complete set of factors. When real images are fed to the neocognitron's input, the neurons have no choice but to pick out the components inherent in those images. So, if handwritten digits are fed to the input, the small receptive fields of the first layer's simple neurons will come to see lines, angles, and junctions. The sizes of the competition areas determine how many different factors can be isolated in each spatial region. The most significant components are isolated first; for handwritten digits these will be lines at different angles. If unclaimed factors remain, more complex elements may continue to be isolated.
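
As a rough illustration of such competitive, unsupervised factor extraction, here is a winner-take-all sketch: each input fragment is claimed by the best-matching detector, whose weights drift toward the inputs it wins. The data, the number of detectors, and the learning rate are arbitrary illustrative choices; Fukushima's actual training procedure differs in its details.

```python
import numpy as np

# Winner-take-all competitive learning over small input fragments.
rng = np.random.default_rng(1)

patches = rng.random((2000, 9))   # stand-ins for 3x3 image fragments
patches /= np.linalg.norm(patches, axis=1, keepdims=True)

W = rng.random((4, 9))            # four competing feature detectors
W /= np.linalg.norm(W, axis=1, keepdims=True)
eta = 0.05

for x in patches:
    winner = np.argmax(W @ x)               # the best-matching detector wins
    W[winner] += eta * (x - W[winner])      # and moves toward the input
    W[winner] /= np.linalg.norm(W[winner])  # keep its weights normalized

# Each row of W now approximates one "factor" common to the input stream.
print(W.round(2))
```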

From layer to layer the general principle of learning is preserved: factors characteristic of the set of input signals are isolated. Feeding handwritten digits into the first layer, at some level we obtain factors corresponding to those digits; each digit turns out to be a combination of a stable set of features, isolated as a separate factor. The last layer of the neocognitron contains as many neurons as there are images it is supposed to recognize. The activity of one neuron of this layer signals the recognition of the corresponding image (see the figure below).



Recognition in the neocognitron (Fukushima K., Neocognitron, 2007)

The video below provides a visual representation of the neocognitron.




The alternative to unsupervised learning is supervised learning. In the digits example, instead of waiting for the network to extract statistically stable forms on its own, we can tell it which digit it is being shown and demand the appropriate training. The most significant results in supervised training of convolutional networks were achieved by Yann LeCun (Y. LeCun and Y. Bengio, 1995). He showed how to use backpropagation to train networks whose architecture, like that of the neocognitron, vaguely resembles the structure of the cerebral cortex.



A convolutional network for handwriting recognition (Y. LeCun and Y. Bengio, 1995)
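
For a sense of what such a network looks like in code, here is a sketch of a small LeNet-style model in PyTorch. The layer sizes and the choice of average pooling echo the general scheme but are not LeCun's exact architecture:

```python
import torch
import torch.nn as nn

# A small convolutional network in the spirit of LeNet. Planes of shared
# kernels play the role of simple cells; subsampling gives the shift
# tolerance of complex cells. Sizes are illustrative.
model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),   # 6 feature planes over the input
    nn.Tanh(),
    nn.AvgPool2d(2),                  # subsampling: tolerance to small shifts
    nn.Conv2d(6, 16, kernel_size=5),  # combinations of elementary features
    nn.Tanh(),
    nn.AvgPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 10),        # one output neuron per digit class
)

x = torch.randn(1, 1, 28, 28)  # a dummy 28x28 grayscale "digit"
print(model(x).shape)          # torch.Size([1, 10]); trained end to end with backprop
```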

At this point let us consider the minimal starting information refreshed; now we can move on to things that are more interesting and surprising.

To be continued.

References

Previous parts:
Part 1. Neuron
Part 2. Factors

Alex Redozubov (2014)

Source: habrahabr.ru/post/214317/