What is a deep neural network?
Machine learning and deep learning are subsets of artificial intelligence (AI). A deep neural network can be defined as a neural network with more than two layers and a certain level of complexity. To process data, these networks use mathematical modelling, but in very complex ways. Let's find out together what a deep neural network is and how it works.
From brain to neural network
Let's go back for a moment to the way our brain works: as the center of the nervous system, it is able to integrate information, control motor skills and ensure cognitive functions. The central element of the brain is the neuron.
How neurons communicate
A neuron consists of a cell body, an axon, which serves as the signal transmission link, and a synapse, which allows the triggering of an action potential to activate communication with another neuron. You should know that the strength of a neural network resides in the communication between its neurons through electrical signals called "nerve impulses". These signals are characterized by frequencies, which play an important role in how signals propagate through the network.
Below is a representation of a neuron, a diagram of a biological neuron and synapse.
The nerve impulse travels along the axon and ends its path at the synaptic terminal. The higher its frequency, the more the neuron produces chemical messengers: neurotransmitters.
These are contained in vesicles, which are released into the extracellular environment at the synapse. They, in turn, activate or inhibit a second neuron at the level of its dendrites or cell body. The nerve impulse then continues its path along this second neuron, and so on.
To summarize simply:
Information is received by the dendrites as an influx of neurotransmitters, collects in the cell body and flows down the axon. Each neuron is connected to several neurons at its "input", i.e. at the level of its dendrites, and at its "output", i.e. at the level of its axon. It is this "input/processing/output" functioning that inspired research and led to attempts to reproduce a neuron artificially.
A neuron in machine language:
The concept of artificial neurons is so central to the field of Deep Learning that the two subjects are often wrongly conflated. The goal of Deep Learning is as follows: to predict an output Y (a characteristic) from a dataset of inputs Xi, called observations. For example, observe a set of features (X1, X2, X3, X4), such as a long, rounded shape, and predict that it is a banana (Y). One of the means of achieving this, highlighted by research, was to simulate the response of a so-called "artificial" neuron to these observations and to develop an algorithm capable of processing and weighting the observations in order to predict a characteristic.
The graph above is a representation of an artificial neuron: the dendrites receive signals, and the strength of each signal depends on its weight Wi. These signals are then transmitted to the cell body, which is characterized by two important elements: the activation function and the activation condition. When the activation condition is met, the axon is activated and in turn emits signals to the following neurons through their dendrites, and so on. This representation is called the perceptron, a supervised learning algorithm for linear binary classification. That's a lot of unfamiliar terms, so let's focus on each of them for a moment, because they will be important later!
Algorithm: the perceptron is a series of operations and calculations: the sum of the inputs, their weighting, the verification of a condition and the production of an activation result.
Learning: the algorithm must be "trained"; that is to say, depending on the desired prediction, the weights of the different inputs will change, and an optimal value must be found for each.
Supervised: the algorithm finds the optimal values of its weights from a database of examples for which we already know the prediction. For example, we have a database of banana photos, and we "tune" our algorithm until almost every photo is correctly classified as a banana.
Classification: the algorithm makes it possible to predict an output characteristic, and this characteristic is used to classify the different inputs among themselves. For example, find all the bananas in a fruit photo panel.
Binary: the algorithm separates a set of input values into just two different classes (banana or other).
Linear: the algorithm separates a set of values in a linear fashion. Take the example of a sheet of paper where you have drawn a straight diagonal. This diagonal is a linear separation of the sheet of paper. If you had drawn a circle in the middle of the sheet, the separation would be considered nonlinear.
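Putting these terms together, a simple perceptron can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the AND-gate dataset, learning rate and epoch count are all made-up values chosen so that training converges on a linearly separable toy problem.

```python
# A minimal sketch of Rosenblatt's simple perceptron, assuming a step
# activation and the classic perceptron learning rule.

def predict(weights, bias, x):
    # Weighted sum of the inputs, then the activation condition:
    # fire (output 1) if the sum exceeds 0, stay silent (0) otherwise.
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total + bias > 0 else 0

def train(samples, labels, lr=0.1, epochs=20):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            # Perceptron rule: nudge each weight toward the correct answer.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Linearly separable toy problem: the logical AND of two inputs.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train(X, y)
print([predict(w, b, x) for x in X])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees that this training loop eventually classifies every example correctly.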
How about going back to the source?
The previous algorithm is called the Simple Perceptron. The perceptron appeared in 1957 at the Cornell Aeronautical Laboratory, the work of Frank Rosenblatt. Before manifesting itself as an algorithm, the perceptron was initially conceived as a machine, whose aim was image recognition through a network of 400 photocells linked in a random manner (an artificial representation of neural links). The weights were encoded in potentiometers, and their updates were carried out by electric motors.
The simple perceptron algorithm has one major drawback when applied to real-life cases: its linearity. Indeed, not everything can be separated and classified by drawing a line on a sheet of paper. The banana classifier used above is proof of this. So a way had to be found to handle this non-linearity, and this is where the notion of depth comes in.
The passage from a single layer of neurons, the perceptron, to multiple layers of neurons, the multilayer perceptron, is one of the solutions. This is the advent of deep neural networks.
Understanding Deep Neural Networks
A deep neural network consists of thousands or even millions of neurons organized in layers interconnected by weighted links (Wi). Each node is characterized by the set {Fi (activation function) + Ci (activation condition)}. The job of learning is, therefore, to find the optimal combination of Wi / Fi / Ci to generate the best results in terms of prediction and estimation. We have now dealt with nonlinearity, but how can we train our network and find the values of its millions of weights? Among the various possible solutions, one of the most used is backpropagation.
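The layered structure described above can be sketched as a forward pass: each layer multiplies its input by a weight matrix Wi and applies an activation function Fi. The layer sizes, random weights and choice of ReLU/sigmoid below are illustrative assumptions, not values from any real trained model.

```python
# A minimal sketch of a forward pass through a small "deep" network.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Three weighted layers: 4 inputs -> 8 hidden -> 8 hidden -> 1 output.
layers = [
    (rng.normal(size=(4, 8)), relu),
    (rng.normal(size=(8, 8)), relu),
    (rng.normal(size=(8, 1)), sigmoid),  # squash the output into [0, 1]
]

def forward(x):
    a = x
    for W, activation in layers:
        a = activation(a @ W)   # weighted sum Wi, then activation Fi
    return a

x = rng.normal(size=(1, 4))     # one observation with 4 features
print(forward(x))               # a single value between 0 and 1
```

With untrained random weights, the output is meaningless; learning consists of adjusting every Wi so that the output matches the known labels.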
Backpropagation: How is this learning done in practice?
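In essence, backpropagation computes, for every weight, how much a small change in that weight would change the prediction error, working backwards from the output layer to the input layer; the weights are then nudged in the direction that reduces the error. The sketch below assumes a single hidden layer, sigmoid activations and a squared-error loss (all illustrative choices), and checks the analytic gradient against a finite-difference estimate.

```python
# A minimal sketch of backpropagation for one hidden layer.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W1, W2, x, y):
    h = sigmoid(x @ W1)          # forward pass: hidden layer
    out = sigmoid(h @ W2)        # forward pass: output layer
    return 0.5 * np.sum((out - y) ** 2)

def gradients(W1, W2, x, y):
    # Forward pass, keeping the intermediate activations.
    h = sigmoid(x @ W1)
    out = sigmoid(h @ W2)
    # Backward pass: propagate the error from the output toward the input.
    delta_out = (out - y) * out * (1 - out)   # error at the output layer
    grad_W2 = h.T @ delta_out
    delta_h = (delta_out @ W2.T) * h * (1 - h)  # error pushed back one layer
    grad_W1 = x.T @ delta_h
    return grad_W1, grad_W2

rng = np.random.default_rng(1)
x = rng.normal(size=(1, 3))
y = np.array([[1.0]])
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 1))

g1, g2 = gradients(W1, W2, x, y)

# Numerical check on one weight: perturb W1[0, 0] and compare slopes.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
numeric = (loss(W1p, W2, x, y) - loss(W1, W2, x, y)) / eps
print(abs(numeric - g1[0, 0]) < 1e-4)  # backprop matches the measured slope
```

A training step would then update each weight matrix by subtracting a small multiple of its gradient (gradient descent).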
Image recognition: One of the most interesting applications
The idea is to be able to classify one or more images, for example: is this photo that of a face? To do this, it is first of all crucial to convert our image into a matrix (its dimensions being the number of pixels), which will be the input of our classifier (linear/nonlinear regression, perceptron, neural network, etc.).
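Concretely, a grayscale image is already a matrix of pixel intensities; for a classic classifier it is typically scaled and flattened into a single feature vector. The tiny 4x4 "image" below is made-up data for illustration.

```python
# A minimal sketch of turning an image matrix into classifier input.
import numpy as np

# A pretend 4x4 grayscale image: dark on the left, bright on the right.
image = np.array([
    [0,   0,   255, 255],
    [0,   50,  200, 255],
    [10,  80,  180, 240],
    [30,  90,  160, 220],
])

# Scale pixel intensities to [0, 1] and flatten the matrix into one
# vector of 16 features, the input shape a classic classifier expects.
x = (image / 255.0).reshape(-1)
print(x.shape)  # → (16,)
```

A real photo works the same way, just with far more pixels (and three such matrices for an RGB image).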
The first deep network designed for image recognition was developed by Yann LeCun, an artificial intelligence researcher now at Facebook. The network was introduced in 1998 and was named LeNet.
Convolution
The first step in this architecture is the convolution of the image we want to classify. Convolution is the process of highlighting a few well-chosen features in the source image in order to obtain the same image, but with a sort of filter applied; as you can see in the image of step 2, the focus is put on the contour of the face. This is done by applying a predefined matrix (a kernel) to the source matrix of the initial image. You should know that there are several convolution matrices, and each of them targets a particular aspect of the image. During this first step, several convolution filters are applied to the initial image, thus generating several distinct feature images.
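The sliding-kernel operation can be sketched directly with a loop. This is an illustrative "valid" convolution (no padding, stride 1, and strictly speaking the cross-correlation variant that CNN libraries implement); the vertical-edge kernel and the half-dark test image are made-up examples.

```python
# A minimal sketch of 2D convolution as used in CNNs.
import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel over the image and sum the products.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel: responds where intensity changes left to right.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]])

# Dark left half, bright right half: the edge sits in the middle columns.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

print(convolve2d(image, kernel))  # large values only where the edge is
```

Swapping in a different kernel (horizontal edges, blur, sharpen) highlights a different aspect of the image, which is exactly why a convolution layer applies several kernels in parallel.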
Pooling
The second step, called pooling, consists of reducing the dimensions of the images (matrices). The goal of this step is to keep as much relevant information as possible while reducing the dimensions. The most used pooling functions are max pooling and average pooling.
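Max pooling, for instance, keeps only the largest value in each small window, halving each dimension for a 2x2 window. The feature map below is made-up data for illustration.

```python
# A minimal sketch of 2x2 max pooling (stride equal to the window size).
import numpy as np

def max_pool(feature_map, size=2):
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            # Keep only the strongest activation in each window.
            out[i // size, j // size] = feature_map[i:i + size, j:j + size].max()
    return out

fm = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
])
print(max_pool(fm))  # → [[4. 2.] [2. 8.]]
```

A 4x4 map becomes 2x2: a quarter of the values, but the strongest responses, the ones most likely to matter for classification, survive.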
The convolution and pooling steps are then iterated as many times as necessary until all the characteristics of the image have been extracted (for example, the outline of the face, the eyes, the nose, the mouth, etc.).
Fully connected
The fully connected layers step is quite simply the application of a classical neural network to all the features extracted by the previous steps. The output of this network is finally reduced to a probability, which specifies to what degree the image in question is indeed a face.
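This last stage can be sketched as: flatten the pooled feature maps into one vector, apply a dense (fully connected) layer, and squash the result into a probability with a sigmoid. The random weights and feature-map sizes below are placeholders, not a trained face detector.

```python
# A minimal sketch of the fully connected stage of a CNN.
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Pretend the convolution/pooling stages produced three 2x2 feature maps.
feature_maps = rng.normal(size=(3, 2, 2))

features = feature_maps.reshape(-1)   # flatten into a vector of 12 features
W = rng.normal(size=(12, 1))          # dense layer weights (untrained)
b = 0.0                               # bias term

p = sigmoid(features @ W + b)
print(float(p))  # a value in (0, 1): "how likely is this image a face?"
```

In a trained LeNet-style network, W and b would have been learned by backpropagation so that p is close to 1 for faces and close to 0 for everything else.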
The ImageNet Competition
The ImageNet competition is an international challenge where research teams compete against each other using image recognition algorithms. Everyone has the ImageNet database and must provide the most accurate classification possible. During this competition, the impact of the introduction of convolutional networks was impressive.
Author: Vicki Lezama