The other day, I decided to read one paper about neural networks a day. I chose publications available on arXiv and added the site to my RSS reader. Just two days later, I was shocked by the number of papers submitted to arXiv every day! There were so many new ideas in the field of neural networks that it was impossible to follow them all. I gave up this habit after a week, but I realized one important thing: the majority of these papers were related to tuning or modifying already existing types of neural networks.
In this post, I’ll describe the four basic types of neural networks: Perceptrons, Auto-Encoders, Recurrent Neural Networks, and Deep Convolutional Networks. They differ in shape and usage.
Let’s start by introducing a neural network: it is a computing system inspired by the networks of neurons in your brain. Neurons have connections (like synapses) that transmit data between them. A neuron receives a signal, processes it, and then sends it to its connected neighbors.
Networks are not pre-programmed, so they learn by analyzing examples. Here lies the magic – they can find correlations and patterns hidden from the human eye.
Perceptrons

Perceptrons are the simplest neural networks. They consist of two layers – the input and the output.
The input values are multiplied by weights and summed up. The sum constitutes the input of the activation function that determines the output. The function’s shape depends on the task you expect the network to perform.
The network stores its knowledge in weights (a matrix of numbers). The process of adjusting weights is called a learning process.
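To make this concrete, here is a minimal sketch of a single perceptron in plain Python. The step activation and the hand-picked weights (chosen so the unit behaves like a logical AND gate) are illustrative assumptions, not part of any particular library:

```python
def step(x):
    """Step activation: fires 1 when the weighted sum crosses zero."""
    return 1 if x >= 0 else 0

def perceptron(inputs, weights, bias):
    # Multiply each input by its weight, sum everything up,
    # then pass the sum through the activation function.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(total)

# Hypothetical weights that make the perceptron act as an AND gate:
# the output is 1 only when both inputs are 1.
weights, bias = [1.0, 1.0], -1.5
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), perceptron([a, b], weights, bias))
```

Learning would mean adjusting `weights` and `bias` from examples instead of picking them by hand, but the forward computation stays exactly this simple.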
You can develop this simple model by adding a layer called a hidden layer. In these networks, you might find fully connected layers – a node from the nth layer passes data to all of the nodes in the (n+1)th layer.
This kind of network is called feed-forward – the data flows from the input to the output, so there are no cycles in the connections.
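A fully connected feed-forward pass can be sketched as repeated matrix-vector products with an activation in between. The layer sizes and weight values below are arbitrary, purely for illustration:

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: every input feeds every output node."""
    return [
        math.tanh(sum(i * w for i, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

# A tiny 2-3-1 network with made-up weights.
# Data flows strictly forward: input -> hidden -> output, no cycles.
hidden = dense([0.5, -0.2],
               [[0.1, 0.4], [-0.3, 0.8], [0.7, 0.2]],
               [0.0, 0.1, -0.1])
output = dense(hidden, [[0.5, -0.6, 0.9]], [0.05])
print(output)
```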
Since these networks are very simple, they perform simple tasks like handwritten digit recognition or basic data classification.
Auto-Encoders

Auto-encoders are simple but very powerful networks. You train them in an unsupervised manner, so no labeling is required. The only thing that matters is that the expected output should be very similar to the input. The hidden layer has fewer neurons than the input layer, so the network compresses the input values and restores them on the output layer. Of course, you can stack as many hidden layers as you want.
Auto-encoders can perform dimensionality reduction. This technique is essential when you design sophisticated ML and AI systems that have many features.
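The idea of the bottleneck can be shown with a tiny linear sketch: an encoder squeezes four values into two, and a decoder expands them back. The hand-picked weights (encoder averages adjacent pairs, decoder duplicates each latent value) are a hypothetical illustration, not a trained model:

```python
def linear(v, weights):
    """Matrix-vector product: one linear layer without activation."""
    return [sum(x * w for x, w in zip(v, row)) for row in weights]

# Hypothetical weights: encoder averages adjacent pairs of inputs,
# decoder duplicates each latent value back out.
w_enc = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5]]
w_dec = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 1.0],
         [0.0, 1.0]]

x = [3.0, 3.0, 7.0, 7.0]   # an input this code happens to reconstruct exactly
z = linear(x, w_enc)       # bottleneck: 4 values compressed to 2
x_hat = linear(z, w_dec)   # reconstruction: 2 values restored to 4
print(z, x_hat)            # [3.0, 7.0] [3.0, 3.0, 7.0, 7.0]
```

For inputs whose pairs differ, the reconstruction only preserves the averages – exactly the lossy compression described above.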
The mushroom on the left has more details and colors; the network reassembled the one on the right. The essential parts are preserved, but the details are missing – the network performed lossy compression.
Recurrent Neural Networks / Long Short-Term Memory
RNNs are more complicated than perceptrons. In this kind of network, you can spot recurrent connections: cells in the hidden layers take their own output from the previous step as part of the input. The other part is a vector of values from the preceding layer.
LSTMs are a type of RNN. The difference is that their cells have memory, so they can capture the context of a given input. You can configure how much data they should remember or forget.
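The recurrence can be sketched with a single-unit cell: at every step, the new hidden state mixes the current input with the hidden state carried over from the previous step. The scalar weights are arbitrary illustrative values; a real RNN uses weight matrices and trains them:

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the current input
    AND the hidden state from the previous step (the recurrent part)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# Feed a short sequence through the cell; the state h is carried along.
h = 0.0
for x in [1.0, 0.5, -0.5]:
    h = rnn_step(x, h, w_x=0.8, w_h=0.4, b=0.0)
    print(round(h, 3))
```

An LSTM replaces this single `tanh` with gated updates that decide what to keep and what to forget, but the step-by-step state-carrying loop is the same.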
You can use this type of network for more complicated tasks like speech recognition, text translation, time series prediction, text auto-completion, and sentiment analysis.
Deep Convolutional Networks
DCNs are currently among the most popular networks. They mainly consist of convolution layers that process the input data and pooling layers that simplify it.
Convolutional networks imitate the natural image recognition process – connectivity between patterns resembles the organization of the visual cortex.
The most significant advantage of this type of network is that it requires almost no input pre-processing: the network learns the filters automatically.
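Convolution and pooling are easy to show in one dimension. The kernel below is a hand-picked (hypothetical) edge detector; a DCN would learn such filters on its own, as noted above:

```python
def convolve(signal, kernel):
    """Slide the kernel across the signal (valid convolution, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def max_pool(values, size=2):
    """Pooling simplifies the data: keep one value per window."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]

# A made-up edge-detecting kernel applied to a step-shaped signal:
# it responds with +1 at the rising edge and -1 at the falling edge.
signal = [0, 0, 1, 1, 1, 0, 0, 0]
feature_map = convolve(signal, [-1, 1])
pooled = max_pool(feature_map)
print(feature_map)  # [0, 1, 0, 0, -1, 0, 0]
print(pooled)       # [1, 0, 0, 0]
```

Images work the same way, only with 2D kernels sliding over pixel grids.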
They are used mainly for image recognition and language processing. DCNs have also been applied to drug discovery.
On their own, these basic networks are sometimes not sufficient for solving real-world problems, so highly specialized systems use them as building blocks.
In this article, I haven’t discussed even a small fraction of existing network types. If you are interested in a broader range of networks, please see the diagram below.
I would strongly recommend that you read about Generative Adversarial Networks (GANs), which you can use for image generation. Every GAN consists of two cooperating neural networks – a generator that produces candidates and a discriminator that evaluates them. Their popularity is growing day by day.
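The generator-versus-discriminator loop can be caricatured in a few lines. This is only a toy sketch under heavy assumptions: the "discriminator" here is a fixed scoring function rather than a trained network, and the "generator" improves by keeping proposals the critic prefers instead of using gradients – real GANs train both networks jointly:

```python
import random

random.seed(0)
REAL_MEAN = 5.0  # hypothetical "real data" the generator should imitate

def discriminator(sample):
    """Score in (0, 1]: higher means the sample looks more 'real'."""
    return 1.0 / (1.0 + abs(sample - REAL_MEAN))

mu = 0.0  # the generator's single parameter
for _ in range(200):
    candidate = mu + random.gauss(0, 1)   # generator proposes a candidate
    if discriminator(candidate) > discriminator(mu):
        mu = candidate                    # keep moves the critic prefers

print(round(mu, 1))  # the generator drifts toward the real distribution
```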
Neural networks are powerful tools, and you already use hundreds of applications that take advantage of them. Their applications seem limited only by our creativity.
Edited by: Anna