What is an Artificial Neural Network?

Definition

An artificial neural network (ANN) is a network that mimics the nervous system of living organisms.

Mathematical Definition

Motivation

The nervous system is composed of neurons. The cell body of a neuron receives stimuli through its dendrites and transmits electrical signals through its axon. Many organisms, including humans, have evolved networks of these simple neuronal connections that suit their environments. As a result, the nervous system can carry out complex tasks such as detecting light, moving legs, remembering, and imagining.

20190317\_195625.png

20190317\_201846.png

An artificial neural network is a network that mimics neurons, treating the nerve cell body as a node and the axon as a link. Like a nerve cell body, each node receives information, performs a calculation on it, and passes the result on, so that the network as a whole can yield meaningful results.

Example

As a simple example, consider the problem of understanding the correlation between the data $Y := \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix}$ and $X := \begin{bmatrix} 2.2 \\ 3.1 \\ 3.9 \end{bmatrix}$.

Since this problem is quite easy, one can guess without difficulty that there is a linear correlation like $Y \approx {\color{red}2} X + \color{blue}{1}$.

If we solve this problem through simple regression analysis $Y \gets X$, it becomes the problem of finding the least squares solution $( \color{blue} {\beta_{0} } , {\color{red}\beta_{1}} )$ of the following system, written with a design matrix. $$ \begin{bmatrix} 5 \\ 7 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 & 2.2 \\ 1 & 3.1 \\ 1 & 3.9 \end{bmatrix} \begin{bmatrix} \color{blue} {\beta_{0} } \\ {\color{red}\beta_{1}} \end{bmatrix} $$
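For reference, the least squares solution of this small system can be computed in closed form. The following is only a minimal sketch in plain Python. The function name `fit_simple_regression` is made up for illustration, and it uses the textbook formulas $\beta_{1} = S_{xy} / S_{xx}$ and $\beta_{0} = \bar{y} - \beta_{1} \bar{x}$, which are equivalent to solving the normal equations of the design matrix above.

```python
def fit_simple_regression(xs, ys):
    """Return (beta0, beta1) minimizing sum((y - beta0 - beta1 * x) ** 2)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of squared deviations and cross deviations around the means.
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta1 = s_xy / s_xx               # slope
    beta0 = y_mean - beta1 * x_mean   # intercept
    return beta0, beta1


X = [2.2, 3.1, 3.9]
Y = [5.0, 7.0, 9.0]
beta0, beta1 = fit_simple_regression(X, Y)
print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}")
```

On these three points the fit comes out near $\beta_{0} \approx -0.21$ and $\beta_{1} \approx 2.35$, so the eyeballed $Y \approx 2X + 1$ is a rough guess rather than the least squares solution, although both lines pass close to the data.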

On the other hand, the artificial neural network for this problem can be configured as follows:

20190317\_205730.png

First, let’s assume a relationship like $Y = {\color{red}w} X + \color{blue}{b}$. Here $\color{red}{w}$ is called the weight and $\color{blue}{b}$ is called the bias. The node receiving the given data $\begin{bmatrix} 2.2 \\ 3.1 \\ 3.9 \end{bmatrix}$ first picks $( {\color{red}w_{1}} , \color{blue}{b_{1}} )$ at random, calculates $\begin{bmatrix} {\color{red}w_{1}} 2.2 +\color{blue}{b_{1}} \\ {\color{red}w_{1}} 3.1 +\color{blue}{b_{1}} \\ {\color{red}w_{1}} 3.9 +\color{blue}{b_{1}} \end{bmatrix}$ with them, and passes the result to node $Y$.
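This first pass can be sketched in a few lines of plain Python. The names `forward`, `w1`, and `b1`, the sampling range $[-1, 1]$, and the fixed seed are all assumptions made for illustration, since the text only says that the initial values are chosen at random.

```python
import random


def forward(xs, w, b):
    """Compute w * x + b for every input x (the values passed on to node Y)."""
    return [w * x + b for x in xs]


random.seed(0)  # fixed seed only so that the sketch is reproducible
X = [2.2, 3.1, 3.9]
w1 = random.uniform(-1.0, 1.0)  # initial weight, drawn at random (assumed range)
b1 = random.uniform(-1.0, 1.0)  # initial bias, drawn at random (assumed range)
prediction = forward(X, w1, b1)
print(w1, b1, prediction)
```

With a random $(w_{1}, b_{1})$ the output will usually be far from $Y$, which is why the weights have to be improved afterwards.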

20190317\_205604.png

If the values computed from this rough guess are unsatisfactory, the weight and bias are repeatedly updated to better ones until the results are satisfactory.
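The text does not say how the better weights are found. One common choice, assumed here purely for illustration, is gradient descent on the mean squared error between $wX + b$ and $Y$. The function names `mse` and `train`, the learning rate `lr`, the step count, and the starting point $(0, 0)$ are all hypothetical choices for this sketch.

```python
def mse(xs, ys, w, b):
    """Mean squared error of the predictions w * x + b against ys."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, w=0.0, b=0.0, lr=0.05, steps=5000):
    """Repeatedly nudge (w, b) against the gradient of the mean squared error."""
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))  # d(MSE)/dw
        grad_b = 2.0 / n * sum(errors)                             # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


X = [2.2, 3.1, 3.9]
Y = [5.0, 7.0, 9.0]
w, b = train(X, Y)
print(f"w = {w:.4f}, b = {b:.4f}, mse = {mse(X, Y, w, b):.5f}")
```

Under these assumptions the updates settle near $w \approx 2.35$ and $b \approx -0.21$, the same values a least squares fit gives for this data. A learning rate that is too large would make the updates diverge instead, so `lr` is a value that has to be tuned.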

20190317\_210511.png

In this sense, an artificial neural network can be seen as an implementation of machine learning, in which the machine learns on its own. This approach has since evolved into deep learning, which is more complex yet more efficient.

Theoretical Aspect

Those familiar with statistics or mathematics often express a strong aversion to these techniques because of their lack of theoretical foundations. Often there is no known condition that guarantees minimal error or optimal learning, and the reasons why certain functions are used are frequently unknown. But if the technique in a new paper improves performance on benchmarks, there is little to argue against it.

There may well have been scholars who attempted a mathematical approach to these questions, but the sad reality is that by the time the research makes some progress, the industry has already moved on, rendering those efforts outdated. From the perspective of someone studying theory, it hardly seems worthwhile.

Nevertheless, these techniques should not be underestimated, because their performance is too good to dismiss. Deep learning is an irresistible temptation in data science. Even if the trend passes quickly, the performance is strong enough that the techniques are well worth learning. And while the field may not be as rigorous as mathematics, it is laying its own theoretical foundations, so it is worth keeping an open mind.

See Also