They are motivated by the architecture and functionality of the neuron cells that brains are made of. A neuron in the brain can receive multiple input signals, process them, and fire a signal which in turn can be input to other neurons. The output is binary: depending on the input, the signal is either fired (1) or not fired (0).
The artificial neuron has inputs which we call $x_1, x_2, \dots, x_p$. There can be an additional input $x_0$, which is always set to 1 and is often referred to as the bias. The inputs are weighted with weights $w_1, w_2, \dots, w_p$ and $w_0$ for the bias. With the inputs and the weights we can calculate the activation of the neuron for observation $i$:

$$a_i = \sum_{k=1}^{p} w_k x_{ik} + w_0.$$
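To make this concrete, here is a minimal sketch of the activation in Python (the concrete numbers for `x`, `w` and `w0` are invented for illustration):

```python
import numpy as np

# One observation with p = 3 inputs (numbers invented for illustration)
x = np.array([0.5, 1.0, -0.2])   # inputs x_i1, ..., x_ip
w = np.array([0.8, -0.4, 0.3])   # weights w_1, ..., w_p
w0 = 0.1                         # bias weight for the constant input x_0 = 1

# Activation a_i = sum_k w_k * x_ik + w_0
a = np.dot(w, x) + w0
print(a)  # approx. 0.04
```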
The output of the neuron is a function of its activation. Here we are free to choose whatever function we want. If the output should be binary or lie in the interval $[0, 1]$, a good choice is the logistic function.
So the calculated output of the neuron for observation $i$ is

$$o_i = \frac{1}{1 + \exp(-a_i)}.$$
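Continuing the little sketch from above, the output is simply the logistic function applied to the activation:

```python
def logistic(a):
    """Logistic (sigmoid) function, mapping any activation into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

o = logistic(a)
print(o)  # approx. 0.51 for a = 0.04
```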
Pretty straightforward, isn't it? If you know about logistic regression, this might already be familiar to you.
Now you know the basic structure. The next step is to "learn" the right weights for the inputs. For this you need a so-called loss function, which tells you how wrong your predicted output is. The loss function is a function of your calculated output $o_i$ (which depends on your data $x_{i1}, \dots, x_{ip}$ and the weights) and, of course, of the true output $y_i$. Your training data set is given, so the only variable part of the loss function are the weights $w_k$. Low values of the loss function tell you that your neuron makes accurate predictions.
One simple loss function would be the plain difference $y_i - o_i$. A more sophisticated choice is $-\left[y_i \ln(o_i) + (1 - y_i)\ln(1 - o_i)\right]$, which is the negative log-likelihood of your data if you see $o_i$ as the probability that the output is 1. So minimizing the negative log-likelihood is the same as maximizing the likelihood of your parameters given your training data set.
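As a sketch (with an invented true output and prediction), both losses could be computed like this:

```python
import numpy as np

def negative_log_likelihood(y, o):
    """Negative log-likelihood of one observation, reading o as P(output = 1)."""
    return -(y * np.log(o) + (1.0 - y) * np.log(1.0 - o))

y = 1.0   # true output y_i
o = 0.9   # predicted output o_i of the neuron

print(y - o)                          # simple difference: approx. 0.1
print(negative_log_likelihood(y, o))  # approx. 0.105, small since o is close to y
```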
The first step towards learning about neural networks is made! The next thing to look at is the gradient descent algorithm, which is a way to find weights that minimize the loss function.
Have fun!