Question
1. Answer the following questions in a Word document. Each answer must be in your own words, brief and to the point, using not more than 2-4 paragraphs for each answer.
a. How does a Convolutional Neural Network work? Make sure to include explanations of the different layer types used.
b. Explain drop-out. What is it used for?
c. What are Recurrent Neural Networks? What are LSTM based networks?
d. How does a multi-layer perceptron (MLP) work? Make sure to include explanations of the techniques: feed-forward, backpropagation, full-connectivity, activation functions, and hidden layers.
Explanation / Answer
Following are the answers to the questions.
1)How does a Convolutional Neural Network work? Make sure to include explanations of the different layer types used.
Ans)
Convolutional Neural Networks (CNNs/ConvNets) are similar to ordinary Neural Networks. They are made up of neurons which have learnable weights and biases. Each neuron receives some inputs, performs a dot product, and optionally follows it with a non-linearity. The whole CNN still expresses a single differentiable score function, from the raw image pixels on one end to class scores on the other, and it also has a loss function on the last (fully-connected) layer.
There are four important steps in CNNs: convolution, subsampling, activation, and full connectedness.
a) Convolution: Convolution is the first layer, which receives the input signal through a set of convolution filters. It is a process where the network tries to label the input signal by referring to what it has learnt in the past.
b) Subsampling: Once the convolution layer is done, its output can be passed as input to the next layer, subsampling, which smooths the signal so that the sensitivity of the filters to noise and variation is reduced. This smoothing is achieved by taking the average or the maximum over a sample of the signal.
c) Activation: The activation layer controls how the signal flows from one layer to the next, emulating how neurons fire in the brain. Output signals that are strongly associated with past references activate more neurons, enabling the signal to propagate more efficiently for identification.
d) Full-Connectedness: This layer means that the neurons of preceding layers are connected to every neuron in subsequent layers. It emulates high-level reasoning, where all possible pathways from input to output are considered.
e) While training the neural network, there is another layer, called the loss layer, which provides feedback on whether inputs were identified correctly and, if not, how far off the guess was. This helps the neural network reinforce the right concepts as it trains.
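To make the layer types concrete, here is a minimal sketch (not part of the original answer) of such a network in PyTorch; the 28x28 grayscale input and the 10 output classes are assumed values for illustration:

```python
import torch
import torch.nn as nn

# A minimal CNN sketch: convolution -> activation -> subsampling -> fully connected.
# Input size (28x28 grayscale) and class count (10) are hypothetical.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolution: 16 learned filters
    nn.ReLU(),                                   # activation layer
    nn.MaxPool2d(2),                             # subsampling: max over 2x2 windows -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # fully connected layer producing class scores
)

loss_fn = nn.CrossEntropyLoss()  # the "loss layer" used during training

x = torch.randn(8, 1, 28, 28)                    # batch of 8 fake images
scores = model(x)                                # raw class scores, shape (8, 10)
loss = loss_fn(scores, torch.randint(0, 10, (8,)))
```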
2)Explain drop-out. What is it used for?
Ans)
Dropout is a technique where randomly selected neurons are ignored during training. This means their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and any weight updates are not applied to those neurons on the backward pass.
If neurons are randomly dropped out during training, then other neurons have to step in and handle the representation required to make predictions in place of the missing neurons. The result is that multiple independent internal representations are learned by the network, which makes it less likely to overfit the training data.
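A minimal sketch of dropout in PyTorch, with hypothetical layer sizes; the dropped neurons change on every forward pass during training, and dropout is switched off at evaluation time:

```python
import torch
import torch.nn as nn

# Dropout sketch: during training, each hidden unit is zeroed with probability p.
net = nn.Sequential(
    nn.Linear(100, 50),
    nn.ReLU(),
    nn.Dropout(p=0.5),        # randomly ignore ~half the hidden neurons each pass
    nn.Linear(50, 10),
)

net.train()                   # dropout active: random neurons are dropped
out_train = net(torch.randn(4, 100))

net.eval()                    # dropout disabled at inference time
out_eval = net(torch.randn(4, 100))
```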
3)What are Recurrent Neural Networks? What are LSTM based networks?
Ans)
A recurrent neural network (RNN) has connections between units which form a directed cycle. This creates an internal network state that allows it to exhibit dynamic temporal behaviour. RNNs can use their internal memory to process arbitrary input sequences, which makes them useful in applications such as speech recognition.
LSTM (Long Short-Term Memory) is a recurrent neural network architecture that can remember values over either long or short time durations. The recurrent connection of its memory cell does not pass the stored value through an activation function, so the value is not iteratively squashed over time, and the gradient (or blame term) does not tend to vanish when backpropagation through time is applied to train it.
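A minimal LSTM sketch in PyTorch for a sequence task; the input feature size, hidden size, sequence length, and class count below are assumed for illustration:

```python
import torch
import torch.nn as nn

# LSTM sketch for a sequence task (e.g. speech frames -> labels).
# Sizes are hypothetical: 40 input features, 128 hidden units, 10 classes.
lstm = nn.LSTM(input_size=40, hidden_size=128, batch_first=True)
classifier = nn.Linear(128, 10)

x = torch.randn(8, 100, 40)          # batch of 8 sequences, 100 time steps each
outputs, (h_n, c_n) = lstm(x)        # c_n is the memory cell carried across time
scores = classifier(outputs[:, -1])  # classify using the last time step
```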
4)How does a multi-layer perceptron (MLP) work? Make sure to include explanations of the techniques: feed-forward, backpropagation, full-connectivity, activation functions, and hidden layers.
Ans)
A Multi-layer Perceptron (MLP) is a supervised learning algorithm: given a training set of input-output pairs, the network must learn to model the dependency between them. It is a network of simple neurons called perceptrons, where each perceptron computes a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then, possibly, passing the output through a nonlinear activation function.
feed-forward:
A feed-forward network is a type of neural network in which connections only feed forward, that is, they do not form cycles. The term "feed forward" is also used for the pass in which an input is given at the input layer and travels from the input to the hidden layer and from the hidden to the output layer, as sketched below.
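A minimal sketch of one feed-forward pass in NumPy, with hypothetical layer sizes; each layer is a linear combination of the previous layer's outputs followed by an activation function:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Feed-forward sketch: input -> hidden -> output, no cycles.
# Shapes are hypothetical: 3 inputs, 4 hidden neurons, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input -> hidden weights
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # hidden -> output weights

x = np.array([0.5, -1.0, 2.0])
h = sigmoid(W1 @ x + b1)   # hidden layer: linear combination + activation
y = sigmoid(W2 @ h + b2)   # output layer
```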
backpropagation:
Backpropagation is a training algorithm that works in two steps:
1) Feed the values forward.
2) Calculate the error and propagate it back to the earlier layers.
Forward propagation is thus part of the backpropagation algorithm, but it comes before the back-propagating step; a sketch of both steps follows.
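A minimal NumPy sketch of both steps on a tiny network; the XOR-style data, layer sizes, and learning rate are assumed for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Backpropagation sketch on a tiny network (hypothetical XOR-style data).
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.5

for epoch in range(5000):
    # Step 1: feed the values forward.
    h = sigmoid(X @ W1 + b1)                      # hidden activations
    y = sigmoid(h @ W2 + b2)                      # network outputs

    # Step 2: compute the error and propagate it back to earlier layers.
    delta_out = (y - t) * y * (1 - y)             # output-layer error signal
    delta_hid = (delta_out @ W2.T) * h * (1 - h)  # hidden-layer error signal

    W2 -= lr * h.T @ delta_out                    # update weights by gradient descent
    b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid
    b1 -= lr * delta_hid.sum(axis=0)
```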
full-connectivity:
In neural networks, high-level reasoning is done via fully connected layers. Neurons in such a layer have full connections to all activations in the previous layer. The activations are therefore computed with a matrix multiplication followed by a bias offset.
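A minimal sketch of that computation in NumPy, with hypothetical sizes:

```python
import numpy as np

# Fully connected layer sketch: every output neuron sees every input activation.
rng = np.random.default_rng(0)
a_prev = rng.normal(size=5)   # activations of the previous layer
W = rng.normal(size=(3, 5))   # one weight per (output neuron, input activation) pair
b = np.zeros(3)               # bias offset

a = W @ a_prev + b            # matrix multiplication followed by a bias offset
```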
activation functions:
Activation functions are used to transform the activation level of a neuron into an output signal. A number of common activation functions are in use with neural networks. The most common choices for MLPs, bounded sigmoidal functions such as the logistic function, are widely used as transfer functions in research and engineering. Among the reasons for their popularity are boundedness in the unit interval, the fast computability of the function and its derivative, and a number of amenable mathematical properties in the realm of approximation theory.
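A sketch of a few common activation functions in NumPy (the specific choices here are illustrative, not the only options):

```python
import numpy as np

# Common activation functions that turn a neuron's activation level
# into an output signal.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # bounded in the unit interval (0, 1)

def tanh(z):
    return np.tanh(z)                 # bounded in (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # unbounded but very cheap to compute

z = np.linspace(-3, 3, 7)
print(sigmoid(z), tanh(z), relu(z), sep="\n")

# The derivatives are also fast to compute, e.g. sigmoid'(z) = s * (1 - s):
s = sigmoid(z)
ds = s * (1 - s)
```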
hidden layers:
The size of a neural network is given by the number of hidden neurons, and the network's complexity by the number of hidden neurons and their interrelationships. If the network is too small, it will not completely solve a particular problem, or only parts of it. In a very complex network, there is a tendency to memorize the data: the network focuses too much on the data presented for learning and tends toward modelling the noise. These cases are called under-fitting and over-fitting respectively.
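As an illustration of the trade-off, here is a sketch using scikit-learn's MLPClassifier with increasingly large hidden layers; the dataset and layer sizes are hypothetical:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Hidden-layer sizing sketch: too few hidden neurons may underfit,
# too many may memorize noise (overfit). Sizes here are hypothetical.
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for size in [(2,), (16,), (256, 256)]:
    clf = MLPClassifier(hidden_layer_sizes=size, max_iter=2000, random_state=0)
    clf.fit(X_tr, y_tr)
    # Compare training vs. test accuracy to see under- or over-fitting.
    print(size, clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```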