InterviewSolution
1. Side A: Unlabelled Data, Text Processing, Image Recognition, Object Recognition, Speech Recognition, Classification
   Side B: Recurrent Net (RNTN), Restricted Boltzmann Machine (RBM), Deep Belief Nets (DBN), Convolutional Neural Nets (CNN), MLP/RELU, Autoencoders
Answer» All of the above (option d) is the correct option. As for the second part of the question, the explanation is as follows: with a method called backpropagation, we can run into a problem called the vanishing gradient (or sometimes the exploding gradient). When that happens, training takes too long and accuracy suffers.

When we train a neural net, we are constantly calculating a cost value. The cost is typically the difference between the net's predicted output and the actual output from a set of labelled training data. The cost is then lowered by making slight adjustments to the weights and biases, over and over throughout the training process, until the lowest possible value is obtained. The training process uses a "gradient", which measures the rate at which the cost changes with respect to a change in a weight or a bias. The early layers of a network are the slowest to train, yet they are also responsible for detecting the basic features and building blocks. In face detection, for example, the early layers must figure out edges in order to correctly identify the face, and they then pass those details on to later layers, where the face's features are captured and consolidated to produce the final output.
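The vanishing-gradient behaviour described above can be observed directly in a small experiment. Below is a minimal NumPy sketch (not part of the original answer; the layer count, layer width and random data are assumed purely for illustration). It backpropagates a squared-error cost through a stack of sigmoid layers and prints the mean gradient magnitude at each layer, which shrinks as we move back toward the early layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy network: 8 fully connected sigmoid layers of width 16.
n_layers, width = 8, 16
weights = [rng.normal(0, 1 / np.sqrt(width), size=(width, width))
           for _ in range(n_layers)]
x = rng.normal(size=(width, 1))   # one input example
y = rng.normal(size=(width, 1))   # arbitrary target output

# Forward pass, keeping each layer's activation for backpropagation.
activations = [x]
a = x
for W in weights:
    a = sigmoid(W @ a)
    activations.append(a)

# Cost: squared difference between the net's prediction and the target.
cost = 0.5 * np.sum((activations[-1] - y) ** 2)

# Backward pass. The gradient measures how the cost changes w.r.t. each
# weight; every step backwards multiplies by sigmoid'(z) <= 0.25, so the
# gradient magnitude shrinks toward the early layers.
delta = (activations[-1] - y) * activations[-1] * (1 - activations[-1])
for l in reversed(range(n_layers)):
    grad_W = delta @ activations[l].T          # dCost/dW for layer l
    print(f"layer {l}: mean |dCost/dW| = {np.abs(grad_W).mean():.2e}")
    if l > 0:
        a_prev = activations[l]                # sigmoid output of layer l-1
        delta = (weights[l].T @ delta) * a_prev * (1 - a_prev)
```

Running a sketch like this typically shows the mean gradient magnitude dropping by a sizeable factor at every layer, which is exactly why the early, edge-detecting layers receive the smallest weight updates and are the slowest to train.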