This section includes curated deep learning interview questions and answers to sharpen your knowledge and support interview preparation.

1.

What do you mean by an epoch in the context of deep learning?

Answer»

An epoch is a term used in deep learning that refers to one full pass of the learning algorithm over the entire training dataset. Batches are commonly used to group the data (especially when the amount of data is very large). The term "iteration" refers to the process of running one batch through the model.

If the batch size is the entire training dataset, then the number of epochs equals the number of iterations. For practical reasons, this is usually not the case. Many models are trained over several epochs.

There is a general relation between these quantities, given by:

d * e = i * b

where,

d is the dataset size

e is the number of epochs

i is the number of iterations

b is the batch size
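As a rough illustration, here is a minimal Python sketch of this bookkeeping (the dataset size, batch size, and epoch count are illustrative assumptions):

```python
import math

dataset_size = 10_000   # d: number of training examples
batch_size = 100        # b: examples per batch
epochs = 5              # e: full passes over the training dataset

# One iteration processes one batch, so iterations per epoch = d / b.
iterations_per_epoch = math.ceil(dataset_size / batch_size)

# Total iterations over training satisfies d * e = i * b (when b divides d).
total_iterations = iterations_per_epoch * epochs

print(iterations_per_epoch)  # 100
print(total_iterations)      # 500
```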

2.

What is an activation function? What is the use of an activation function?

Answer»

An activation function in an artificial neural network is a function introduced to help the network learn complex patterns in the data. In comparison with the neuron-based model seen in our brains, the activation function decides what is to be fired to the next neuron at the end of the process. In an ANN, an activation function performs the same job: it takes the preceding cell's output signal and turns it into a form that can be used as input to the next cell.

For example, consider a simple neuron where x0 and x1 are the inputs, w1 is a weight, and a is the activation function applied to the weighted sum.

The activation function introduces non-linearity into the neural network, allowing it to learn more complex functions. The neural network would only be able to learn a function that is a linear combination of its input data if it didn't have the Activation function.

The activation function converts inputs to outputs. It is in charge of deciding whether or not a neuron should be activated, arriving at a decision by calculating the weighted total and then adding a bias. The activation function's main goal is to introduce non-linearity into a neuron's output.
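To make this concrete, here is a minimal NumPy sketch of a single neuron (the sigmoid choice and the toy numbers are illustrative assumptions, not part of the question):

```python
import numpy as np

def sigmoid(z):
    """A classic activation: squashes the weighted sum into (0, 1) non-linearly."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])   # inputs x0, x1
w = np.array([0.8, 0.3])    # weights
b = 0.1                     # bias

z = np.dot(w, x) + b        # weighted total plus bias
output = sigmoid(z)         # the activation decides what is fired onward
print(output)
```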

3.

Which deep learning algorithm is the best for face detection?

Answer»

Face detection may be accomplished using a variety of machine learning methods, but the best ones use convolutional neural networks and deep learning. Some notable face detection and recognition algorithms are FaceNet, Probabilistic Face Embeddings, ArcFace, CosFace, and SphereFace.

4.

Explain Stochastic Gradient Descent. How is it different from Batch Gradient Descent?

Answer»

Stochastic Gradient Descent: Stochastic Gradient Descent seeks to tackle the major difficulty with Batch Gradient Descent, namely the use of the entire training set to calculate the gradients at each step. It is stochastic in nature, meaning it picks a random instance of the training data at each step and then computes the gradient, which is significantly faster than Batch Gradient Descent because there is much less data to process at once. Stochastic Gradient Descent is best suited for unconstrained optimization problems. The stochastic nature of SGD has a drawback: once it gets close to the minimum, it doesn't settle down and instead bounces around, giving us a good but not optimal value for the model parameters. This can be addressed by lowering the learning rate at each step, which reduces the bouncing and allows SGD to settle down near the minimum after some time.

Following are the differences between the two:

| Batch Gradient Descent | Stochastic Gradient Descent |
| --- | --- |
| The gradient is calculated using the entire training dataset. | A single training sample is used to compute the gradient. |
| It is slow and computationally more expensive than Stochastic Gradient Descent. | It is faster and less computationally expensive than Batch Gradient Descent. |
| It is not recommended for large training samples. | It is recommended for large training samples. |
| It is deterministic (not random) in nature. | It is stochastic (random) in nature. |
| Given enough time to converge, it returns the best answer. | It provides a good solution, but not necessarily the best. |
| There is no need to shuffle the data points at random. | The training set is shuffled for each epoch because we want the samples to arrive in random order. |
| It is difficult to escape shallow local minima. | It has a better chance of escaping shallow local minima. |
| Convergence is slow. | It arrives at the convergence point substantially faster. |
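To make the contrast concrete, here is a minimal NumPy sketch of both update rules on a toy linear-regression objective (the data, learning rate, and epoch count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)   # true slope is 3.0

lr, epochs = 0.05, 50

# Batch Gradient Descent: one update per epoch using the whole dataset.
w_batch = 0.0
for _ in range(epochs):
    grad = -2.0 * np.mean((y - w_batch * X) * X)
    w_batch -= lr * grad

# Stochastic Gradient Descent: one update per (shuffled) training sample.
w_sgd = 0.0
for _ in range(epochs):
    for i in rng.permutation(len(y)):
        grad = -2.0 * (y[i] - w_sgd * X[i]) * X[i]
        w_sgd -= lr * grad

print(w_batch, w_sgd)   # both approach 3.0; SGD gets close faster but bounces
```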
5.

Explain Batch Gradient Descent.

Answer»

Batch Gradient Descent: Batch Gradient Descent computes the gradient over the entire training set at each step of gradient descent, and hence it is very slow on very big training sets. As a result, Batch Gradient Descent is extremely computationally expensive. It is ideal for error manifolds that are convex or relatively smooth. Batch Gradient Descent also scales well as the number of features grows.

6.

In a Convolutional Neural Network (CNN), how can you fix the constant validation accuracy?

Answer»

When training a neural network, constant validation accuracy is a common issue: the network simply memorizes the training samples, resulting in an over-fitting problem. Over-fitting means that the model performs admirably on the training sample, but its performance deteriorates on the validation set. Following are some ways of fixing a CNN's constant validation accuracy:

  • It is always a good idea to split the dataset into three sections: training, validation, and testing.
  • When working with limited data, this difficulty can be handled by experimenting with the neural network's parameters.
  • By increasing the training dataset's size.
  • By using batch normalization.
  • By implementing regularization (for example, dropout), as sketched below.
  • By reducing the complexity of the network.
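As a concrete illustration of the last three remedies, here is a minimal tf.keras sketch (the architecture, layer sizes, and input shape are illustrative assumptions, not a prescribed fix):

```python
import tensorflow as tf
from tensorflow.keras import layers

# A small CNN that combines batch normalization and dropout (a common
# regularizer) to fight the memorization behind flat validation accuracy.
model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.BatchNormalization(),   # normalizes activations between layers
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),           # regularization against over-fitting
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, validation_split=0.2, epochs=10)
```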
7.

Explain the difference between a shallow network and a deep network.

Answer»

Every neural network has an input layer, an output layer, and one or more hidden layers. Shallow neural networks are those that have only one hidden layer, whereas deep neural networks include numerous hidden layers. Both shallow and deep networks can fit any function; however, shallow networks require a large number of parameters, whereas deep networks, because of their several layers, can fit functions with far fewer parameters. Deep networks are currently favored over shallow networks because the model learns a new and more abstract representation of the input at each layer. In comparison to shallow networks, they are also far more efficient in terms of the number of parameters and computations.

8.

What is a tensor in deep learning?

Answer»

A tensor is a multidimensional array that represents a generalization of vectors and matrices. It is one of the key data structures used in deep learning. Tensors are represented as n-dimensional arrays of base data types. The data type of each element in a tensor is the same, and the data type is always known. It is possible that only a portion of the shape (that is, the number of dimensions and the size of each dimension) is known. Most operations yield fully-known tensors if their inputs are likewise fully known; however, in other circumstances the shape of a tensor can only be determined at graph execution time.
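A minimal NumPy sketch of tensors as n-dimensional arrays (the shapes are illustrative assumptions):

```python
import numpy as np

scalar = np.array(5.0)                 # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])     # rank 1: shape (3,)
matrix = np.ones((2, 3))               # rank 2: shape (2, 3)
images = np.zeros((32, 224, 224, 3))   # rank 4: batch x height x width x channels

for t in (scalar, vector, matrix, images):
    # Every element shares a single known dtype; ndim is the number of dimensions.
    print(t.ndim, t.shape, t.dtype)
```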

9.

Is it possible to train a neural network model by setting all biases to 0? Also, is it possible to train a neural network model by setting all of the weights to 0?

Answer»

Yes, even if all of the biases are set to zero, the neural network model has a chance of learning. 

No, training a model by setting all of the weights to 0 is impossible, since the neural network will never learn to complete the task. When all weights are set to zero, every neuron receives the same derivative for each w, so the neurons learn the same features in each iteration and the symmetry is never broken. Any constant initialization of weights, not just zero, is likely to produce a poor result.
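A minimal NumPy demonstration of why all-zero weights stall learning, using a tiny two-layer network (the architecture and data are illustrative assumptions):

```python
import numpy as np

x = np.array([1.0, 2.0])     # one training example with 2 features
y = 1.0                      # its target

W1 = np.zeros((2, 2))        # all-zero weights for two hidden neurons
w2 = np.zeros(2)             # all-zero output weights

h = np.tanh(W1 @ x)          # hidden activations: all zero
pred = w2 @ h                # prediction: zero

# Backpropagation by hand: every gradient below comes out zero, so no
# weight ever moves; with any constant (nonzero) init, both hidden
# neurons would instead receive identical gradients and stay identical.
dpred = 2 * (pred - y)
dw2 = dpred * h                        # -> [0, 0]
dh = dpred * w2                        # -> [0, 0]
dW1 = np.outer(dh * (1 - h**2), x)     # -> all zeros
print(dw2, dW1)
```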

10.

What are the advantages of transfer learning?

Answer»

Following are the advantages of transfer learning:

  • Better initial model: In other methods of learning, you must create a model from scratch. Transfer learning offers a better starting point because it allows us to perform tasks at a higher level without having to know the details of the starting model.
  • Higher learning rate: Because the model has already been trained on a similar task, transfer learning allows for a faster learning rate during training.
  • Higher accuracy after training: Thanks to the better starting point and higher learning rate, transfer learning allows a deep learning model to converge at a higher performance level, resulting in more accurate output.
11.

Explain transfer learning in the context of deep learning.

Answer»

Transfer learning is a technique that allows data scientists to reuse what was learned by a machine learning model previously trained on a similar task. The human ability to transfer knowledge serves as the analogy: if you learn to ride a bicycle, you can learn to operate other two-wheeled vehicles more easily. Likewise, a model trained for autonomous car driving can be reused for autonomous truck driving. The learned features and weights are used to initialize the new model, allowing them to be reused. Transfer learning works especially well for quickly training a model when there is limited data.

For example, rather than training a vehicle classifier from scratch, a model already trained on cats and dogs can be reused to classify different classes of vehicles; this reuse is transfer learning.
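A minimal tf.keras sketch of this reuse (the choice of MobileNetV2, the input size, and the five-class head are illustrative assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Reuse a network pre-trained on ImageNet as a fixed feature extractor.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False   # freeze the transferred features and weights

# Attach a new head for the target task (say, five vehicle classes).
model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Training now updates only the small new head, which works well
# even when data is limited.
```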

12.

Difference between multi-class and multi-label classification problems.

Answer»

The classification task in a multi-class classification problem has more than two mutually exclusive classes (classes that have no intersection or no attributes in common), whereas in a multi-label classification problem, each label has a different classification task, although the tasks are related in some way. For example, classifying a group of photographs of animals that could be cats, dogs, or bears is a multi-class classification problem; it assumes each sample can be of only one type, implying that an image can be categorized as either a cat or a dog, but not both at the same time.

Now suppose you wish to classify an image that depicts both a cat and a dog.

Such an image must be categorized as both a cat and a dog. In a multi-label classification problem, a set of labels is allocated to each sample, and the classes are not mutually exclusive: a pattern can belong to one or more classes.
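The difference shows up directly at the output layer. A minimal NumPy sketch (the logits are illustrative assumptions): multi-class models typically apply a softmax over mutually exclusive classes, while multi-label models apply an independent sigmoid per label.

```python
import numpy as np

logits = np.array([2.0, 1.0, -0.5])   # raw scores for [cat, dog, bear]

# Multi-class: softmax yields one mutually exclusive choice (sums to 1).
softmax = np.exp(logits) / np.exp(logits).sum()
print(softmax, softmax.argmax())      # exactly one winning class

# Multi-label: one sigmoid per label; each label is an independent yes/no.
sigmoid = 1.0 / (1.0 + np.exp(-logits))
print(sigmoid, sigmoid > 0.5)         # "cat" and "dog" can both be true
```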

13.

What do you mean by hyperparameters in the context of deep learning?

Answer»

Hyperparameters are variables that determine the network topology (for example, the number of hidden units) and how the network is trained (for example, the learning rate). They are set before training the model, that is, before optimizing the weights and the bias.

Following are some examples of hyperparameters:

  • Number of hidden layers and units: Combined with regularisation techniques, more hidden units inside a layer can boost accuracy, while too few units may cause underfitting.
  • Learning Rate: The learning rate is the rate at which a network's parameters are updated. A low learning rate slows the learning process but eventually converges; a higher learning rate accelerates learning but may not converge. A decaying learning rate is usually preferred.
14.

What are the different techniques to achieve data normalization?

Answer»

Following are the different techniques employed to achieve data normalization:

  • Rescaling: Rescaling data is the process of multiplying each member of a data set by a constant k, that is, mapping each value x to f(x) = kx, where k and x are both real values. The simplest of all approaches, rescaling (also known as "min-max normalization") is calculated as:

      x' = (x - min(x)) / (max(x) - min(x))

    This represents the rescaled value for every data point x.
  • Mean Normalisation: In the transformation process, this approach employs the mean of the observations:

      x' = (x - average(x)) / (max(x) - min(x))

    This represents the mean-normalized value for every data point x.
  • Z-score Normalisation: This technique, also known as standardization, employs the Z-score or "standard score", where μ is the mean and σ is the standard deviation of the observations. Machine learning algorithms such as SVM and logistic regression make use of it:

      z = (x - μ) / σ

    This represents the Z-score for every data point x.
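A minimal NumPy sketch of all three techniques (the sample data is an illustrative assumption):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-max rescaling: maps values into [0, 1].
rescaled = (x - x.min()) / (x.max() - x.min())

# Mean normalization: centers on the mean, scaled by the range.
mean_norm = (x - x.mean()) / (x.max() - x.min())

# Z-score standardization: zero mean, unit standard deviation.
z_score = (x - x.mean()) / x.std()

print(rescaled)
print(mean_norm)
print(z_score)
```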
15.

Explain Data Normalisation. What is the need for it?

Answer»

Data Normalisation is a technique in which data is transformed so that the values are either dimensionless or follow a similar distribution. It is also known as standardization and feature scaling. It is a pre-processing procedure for the input data that removes redundant data from the dataset.

Normalization gives each variable equal weight/importance, ensuring that no single variable biases model performance in its favour simply because it is larger. It vastly improves model precision by converting the values of numeric columns in a dataset to a similar scale without distorting the range of values.

16.

Explain Forward and Back Propagation in the context of deep learning.

Answer»
  • Forward Propagation: The hidden layers, between the input layer and the output layer of the network, receive inputs with weights. At each node of each hidden layer, we calculate the output of the activation, and this propagates to the next layer until we reach the final output layer. Because we go forward from the inputs to the final output layer, this is known as forward propagation.
  • Back Propagation: It sends error information from the network's last layer backwards to all of the weights within the network. It is a technique for fine-tuning the weights of a neural network based on the error rate of the previous epoch (i.e., iteration). By fine-tuning the weights, you may lower error rates and improve the model's generalization, making it more dependable. The process of backpropagation can be broken down into the following steps: generate output by propagating training data through the network; compute the error derivative for the output activations using the target and output values; backpropagate to compute the derivative of the error with respect to the previous layer's output activations, and so on for all hidden layers; calculate the error derivatives for the weights using the previously obtained derivatives; and finally update the weights based on the error derivatives obtained from the layer above.
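A minimal NumPy sketch of one forward and one backward pass through a one-hidden-layer network (the architecture, data, and learning rate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)                 # one input example with 3 features
y = 1.0                                # its target

W1 = 0.1 * rng.normal(size=(4, 3))     # hidden layer: 4 neurons
w2 = 0.1 * rng.normal(size=4)          # output layer
lr = 0.1

# Forward propagation: inputs -> hidden activations -> final output.
h = np.tanh(W1 @ x)
pred = w2 @ h
loss = (pred - y) ** 2

# Back propagation: push the error derivative back layer by layer.
dpred = 2 * (pred - y)                 # dLoss/dpred at the output
dw2 = dpred * h                        # error derivative for output weights
dh = dpred * w2                        # derivative w.r.t. hidden activations
dW1 = np.outer(dh * (1 - h**2), x)     # through the tanh to hidden weights

# Fine-tune the weights using the derivatives from the layer above.
w2 -= lr * dw2
W1 -= lr * dW1
print(loss)
```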
17.

What do you understand about gradient clipping in the context of deep learning?

Answer»

Gradient Clipping is a technique for dealing with the problem of exploding gradients (a situation in which large error gradients build up over time, resulting in massive updates to neural network weights during training) that can occur during backpropagation. The problem of exploding gradients occurs when the gradients get excessively big during training, causing the model to become unstable. If a gradient has crossed the anticipated range, its values are clipped element-by-element to a specified minimum or maximum value. Gradient clipping improves numerical stability while training a neural network, but it has little effect on the performance of the model.
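A minimal NumPy sketch of element-wise clipping by value (the threshold is an illustrative assumption; deep learning frameworks ship built-in equivalents):

```python
import numpy as np

def clip_gradients(grads, clip_value=1.0):
    """Drive each gradient element into [-clip_value, clip_value]."""
    return [np.clip(g, -clip_value, clip_value) for g in grads]

# One "exploding" gradient alongside a well-behaved one.
grads = [np.array([0.5, -120.0, 3.0]), np.array([[25.0, -0.1]])]
for g in clip_gradients(grads):
    print(g)   # oversized entries are pinned to the -1.0 / 1.0 bounds
```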

18.

What do you mean by end-to-end learning?

Answer»

It is a deep learning procedure in which a model is fed raw data and trained on the entire pipeline at once to produce the desired result, with no intermediate steps: all of the different steps are trained simultaneously rather than sequentially. End-to-end learning has the advantage of eliminating the requirement for hand-crafted feature engineering, which usually results in lower bias. Driverless automobiles are an excellent example of end-to-end learning: guided by human input, they are programmed to learn and interpret information automatically using a CNN to fulfil their tasks. Another good example is the generation of a written transcript (output) from a recorded audio clip (input). The model here skips all of the steps in the middle, instead managing the entire sequence of steps and tasks as one.

19.

What are the different types of deep neural networks?

Answer»

Following are the different types of deep neural networks:

  • FeedForward Neural Network:- This is the most basic type of neural network, in which flow control starts at the input layer and moves to the output layer. These networks have only a single layer, or a single hidden layer. Since data only flows in one direction, there is no backpropagation mechanism in this network. The input layer of this network receives the sum of the weights present in the input. Such networks are utilised in the computer vision-based facial recognition method.
  • Radial Basis Function Neural Network:- This type of neural network usually has more than one layer, preferably two. The relative distance from any location to the center is determined in this type of network and passed on to the next layer. In order to avoid blackouts, radial basis networks are commonly employed in power restoration systems to restore power in the shortest period possible.
  • Multi-Layer Perceptrons (MLP):- A multilayer perceptron (MLP) is a type of feedforward artificial neural network (ANN). MLPs are the simplest deep neural networks, consisting of a succession of completely linked layers. Each successive layer is made up of a collection of nonlinear functions that are the weighted sum of all the previous layer's outputs (completely linked). Speech recognition and other machine learning systems rely heavily on these networks.
  • Convolutional Neural Network (CNN):- Convolutional Neural Networks are mostly used in computer vision. In contrast to fully linked layers in MLPs, one or more convolution layers extract simple characteristics from input by performing convolution operations in CNN models. Each layer is made up of nonlinear functions of weighted sums at various coordinates of spatially close subsets of the previous layer's outputs, allowing the weights to be reused.
    The AI system learns to automatically extract the properties of these inputs to fulfill a specific task, such as picture classification, face identification, and image semantic segmentation, given a sequence of images or videos from the actual world.
  • Recurrent Neural Network (RNN):- Recurrent Neural Networks were created to solve the sequential input data time-series problem. RNN's input is made up of the current input and prior samples. As a result, the node connections create a directed graph. Furthermore, each neuron in an RNN has an internal memory that stores the information from previous samples' computations. Because of their superiority in processing data with a variable input length, RNN models are commonly employed in natural language processing (NLP). The goal of AI in this case is to create a system that can understand human-spoken natural languages, such as natural language modeling, word embedding, and machine translation.
    Each successive layer in an RNN is made up of nonlinear functions of weighted sums of outputs and the preceding state. As a result, the basic unit of RNN is termed "cell," and each cell is made up of layers and a succession of cells that allow recurrent neural network models to be processed sequentially.
  • Modular Neural Network:- This network is made up of numerous small neural networks, rather than being a single network. The sub-networks combine to form a larger neural network, and each operates independently to achieve a common goal. These networks are extremely useful for breaking a large problem down into smaller chunks and then solving it.
  • Sequence to Sequence Model:- In most cases, this network is made up of two RNNs. The network is based on encoding and decoding: it has an encoder that processes the input and a decoder that processes the output. This type of network is commonly employed for text processing when the length of the input text differs from the length of the output text.
20.

Explain what a deep neural network is.

Answer»

An artificial neural network (ANN) having numerous layers between the input and output layers is known as a deep neural network (DNN). Deep neural networks are neural networks that use deep architectures. The term "deep" refers to functions with a higher number of layers and units per layer. More accurate models can be created by adding more and larger layers, which capture higher levels of patterns.

21.

What are the disadvantages of neural networks?

Answer»

Following are the disadvantages of neural networks:

  • The "black box" aspect of neural networks is a well-known disadvantage. That is, we have no idea how or why our neural network produced a certain result. When we enter a dog image into a neural network and it predicts that it is a duck, we may FIND it CHALLENGING to understand what prompted it to MAKE this prediction.
  • It takes a long time to create a neural network model.
  • Neural networks models are COMPUTATIONALLY expensive to build because a lot of computations need to be done at each layer.
  • A neural network model requires significantly more data than a traditional machine learning model to train.
22.

What are the advantages of neural networks?

Answer»

Following are the advantages of neural networks:

  • Neural networks are extremely adaptable; they may be used for both classification and regression problems, as well as much more complex ones. Neural networks are also quite scalable: we can create as many layers as we wish, each with its own set of neurons. When there are a lot of data points, neural networks have been shown to generate the best outcomes. They are best used with non-linear data such as images, text, and so on, and can be applied to any data that can be transformed into numerical values.
  • Once a neural network model has been trained, it delivers output very fast. Thus, neural networks are time-effective.
23.

Explain learning rate in the context of neural network models. What happens if the learning rate is too high or too low?

Answer»

Learning rate is a number that ranges from 0 to 1. It is one of the most important tunable hyperparameters in neural network training. The learning rate determines how quickly or slowly a neural network model adapts to a given situation and learns. A higher learning rate means the model makes rapid changes and may need only a few training epochs, but it risks overshooting; a lower learning rate means the model may take a long time to converge, or may never converge and get stuck on a poor solution. As a result, it is recommended that a good learning rate value be established by trial and error rather than using a learning rate that is too low or too high.

Too large a learning rate causes the updates to overshoot and move away from the desired output, whereas a suitably small learning rate leads us to the desired output, if only slowly.
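A minimal Python sketch of this effect on the toy objective f(w) = w^2, whose gradient is 2w (the learning rates and step count are illustrative assumptions):

```python
def gradient_descent(lr, steps=20, w=5.0):
    """Minimize f(w) = w**2 with a fixed learning rate."""
    for _ in range(steps):
        w -= lr * 2 * w   # gradient of w**2 is 2w
    return w

print(gradient_descent(lr=0.01))   # too low: w barely moves toward 0
print(gradient_descent(lr=0.1))    # reasonable: w approaches 0
print(gradient_descent(lr=1.1))    # too high: w oscillates and diverges
```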

24.

What are the applications of deep learning?

Answer»

Following are some of the applications of deep learning:

  • Pattern recognition and natural language processing.
  • Image recognition and processing.
  • Automated translation.
  • Sentiment analysis.
  • Question answering systems.
  • Object classification and detection.
  • Machine handwriting generation.
  • Automated text generation.
  • Colorization of black-and-white images.
25.

What do you understand about Neural Networks in the context of Deep Learning?

Answer»

Neural Networks are artificial systems that bear a strong resemblance to the biological neural networks in the human body. A neural network is a set of algorithms that attempts to recognize underlying relationships in a batch of data using a method that mimics how the human brain works. Without any task-specific rules, these systems learn to perform tasks by being exposed to a variety of datasets and examples. The notion is that instead of being programmed with a pre-coded understanding of these datasets, the system derives identifying traits from the data it is fed. Neural networks are built on threshold logic computational models. Because neural networks can adapt to changing input, they can produce the best possible outcome without the output criteria needing to be redesigned.