InterviewSolution
| 1. |
Explain the Difference Between Classification and Regression? |
|
Answer» Classification is used to produce discrete results: it classifies data into specific categories, predicting which of a group of classes an output belongs to. Regression, on the other hand, deals with continuous data and is used to predict the relationship that the data represents. |
|
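A minimal sketch of the distinction above: the same kind of input, two kinds of prediction. The function names and thresholds are invented for illustration.

```python
# A classifier maps features to a discrete class label;
# a regressor maps features to a continuous value.

def classify_temperature(celsius):
    """Discrete output: assign one of a fixed set of categories."""
    if celsius < 10:
        return "cold"
    elif celsius < 25:
        return "mild"
    return "hot"

def predict_fahrenheit(celsius):
    """Continuous output: predict a real-valued quantity."""
    return celsius * 9.0 / 5.0 + 32.0

print(classify_temperature(18))   # a category: "mild"
print(predict_fahrenheit(18))     # a continuous value: 64.4
```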
| 2. |
What is Bias in Machine Learning? |
|
Answer» Bias in data tells us there is an inconsistency in the data. The inconsistency may occur for several reasons, which are not mutually exclusive. For example, to speed up its hiring process, the tech giant Amazon built an engine that would take in 100 resumes, spit out the top five, and those candidates would be hired. When the company realized the software was not producing gender-neutral results, it was tweaked to remove this bias. |
|
| 3. |
What are Different Kernels in SVM? |
|
Answer» There are several types of kernels in SVM. Commonly used ones include the Linear kernel, the Polynomial kernel, the Gaussian radial basis function (RBF) kernel, and the Sigmoid kernel. |
|
|
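The kernels above can be computed directly. A hedged sketch follows; `gamma`, `degree`, and `coef0` are illustrative hyperparameter choices, not values prescribed by the answer.

```python
import math

def linear_kernel(x, y):
    # K(x, y) = x . y
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # K(x, y) = (x . y + coef0)^degree
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, y))      # 2.0
print(polynomial_kernel(x, y))  # (2 + 1)^2 = 9.0
print(rbf_kernel(x, y))         # exp(-0.5 * 5)
```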
| 4. |
What are Support Vectors in SVM? |
|
Answer» A Support Vector Machine (SVM) is an algorithm that tries to fit a line (or plane, or hyperplane) between the different classes that maximizes the distance from that line to the points of the classes. In this way, it tries to find a robust separation between the classes. The support vectors are the points that lie on the edge of the margin around the dividing hyperplane, as in the figure below. [Figure: Support Vector Machine (SVM)] |
|
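A minimal sketch of the geometry in the answer above: the distance from a point x to the hyperplane w·x - b = 0 is |w·x - b| / ||w||, and the support vectors are the training points at the smallest such distance. The hyperplane and points are invented for illustration.

```python
import math

def distance_to_hyperplane(w, b, x):
    # |w . x - b| / ||w||
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot - b) / norm

w, b = [3.0, 4.0], 5.0            # illustrative hyperplane: 3x + 4y - 5 = 0
points = [[3.0, 4.0], [1.0, 1.0], [0.0, 0.0]]
dists = [distance_to_hyperplane(w, b, p) for p in points]

# The point closest to the hyperplane plays the role of a support vector.
support_vector = points[dists.index(min(dists))]
print(dists)            # [4.0, 0.4, 1.0]
print(support_vector)   # [1.0, 1.0]
```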
| 5. |
Explain SVM Algorithm in Detail |
|
Answer» A Support Vector Machine (SVM) is a very powerful and versatile supervised machine learning model, capable of performing linear or non-linear classification, regression, and even outlier detection. Suppose we are given data points that each belong to one of two classes, and the goal is to separate the two classes based on a set of examples. In SVM, a data point is viewed as a p-dimensional vector (a list of p numbers), and we want to know whether we can separate such points with a (p-1)-dimensional hyperplane. This is called a linear classifier. There are many hyperplanes that could classify the data; we choose the best hyperplane, the one that represents the largest separation, or margin, between the two classes. We have data (x1, y1), ..., (xn, yn), where each xi has features (xi1, ..., xip) and each yi is either 1 or -1. The separating hyperplane is the set of points x satisfying w · x - b = 0, where w is the normal vector of the hyperplane. The parameter b / ||w|| determines the offset of the hyperplane from the origin along the normal vector w. So for each i, xi lies on the 1 side or the -1 side of the hyperplane; the margin boundaries satisfy w · xi - b = 1 or w · xi - b = -1. [Figure: Support Vector Machine (SVM)] |
|
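Once w and b are known, the decision rule described above is just the sign of w · x - b. A minimal sketch, with an invented hyperplane for illustration:

```python
def svm_predict(w, b, x):
    # Classify by which side of the hyperplane w . x - b = 0 the point falls on.
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score - b >= 0 else -1

w, b = [1.0, -1.0], 0.0   # illustrative separating hyperplane: x1 - x2 = 0
print(svm_predict(w, b, [2.0, 1.0]))   # 1  (x1 > x2)
print(svm_predict(w, b, [1.0, 3.0]))   # -1 (x1 < x2)
```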
| 6. |
What is PCA? When do you use it? |
|
Answer» Principal component analysis (PCA) is most commonly used for dimensionality reduction. PCA measures the variation in each variable (or column in the table); if a variable shows little variation, it is effectively thrown out, as illustrated in the figure below, making the dataset easier to visualize. [Figure: Principal component analysis (PCA)] PCA is used in finance, neuroscience, and pharmacology. It is very useful as a preprocessing step, especially when there are linear correlations between features. |
|
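A hedged sketch of the idea described above: measure the variation of each column and drop columns whose variance is negligible. (Full PCA goes further and projects the data onto the directions of greatest variance; this variance filter is the simplified view given in the answer. The data and threshold are invented for illustration.)

```python
def column_variances(rows):
    # Population variance of each column in a row-major table.
    n_cols = len(rows[0])
    variances = []
    for j in range(n_cols):
        col = [row[j] for row in rows]
        mean = sum(col) / len(col)
        variances.append(sum((v - mean) ** 2 for v in col) / len(col))
    return variances

def drop_low_variance(rows, threshold=1e-3):
    # Keep only columns whose variance exceeds the threshold.
    keep = [j for j, v in enumerate(column_variances(rows)) if v > threshold]
    return [[row[j] for j in keep] for row in rows]

data = [[1.0, 100.0], [2.0, 100.0], [3.0, 100.0]]  # column 2 never varies
print(drop_low_variance(data))  # [[1.0], [2.0], [3.0]]
```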
| 7. |
What is ‘Naive’ in a Naive Bayes? |
|
Answer» Naive Bayes is a supervised learning algorithm. It is "naive" because, in applying Bayes' theorem, it assumes that all attributes are independent of each other. Bayes' theorem states the following relationship, given a class variable y and a dependent feature vector x1 through xn: P(y | x1, ..., xn) = P(y) P(x1, ..., xn | y) / P(x1, ..., xn). Using the naive conditional independence assumption that each xi is independent of the others given y, i.e. P(xi | y, x1, ..., xi-1, xi+1, ..., xn) = P(xi | y) for all i, this relationship simplifies to: P(y | x1, ..., xn) = P(y) [∏ i=1..n P(xi | y)] / P(x1, ..., xn). Since P(x1, ..., xn) is constant given the input, we can use the classification rule: ŷ = arg max over y of P(y) ∏ i=1..n P(xi | y), and we can use Maximum A Posteriori (MAP) estimation to estimate P(y) and P(xi | y); the former is then the relative frequency of class y in the training set. The different naive Bayes classifiers mainly differ in the assumptions they make regarding the distribution of P(xi | y): it can be Bernoulli, multinomial, Gaussian, and so on. |
|
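A hedged sketch of the classification rule above, ŷ = argmax over y of P(y) ∏ P(xi | y), as a tiny Bernoulli naive Bayes over word presence. Add-one (Laplace) smoothing avoids zero probabilities; the toy spam/ham data is invented for illustration.

```python
def train(samples):
    # samples: list of (set_of_words, label)
    labels = {label for _, label in samples}
    vocab = set().union(*(words for words, _ in samples))
    priors, likelihoods = {}, {}
    for y in labels:
        docs = [words for words, label in samples if label == y]
        priors[y] = len(docs) / len(samples)          # P(y): relative class frequency
        likelihoods[y] = {                            # P(x_i | y) with add-one smoothing
            w: (sum(w in d for d in docs) + 1) / (len(docs) + 2)
            for w in vocab
        }
    return priors, likelihoods

def predict(priors, likelihoods, words):
    def posterior(y):
        p = priors[y]
        for w, pw in likelihoods[y].items():
            p *= pw if w in words else (1 - pw)       # Bernoulli: present or absent
        return p
    return max(priors, key=posterior)                 # argmax over classes

samples = [({"win", "money"}, "spam"), ({"free", "money"}, "spam"),
           ({"meeting", "notes"}, "ham"), ({"project", "notes"}, "ham")]
priors, likelihoods = train(samples)
print(predict(priors, likelihoods, {"free", "money"}))       # spam
print(predict(priors, likelihoods, {"meeting", "project"}))  # ham
```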
| 8. |
What is Unsupervised Learning? |
|
Answer» Unsupervised learning is a type of machine learning algorithm used to find patterns in a given set of data. Here we have no dependent variable or label to predict. Common unsupervised learning algorithms include clustering (e.g. k-means), association rule mining, and dimensionality reduction (e.g. PCA).
In the T-shirt example, unsupervised clustering will group the shirts into categories it discovers itself, such as "collar style and V neck style", "crew neck style" and "sleeve types". |
|
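A hedged sketch of clustering, the classic unsupervised technique: a tiny one-dimensional k-means that finds groups with no labels provided. The sizes and initial centers are invented for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center's cluster.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

sizes = [36.0, 38.0, 37.0, 52.0, 54.0, 53.0]  # e.g. T-shirt chest sizes
centers, clusters = kmeans_1d(sizes, centers=[36.0, 54.0])
print(sorted(centers))  # two discovered groups: small sizes vs large sizes
```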
| 9. |
What is Supervised Learning? |
|
Answer» Supervised learning is a machine learning approach that infers a function from labeled training data, where the training data consists of a set of training examples. Example 1: given the height and weight of a person, identify their gender. Popular supervised learning algorithms include linear regression, logistic regression, decision trees, random forest, support vector machines (SVM), k-nearest neighbors, and naive Bayes.
Example 2: if you build a T-shirt classifier, the labels will be "this is an S, this is an M and this is an L", based on showing the classifier examples of S, M, and L shirts. |
|
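A minimal supervised-learning sketch for Example 1 above: a 1-nearest-neighbour classifier that predicts gender from labelled (height, weight) examples. The training points are invented for illustration.

```python
import math

def nearest_neighbor(train, query):
    # train: list of ((height_cm, weight_kg), label)
    # Predict the label of the labelled example closest to the query point.
    return min(train, key=lambda p: math.dist(p[0], query))[1]

train = [((180.0, 80.0), "male"), ((175.0, 78.0), "male"),
         ((160.0, 55.0), "female"), ((165.0, 58.0), "female")]
print(nearest_neighbor(train, (178.0, 79.0)))  # male
print(nearest_neighbor(train, (162.0, 56.0)))  # female
```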
| 10. |
What are Different Types of Machine Learning algorithms? |
|
Answer» There are various types of machine learning algorithms. Broadly, they fall into the following categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. |
|
|
| 11. |
Why was Machine Learning Introduced? |
|
Answer» The simplest answer is: to make our lives easier. In the early days of "intelligent" applications, many systems used hardcoded rules of "if" and "else" decisions to process data or adjust to user input. Think of a spam filter whose job is to move the appropriate incoming email messages to a spam folder. With machine learning algorithms, we instead give the machine ample information so it can learn and identify patterns in the data. Unlike with conventional problems, we don't need to write new rules for each problem in machine learning; we just use the same workflow with a different dataset. Consider Alan Turing, who in his 1950 paper "Computing Machinery and Intelligence" asked, "Can machines think?" The paper describes the "Imitation Game", which includes three participants: a computer, a human respondent, and a human judge.
The judge asks the other two participants questions. As they respond, the judge must decide which response came from the computer. If the judge cannot tell the difference, the computer wins the game. The test continues today as an annual competition in artificial intelligence. The aim is simple enough: convince the judge that they are chatting with a human instead of a computer chatbot program. |
|
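A hedged sketch contrasting the two approaches in the answer above: a hardcoded if/else spam rule versus a rule "learned" from labelled data. The word lists and messages are invented for illustration.

```python
def hardcoded_filter(message):
    # Early approach: one hand-written rule per known pattern.
    if "lottery" in message or "prize" in message:
        return "spam"
    return "inbox"

def learn_spam_words(labelled_messages):
    # Learned approach: extract words that only ever appear in spam messages.
    spam_words, ham_words = set(), set()
    for text, label in labelled_messages:
        (spam_words if label == "spam" else ham_words).update(text.split())
    return spam_words - ham_words

data = [("claim your free prize now", "spam"),
        ("cheap meds available now", "spam"),
        ("free meeting slots available", "ham")]
learned = learn_spam_words(data)
print(hardcoded_filter("you won a prize"))  # spam, via a hand-written rule
print("meds" in learned)   # True: picked up from data, no rule written by hand
print("free" in learned)   # False: appears in ham too, so it is not a spam cue
```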