InterviewSolution
| 1. |
Explain the Difference Between Classification and Regression? |
|
Answer» Classification is used to produce discrete results: it classifies data into specific categories, predicting which of a group of classes an output belongs to. Regression, on the other hand, deals with continuous data and is used to predict the relationship that the data represents. |
|
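A minimal sketch of the distinction above: the same kind of input, two kinds of prediction. The function names and thresholds are invented for illustration.

```python
# A classifier maps features to a discrete class label;
# a regressor maps features to a continuous value.

def classify_temperature(celsius):
    """Discrete output: assign one of a fixed set of categories."""
    if celsius < 10:
        return "cold"
    elif celsius < 25:
        return "mild"
    return "hot"

def predict_fahrenheit(celsius):
    """Continuous output: predict a real-valued quantity."""
    return celsius * 9.0 / 5.0 + 32.0

print(classify_temperature(18))   # a category: "mild"
print(predict_fahrenheit(18))     # a continuous value: 64.4
```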
| 2. |
What is Bias in Machine Learning? |
|
Answer» Bias in data tells us there is an inconsistency in the data. The inconsistency may occur for several reasons, which are not mutually exclusive. For example, to speed up its hiring process, the tech giant Amazon built an engine that would take in 100 resumes, spit out the top five, and those candidates would be hired. When the company realized the software was not producing gender-neutral results, it was tweaked to remove this bias. |
|
| 3. |
What are Different Kernels in SVM? |
|
Answer» There are several types of kernels in SVM. Commonly used ones include the Linear kernel, the Polynomial kernel, the Gaussian radial basis function (RBF) kernel, and the Sigmoid kernel. |
|
|
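The kernels above can be computed directly. A hedged sketch follows; `gamma`, `degree`, and `coef0` are illustrative hyperparameter choices, not values prescribed by the answer.

```python
import math

def linear_kernel(x, y):
    # K(x, y) = x . y
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # K(x, y) = (x . y + coef0)^degree
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, y))      # 2.0
print(polynomial_kernel(x, y))  # (2 + 1)^2 = 9.0
print(rbf_kernel(x, y))         # exp(-0.5 * 5)
```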
| 4. |
What are Support Vectors in SVM? |
|
Answer» A Support Vector Machine (SVM) is an algorithm that tries to fit a line (or plane, or hyperplane) between the different classes that maximizes the distance from that line to the points of the classes. In this way, it tries to find a robust separation between the classes. The support vectors are the points that lie on the edge of the margin around the dividing hyperplane, as in the figure below. [Figure: Support Vector Machine (SVM)] |
|
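A minimal sketch of the geometry in the answer above: the distance from a point x to the hyperplane w·x - b = 0 is |w·x - b| / ||w||, and the support vectors are the training points at the smallest such distance. The hyperplane and points are invented for illustration.

```python
import math

def distance_to_hyperplane(w, b, x):
    # |w . x - b| / ||w||
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot - b) / norm

w, b = [3.0, 4.0], 5.0            # illustrative hyperplane: 3x + 4y - 5 = 0
points = [[3.0, 4.0], [1.0, 1.0], [0.0, 0.0]]
dists = [distance_to_hyperplane(w, b, p) for p in points]

# The point closest to the hyperplane plays the role of a support vector.
support_vector = points[dists.index(min(dists))]
print(dists)            # [4.0, 0.4, 1.0]
print(support_vector)   # [1.0, 1.0]
```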
| 5. |
Explain SVM Algorithm in Detail |
|
Answer» A Support Vector Machine (SVM) is a very powerful and versatile supervised machine learning model, capable of performing linear or non-linear classification, regression, and even outlier detection. Suppose we are given data points that each belong to one of two classes, and the goal is to separate the two classes based on a set of examples. In SVM, a data point is viewed as a p-dimensional vector (a list of p numbers), and we want to know whether we can separate such points with a (p-1)-dimensional hyperplane. This is called a linear classifier. There are many hyperplanes that could classify the data; we choose the best hyperplane, the one that represents the largest separation, or margin, between the two classes. We have data (x1, y1), ..., (xn, yn), where each xi has features (xi1, ..., xip) and each yi is either 1 or -1. The separating hyperplane is the set of points x satisfying w · x - b = 0, where w is the normal vector of the hyperplane. The parameter b / ||w|| determines the offset of the hyperplane from the origin along the normal vector w. So for each i, xi lies on the 1 side or the -1 side of the hyperplane; the margin boundaries satisfy w · xi - b = 1 or w · xi - b = -1. [Figure: Support Vector Machine (SVM)] |
|
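Once w and b are known, the decision rule described above is just the sign of w · x - b. A minimal sketch, with an invented hyperplane for illustration:

```python
def svm_predict(w, b, x):
    # Classify by which side of the hyperplane w . x - b = 0 the point falls on.
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score - b >= 0 else -1

w, b = [1.0, -1.0], 0.0   # illustrative separating hyperplane: x1 - x2 = 0
print(svm_predict(w, b, [2.0, 1.0]))   # 1  (x1 > x2)
print(svm_predict(w, b, [1.0, 3.0]))   # -1 (x1 < x2)
```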
| 6. |
What is PCA? When do you use it? |
|
Answer» Principal component analysis (PCA) is most commonly used for dimensionality reduction. PCA measures the variation in each variable (or column in the table); if a variable shows little variation, it is effectively thrown out, as illustrated in the figure below, making the dataset easier to visualize. [Figure: Principal component analysis (PCA)] PCA is used in finance, neuroscience, and pharmacology. It is very useful as a preprocessing step, especially when there are linear correlations between features. |
|
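A hedged sketch of the idea described above: measure the variation of each column and drop columns whose variance is negligible. (Full PCA goes further and projects the data onto the directions of greatest variance; this variance filter is the simplified view given in the answer. The data and threshold are invented for illustration.)

```python
def column_variances(rows):
    # Population variance of each column in a row-major table.
    n_cols = len(rows[0])
    variances = []
    for j in range(n_cols):
        col = [row[j] for row in rows]
        mean = sum(col) / len(col)
        variances.append(sum((v - mean) ** 2 for v in col) / len(col))
    return variances

def drop_low_variance(rows, threshold=1e-3):
    # Keep only columns whose variance exceeds the threshold.
    keep = [j for j, v in enumerate(column_variances(rows)) if v > threshold]
    return [[row[j] for j in keep] for row in rows]

data = [[1.0, 100.0], [2.0, 100.0], [3.0, 100.0]]  # column 2 never varies
print(drop_low_variance(data))  # [[1.0], [2.0], [3.0]]
```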
| 7. |
What is ‘Naive’ in a Naive Bayes? |
|
Answer» Naive Bayes is a supervised learning algorithm. It is "naive" because, in applying Bayes' theorem, it assumes that all attributes are independent of each other. Bayes' theorem states the following relationship, given a class variable y and a dependent feature vector x1 through xn: P(y | x1, ..., xn) = P(y) P(x1, ..., xn | y) / P(x1, ..., xn). Using the naive conditional independence assumption that each xi is independent of the others given y, i.e. P(xi | y, x1, ..., xi-1, xi+1, ..., xn) = P(xi | y) for all i, this relationship simplifies to: P(y | x1, ..., xn) = P(y) [∏ i=1..n P(xi | y)] / P(x1, ..., xn). Since P(x1, ..., xn) is constant given the input, we can use the classification rule: ŷ = arg max over y of P(y) ∏ i=1..n P(xi | y), and we can use Maximum A Posteriori (MAP) estimation to estimate P(y) and P(xi | y); the former is then the relative frequency of class y in the training set. The different naive Bayes classifiers mainly differ in the assumptions they make regarding the distribution of P(xi | y): it can be Bernoulli, multinomial, Gaussian, and so on. |
|
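A hedged sketch of the classification rule above, ŷ = argmax over y of P(y) ∏ P(xi | y), as a tiny Bernoulli naive Bayes over word presence. Add-one (Laplace) smoothing avoids zero probabilities; the toy spam/ham data is invented for illustration.

```python
def train(samples):
    # samples: list of (set_of_words, label)
    labels = {label for _, label in samples}
    vocab = set().union(*(words for words, _ in samples))
    priors, likelihoods = {}, {}
    for y in labels:
        docs = [words for words, label in samples if label == y]
        priors[y] = len(docs) / len(samples)          # P(y): relative class frequency
        likelihoods[y] = {                            # P(x_i | y) with add-one smoothing
            w: (sum(w in d for d in docs) + 1) / (len(docs) + 2)
            for w in vocab
        }
    return priors, likelihoods

def predict(priors, likelihoods, words):
    def posterior(y):
        p = priors[y]
        for w, pw in likelihoods[y].items():
            p *= pw if w in words else (1 - pw)       # Bernoulli: present or absent
        return p
    return max(priors, key=posterior)                 # argmax over classes

samples = [({"win", "money"}, "spam"), ({"free", "money"}, "spam"),
           ({"meeting", "notes"}, "ham"), ({"project", "notes"}, "ham")]
priors, likelihoods = train(samples)
print(predict(priors, likelihoods, {"free", "money"}))       # spam
print(predict(priors, likelihoods, {"meeting", "project"}))  # ham
```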
| 8. |
What is Unsupervised Learning? |
|
Answer» Unsupervised learning is a type of machine learning algorithm used to find patterns in a given set of data. Here we have no dependent variable or label to predict. Common unsupervised learning algorithms include clustering (e.g. k-means), association rule mining, and dimensionality reduction (e.g. PCA).
In the T-shirt example, unsupervised clustering will group the shirts into categories it discovers itself, such as "collar style and V neck style", "crew neck style" and "sleeve types". |
|
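A hedged sketch of clustering, the classic unsupervised technique: a tiny one-dimensional k-means that finds groups with no labels provided. The sizes and initial centers are invented for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center's cluster.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

sizes = [36.0, 38.0, 37.0, 52.0, 54.0, 53.0]  # e.g. T-shirt chest sizes
centers, clusters = kmeans_1d(sizes, centers=[36.0, 54.0])
print(sorted(centers))  # two discovered groups: small sizes vs large sizes
```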
| 9. |
What is Supervised Learning? |
|
Answer» Supervised learning is a machine learning approach that infers a function from labeled training data, where the training data consists of a set of training examples. Example 1: given the height and weight of a person, identify their gender. Popular supervised learning algorithms include linear regression, logistic regression, decision trees, random forest, support vector machines (SVM), k-nearest neighbors, and naive Bayes.
Example 2: if you build a T-shirt classifier, the labels will be "this is an S, this is an M and this is an L", based on showing the classifier examples of S, M, and L shirts. |
|
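A minimal supervised-learning sketch for Example 1 above: a 1-nearest-neighbour classifier that predicts gender from labelled (height, weight) examples. The training points are invented for illustration.

```python
import math

def nearest_neighbor(train, query):
    # train: list of ((height_cm, weight_kg), label)
    # Predict the label of the labelled example closest to the query point.
    return min(train, key=lambda p: math.dist(p[0], query))[1]

train = [((180.0, 80.0), "male"), ((175.0, 78.0), "male"),
         ((160.0, 55.0), "female"), ((165.0, 58.0), "female")]
print(nearest_neighbor(train, (178.0, 79.0)))  # male
print(nearest_neighbor(train, (162.0, 56.0)))  # female
```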
| 10. |
What are Different Types of Machine Learning algorithms? |
|
Answer» There are various types of machine learning algorithms. Broadly, they fall into the following categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. |
|
|
| 11. |
Why was Machine Learning Introduced? |
|
Answer» The simplest answer is: to make our lives easier. In the early days of "intelligent" applications, many systems used hardcoded rules of "if" and "else" decisions to process data or adjust to user input. Think of a spam filter whose job is to move the appropriate incoming email messages to a spam folder. With machine learning algorithms, we instead give the machine ample information so it can learn and identify patterns in the data. Unlike with conventional problems, we don't need to write new rules for each problem in machine learning; we just use the same workflow with a different dataset. Consider Alan Turing, who in his 1950 paper "Computing Machinery and Intelligence" asked, "Can machines think?" The paper describes the "Imitation Game", which includes three participants: a computer, a human respondent, and a human judge.
The judge asks the other two participants questions. As they respond, the judge must decide which response came from the computer. If the judge cannot tell the difference, the computer wins the game. The test continues today as an annual competition in artificial intelligence. The aim is simple enough: convince the judge that they are chatting with a human instead of a computer chatbot program. |
|
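A hedged sketch contrasting the two approaches in the answer above: a hardcoded if/else spam rule versus a rule "learned" from labelled data. The word lists and messages are invented for illustration.

```python
def hardcoded_filter(message):
    # Early approach: one hand-written rule per known pattern.
    if "lottery" in message or "prize" in message:
        return "spam"
    return "inbox"

def learn_spam_words(labelled_messages):
    # Learned approach: extract words that only ever appear in spam messages.
    spam_words, ham_words = set(), set()
    for text, label in labelled_messages:
        (spam_words if label == "spam" else ham_words).update(text.split())
    return spam_words - ham_words

data = [("claim your free prize now", "spam"),
        ("cheap meds available now", "spam"),
        ("free meeting slots available", "ham")]
learned = learn_spam_words(data)
print(hardcoded_filter("you won a prize"))  # spam, via a hand-written rule
print("meds" in learned)   # True: picked up from data, no rule written by hand
print("free" in learned)   # False: appears in ham too, so it is not a spam cue
```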