Interview Solutions

This section collects machine learning interview questions with curated answers to sharpen your knowledge and support interview preparation.
1. Difference Between Sigmoid and Softmax functions?

Answer» The sigmoid function is used for binary classification; it squashes each score independently into (0, 1), so the output probabilities do not need to sum to 1. The softmax function is used for multi-class classification; it converts a vector of scores into a probability distribution, so the output probabilities always sum to 1.

Conclusion: The questions listed above cover the basics of machine learning. The field is advancing quickly and new concepts keep emerging, so stay up to date by joining communities, attending conferences, and reading research papers. Doing so will help you crack any ML interview.
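To make the sigmoid/softmax distinction concrete, here is a minimal NumPy sketch (illustrative only; the helper names are our own, not from the answer above):

```python
import numpy as np

def sigmoid(z):
    # Maps each score independently into (0, 1); outputs need not sum to 1.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Exponentiates and normalizes, so outputs form a distribution summing to 1.
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(sigmoid(scores))                         # independent probabilities; sum is not 1 in general
print(softmax(scores), softmax(scores).sum())  # a distribution; sum is exactly 1.0
```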
2. What is Reinforcement Learning?

Answer» Reinforcement learning is different from other types of learning such as supervised and unsupervised learning. In reinforcement learning, we are given neither data nor labels; the agent learns from the rewards given to it by the environment.
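As a toy illustration of reward-driven learning (a hypothetical epsilon-greedy bandit sketch of our own, not a specific algorithm from the answer), an agent can learn which action pays best purely from rewards:

```python
import random

# Hypothetical environment: three actions with unknown average rewards.
true_means = [0.2, 0.5, 0.8]

def pull(action):
    # Reward of 1 with probability true_means[action], else 0.
    return 1.0 if random.random() < true_means[action] else 0.0

estimates = [0.0, 0.0, 0.0]   # agent's running estimate of each action's value
counts = [0, 0, 0]
epsilon = 0.1                 # exploration rate

for step in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                        # explore
    else:
        action = max(range(3), key=lambda a: estimates[a])  # exploit
    reward = pull(action)
    counts[action] += 1
    # Incremental mean update: learning happens only through rewards.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # should approach [0.2, 0.5, 0.8]
```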
3. What are Parametric and Non-Parametric Models?

Answer» A parametric model has a fixed, limited number of parameters, and to predict new data you only need to know those parameters. A non-parametric model places no fixed limit on the number of parameters, which allows more flexibility; to predict new data you need both the model parameters and the state of the data observed so far.
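A rough way to see the difference, assuming scikit-learn is available (an illustrative sketch, not part of the original answer): linear regression stores only a fixed set of coefficients, while k-nearest neighbours must keep the training data itself to make predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

X = np.random.rand(200, 3)
y = X @ np.array([1.5, -2.0, 0.7]) + 0.1 * np.random.randn(200)

# Parametric: everything needed for prediction fits in a few numbers.
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)

# Non-parametric: prediction consults the stored training points directly.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
print(knn.predict(X[:2]))
```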
4. What is P-value?

Answer» P-values are used to make a decision about a hypothesis test. The p-value is the smallest significance level at which you could reject the null hypothesis. The lower the p-value, the stronger the evidence against the null hypothesis and the more likely you are to reject it.
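For example, assuming SciPy is available (an illustrative sketch): a one-sample t-test returns a p-value that we compare against a chosen significance level such as 0.05.

```python
import numpy as np
from scipy import stats

# Sample drawn from a distribution with true mean 0.3; null hypothesis: mean == 0.
sample = np.random.normal(loc=0.3, scale=1.0, size=100)

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
print(p_value)

if p_value < 0.05:
    print("Reject the null hypothesis at the 5% significance level.")
else:
    print("Fail to reject the null hypothesis.")
```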
5. Explain Correlation and Covariance?

Answer» Correlation is used for measuring and estimating the quantitative relationship between two variables: it measures how strongly the variables are related, for example income and expenditure, or demand and supply. Covariance is a simpler measure of how two variables vary together, but the problem with covariances is that they are hard to compare across datasets without normalization; correlation is essentially covariance normalized by the standard deviations of the two variables.
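A small NumPy sketch of the relationship (illustrative only; the variable names are our own):

```python
import numpy as np

income = np.array([30, 40, 50, 60, 80], dtype=float)
expenditure = np.array([22, 28, 35, 41, 60], dtype=float)

cov = np.cov(income, expenditure)[0, 1]        # scale-dependent, hard to compare
corr = np.corrcoef(income, expenditure)[0, 1]  # normalized to [-1, 1]

# Correlation is covariance divided by the product of the standard deviations.
manual = cov / (np.std(income, ddof=1) * np.std(expenditure, ddof=1))
print(cov, corr, manual)
```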
6. Can logistic regression be used for more than 2 classes?

Answer» No, by default logistic regression is a binary classifier, so it cannot be applied to more than 2 classes as-is. However, it can be extended to solve multi-class classification problems (multinomial logistic regression, also called softmax regression).
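Assuming scikit-learn is available, a sketch of the multinomial extension on a 3-class dataset (illustrative only):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 3 classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Recent scikit-learn versions fit a multinomial (softmax) model over all
# classes by default, extending plain binary logistic regression.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```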
7. How do you check the Normality of a dataset?

Answer» Visually, we can use plots. A few common normality checks are as follows (see the sketch after this list):
- Histogram or kernel density plot of the data
- Q-Q (quantile-quantile) plot against a normal distribution
- Box plot to inspect symmetry and outliers
- Statistical tests such as the Shapiro-Wilk or Kolmogorov-Smirnov test
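A minimal sketch of the statistical route, assuming SciPy is available (illustrative only):

```python
import numpy as np
from scipy import stats

data = np.random.normal(loc=0.0, scale=1.0, size=500)

# Shapiro-Wilk: the null hypothesis is that the data are normally distributed.
stat, p_value = stats.shapiro(data)
print(stat, p_value)

if p_value > 0.05:
    print("No evidence against normality at the 5% level.")
else:
    print("Data are unlikely to be normally distributed.")
```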
8. What are Recommender Systems?

Answer» A recommendation engine is a system used to predict users' interests and recommend products that they are likely to find interesting. The data required for recommender systems comes from explicit user ratings given after watching a film or listening to a song, from implicit signals such as search engine queries and purchase histories, or from other knowledge about the users or items themselves.
9. How can you select K for K-means Clustering?

Answer» There are two kinds of methods: direct methods and statistical testing methods.
- Direct methods: the elbow method and the silhouette method, which evaluate how compact and well separated the clusters are for each candidate K.
- Statistical testing methods: for example the gap statistic, which compares the clustering against a null reference distribution.
The silhouette is the most frequently used approach for determining the optimal value of K (a sketch follows).
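Assuming scikit-learn is available, a sketch of picking K by silhouette score (illustrative only):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # higher is better

best_k = max(scores, key=scores.get)
print(scores, "-> best K:", best_k)  # should pick K = 4 for this synthetic data
```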
10. What is Clustering?

Answer» Clustering is the process of grouping a set of objects into a number of groups. Objects should be similar to one another within the same cluster and dissimilar to those in other clusters. A few types of clustering are:
- Partition-based clustering (e.g., K-means)
- Hierarchical clustering
- Density-based clustering (e.g., DBSCAN)
- Model-based clustering (e.g., Gaussian mixture models)
11. What is Collaborative Filtering? And Content-Based Filtering?

Answer» Collaborative filtering is a proven technique for personalized content recommendations. It is a type of recommendation system that predicts new content for an individual user by matching that user's interests with the preferences of many other users. Content-based recommender systems focus only on the preferences of the user: new recommendations are made from content similar to the user's previous choices.
(Figure: Collaborative Filtering and Content-Based Filtering.)
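A toy sketch of the collaborative idea (a hypothetical ratings matrix and user-based cosine similarity of our own, not taken from the answer):

```python
import numpy as np

# Rows = users, columns = items; 0 means "not rated yet".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

target = 0  # recommend for user 0
sims = np.array([cosine(ratings[target], ratings[u]) for u in range(len(ratings))])
sims[target] = 0.0  # ignore self-similarity

# Predict unseen items as a similarity-weighted average of other users' ratings.
pred = sims @ ratings / (sims.sum() + 1e-9)
unseen = np.where(ratings[target] == 0)[0]
print({int(i): round(float(pred[i]), 2) for i in unseen})
```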
12. What is a Random Forest? How does it work?

Answer» Random forest is a versatile machine learning method capable of performing both regression and classification tasks. Like other ensemble methods such as bagging and boosting, it works by combining a set of individual tree models. Each tree is built from a random bootstrap sample of the rows of the training data, using a random subset of the columns at each split. Here are the steps a random forest follows to create the trees (see the sketch after this list):
- Draw a bootstrap sample (random rows, with replacement) from the training data.
- Grow a decision tree on that sample, considering only a random subset of features at each split.
- Repeat to build many such trees.
- Aggregate the predictions of all trees: majority vote for classification, average for regression.
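Assuming scikit-learn, a short sketch (illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 200 trees, each grown on a bootstrap sample, with a random subset of
# features ("sqrt" of the total) considered at every split.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))  # predictions are a majority vote across the trees
```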
13. How to Handle Outlier Values?

Answer» An outlier is an observation in the dataset that is far away from the other observations. Tools used to discover outliers include box plots, scatter plots, the Z-score, and the interquartile range (IQR). Typically, we follow three simple strategies to handle outliers (a detection-and-handling sketch follows this list):
- Drop the outlier records if they are clearly errors or irrelevant.
- Cap (winsorize) the values at a reasonable upper or lower bound.
- Transform the data (e.g., a log transform) or assign the outliers a new, imputed value.
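A quick IQR-based detection sketch in NumPy (illustrative data; the thresholds follow the conventional 1.5 × IQR rule):

```python
import numpy as np

data = np.array([10, 12, 12, 13, 12, 11, 14, 13, 15, 102, 12, 14, 17, 19, 107], dtype=float)

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = data[(data < lower) | (data > upper)]
capped = np.clip(data, lower, upper)                # "capping" strategy
cleaned = data[(data >= lower) & (data <= upper)]   # "dropping" strategy

print(outliers)  # [102. 107.]
```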
14. How do you make sure which Machine Learning Algorithm to use?

Answer» It completely depends on the dataset we have. For example, if the target is discrete (a classification problem) we might start with an SVM, and if the target is continuous (a regression problem) we might start with linear regression. There is no fixed rule that tells us which ML algorithm to use; it all comes down to exploratory data analysis (EDA). EDA is like "interviewing" the dataset; as part of that interview we do the following (a minimal starting sketch follows):
- Classify the variables as continuous, categorical, and so on.
- Summarize the variables using descriptive statistics.
- Visualize the variables with histograms, box plots, and scatter plots.
- Check for missing values, outliers, and correlations between features.
Based on these observations we select the best-fit algorithm for the particular dataset.
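Assuming pandas, a minimal EDA starting point might look like this (illustrative only; the column names and generated data are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical dataset; in practice this would come from pd.read_csv(...).
df = pd.DataFrame({
    "age": np.random.randint(18, 70, size=100),
    "income": np.random.normal(50_000, 15_000, size=100),
    "segment": np.random.choice(["A", "B", "C"], size=100),
})

print(df.dtypes)                    # continuous vs. categorical variables
print(df.describe())                # descriptive statistics
print(df.isna().sum())              # missing values per column
print(df.corr(numeric_only=True))   # correlations between numeric features
```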
15. What is Ensemble learning?

Answer» Ensemble learning is a method that combines multiple machine learning models to create a more powerful model. There are many reasons for the individual models to differ; a few are:
- They use different algorithms.
- They are trained on different subsets or resamplings of the data.
- They use different hyperparameters or random initializations.
When working with the model's training and testing data we will always observe some error, which can be decomposed into bias, variance, and irreducible error. A good model keeps a balance between bias and variance, known as the bias-variance trade-off, and ensemble learning is one way to manage this trade-off. Many ensemble techniques are available, but when aggregating multiple models there are two general methods (see the sketch after this list):
- Bagging: train models in parallel on bootstrap samples and combine their predictions by averaging or voting, which mainly reduces variance.
- Boosting: train models sequentially, each one focusing on the errors of the previous ones, which mainly reduces bias.
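Assuming scikit-learn, a side-by-side sketch of the two aggregation styles (illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging: decision trees (the default base learner) trained in parallel on
# bootstrap samples, with predictions combined by voting -- targets variance.
bagging = BaggingClassifier(n_estimators=100, random_state=0)

# Boosting: trees trained sequentially, each correcting the errors of the
# previous ones -- targets bias.
boosting = GradientBoostingClassifier(n_estimators=100, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```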
16. What are Loss Function and Cost Functions? Explain the key Difference Between them?

Answer» When calculating the error for a single data point we use the term loss function, whereas when calculating the aggregate error over multiple data points we use the term cost function; there is no major difference beyond that. In other words, the loss function captures the difference between the actual and predicted values for a single record, while the cost function aggregates that difference over the entire training dataset. The most commonly used loss functions are mean squared error and hinge loss.
Mean Squared Error (MSE): in simple words, it measures how far the model's predicted values are from the actual values: MSE = (1/n) Σ (predicted value − actual value)².
Hinge loss: it is used to train classifiers such as SVMs: L(y) = max(0, 1 − t·y), where t = −1 or +1 indicates the true class and y represents the raw output of the classifier.
A commonly cited example of a cost function expresses the total cost as the sum of fixed costs and variable costs, as in the linear equation y = mx + b.
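A small NumPy sketch of both losses (illustrative values only):

```python
import numpy as np

# Mean squared error: average of squared differences over all records (a cost).
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])
mse = np.mean((y_pred - y_true) ** 2)
print(mse)  # 0.375

# Hinge loss for one record: t is the true class in {-1, +1}, score is the
# classifier's raw output; the loss is zero once the margin t * score >= 1.
def hinge(t, score):
    return max(0.0, 1.0 - t * score)

print(hinge(+1, 0.3))   # 0.7 -> correct side but inside the margin
print(hinge(-1, 0.3))   # 1.3 -> wrong side, penalized more heavily
```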
17. What is a Neural Network?

Answer» It is a simplified model of the human brain. Much like the brain, it has neurons that activate when encountering something similar. The different neurons are connected via connections that help information flow from one neuron to another.
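As a minimal illustration (a hypothetical two-layer forward pass in NumPy of our own, not a complete trainable network):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # one input example with 3 features

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # 3 inputs -> 4 hidden neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # 4 hidden -> 1 output neuron

hidden = sigmoid(W1 @ x + b1)       # each hidden neuron "activates" on its inputs
output = sigmoid(W2 @ hidden + b2)  # information flows along weighted connections
print(output)
```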
18. How to Tackle Overfitting and Underfitting?

Answer» Overfitting means the model fits the training data too well; in this case we should resample the data and estimate the model's accuracy using techniques like k-fold cross-validation. In the underfitting case, the model is not able to understand or capture the patterns in the data; here we should change the algorithm or feed more data points to the model.
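Assuming scikit-learn, a sketch of using k-fold cross-validation to expose overfitting (illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# An unconstrained tree can memorize the training data (overfit).
deep_tree = DecisionTreeClassifier(random_state=0)
print("train accuracy:", deep_tree.fit(X, y).score(X, y))                     # ~1.0
print("5-fold CV accuracy:", cross_val_score(deep_tree, X, y, cv=5).mean())   # noticeably lower
```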
19. Define Precision and Recall?

Answer» Precision and recall are ways of monitoring the quality of a machine learning model's predictions, and they are often used together. Precision answers the question, "Out of the items that the classifier predicted to be relevant, how many are truly relevant?" Recall answers the question, "Out of all the items that are truly relevant, how many were found by the classifier?" In everyday language, precision means being exact and accurate, and the same idea carries over to a machine learning model: of the items the model predicts to be relevant, how many really are?
(Figure: Venn diagram of precision and recall.)
Mathematically, precision and recall can be defined as follows:
precision = # correct answers returned / # total items returned by the ranker (i.e., TP / (TP + FP))
recall = # correct answers returned / # total relevant answers (i.e., TP / (TP + FN))
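Assuming scikit-learn, a quick sketch with illustrative labels of our own:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # 3 TP, 1 FP, 1 FN, 3 TN

print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75
```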
20. What is F1 score? How would you use it?

Answer» Let's have a look at this confusion matrix before jumping into the F1 score:

|                    | Actual Positive     | Actual Negative     |
|--------------------|---------------------|---------------------|
| Predicted Positive | True Positive (TP)  | False Positive (FP) |
| Predicted Negative | False Negative (FN) | True Negative (TN)  |

In binary classification, we consider the F1 score a measure of the model's accuracy. It is the harmonic mean of the precision and recall scores:

F1 = 2TP / (2TP + FP + FN)

F1 scores range between 0 and 1, where 0 is the worst score and 1 is the best.
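Assuming scikit-learn, a quick check that the formula and the library agree (reusing the illustrative labels from the precision/recall sketch above):

```python
from sklearn.metrics import f1_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # TP = 3, FP = 1, FN = 1, TN = 3

tp, fp, fn = 3, 1, 1
print(2 * tp / (2 * tp + fp + fn))   # 0.75, by the formula
print(f1_score(y_true, y_pred))      # 0.75, from scikit-learn
```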