Category: Interview Part 1

Q.14 What is SVM? Can you name some kernels used in SVM?

SVM stands for support vector machine. It is a supervised learning algorithm used for classification and regression tasks. An SVM finds a separating boundary, known as a hyperplane, that discriminates between two classes of data points, choosing the one with the largest margin to the nearest points of each class. Some of the kernels used in SVM are –
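
As an illustration only, here is a minimal sketch of three kernel functions commonly used with SVMs (linear, polynomial, and RBF), written in plain Python. The function names and the default values for degree, coef0, and gamma are assumptions made for this example; they are not taken from the original answer, whose list is cut off above.

```python
import math

def linear_kernel(x, y):
    # Plain dot product of the two vectors.
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # (x . y + coef0) ** degree; degree and coef0 are illustrative defaults.
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * squared Euclidean distance); gamma is an illustrative default.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, y))      # 11.0
print(polynomial_kernel(x, y))  # 144.0
print(rbf_kernel(x, y))         # exp(-4), roughly 0.018
```

Each function returns a similarity score for a pair of points; an SVM uses such scores in place of explicit coordinates in a higher-dimensional space.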

March 30, 2023
Q.13 Explain bias, variance tradeoff.

Bias is the error introduced by oversimplifying the model, and high bias leads to underfitting. Variance, in contrast, arises from excessive complexity in the machine learning model: a high-variance model also learns noise and other distortions in the training data, which hurts its performance on unseen data. If you increase…

March 30, 2023
Q.12 How will you create a series from a given list in Pandas?

We pass the list to the Series() function: ser1 = pd.Series(mylist)

March 30, 2023
Q.11 Why is Naive Bayes referred to as Naive?

In Naive Bayes, the features are assumed to be independent of each other given the class, and the probabilities are computed under that assumption. It is this assumption of feature independence, which rarely holds exactly in real data, that makes Naive Bayes “Naive”.
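
To make the independence assumption concrete, here is a small sketch in plain Python. The priors, likelihoods, and the names naive_score and classify are hypothetical, hand-picked for a toy spam example rather than estimated from data.

```python
# Hypothetical, hand-picked probabilities for a toy spam filter.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.5},
}

def naive_score(words, label):
    # The "naive" step: multiply per-word likelihoods as if the
    # words were independent of each other given the class.
    score = priors[label]
    for word in words:
        score *= likelihoods[label][word]
    return score

def classify(words):
    # Pick the class with the highest (unnormalized) score.
    return max(priors, key=lambda label: naive_score(words, label))

print(classify(["free"]))     # spam
print(classify(["meeting"]))  # ham
```

The product inside naive_score is only valid if the features do not influence each other, which is exactly the simplification the name refers to.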

March 30, 2023
Q.10 How is AUC different from ROC?

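AUC stands for area under the curve: it is a single number that summarizes a curve, most commonly the ROC curve, whereas ROC is the curve itself, plotting True Positive Rate against False Positive Rate. A related curve plots precision against recall, where Precision = TP/(TP + FP) and Recall = TP/(TP + FN).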
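
The precision and recall formulas can be checked with a short sketch. The counts below are hypothetical values chosen only for illustration.

```python
def precision(tp, fp):
    # Share of predicted positives that are actually positive.
    return tp / (tp + fp)

def recall(tp, fn):
    # Share of actual positives that the model found.
    return tp / (tp + fn)

tp, fp, fn = 8, 2, 4  # hypothetical confusion-matrix counts
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 8/12, roughly 0.667
```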

March 30, 2023
Q.9 Explain ROC curve.

The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different classification thresholds. We calculate the true positive rate as TPR = TP/(TP + FN) and the false positive rate as FPR = FP/(FP + TN), where TP = true positive, TN = true negative, FP = false positive,…
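
As a rough sketch of how these rates turn into a curve, the plain-Python code below sweeps a threshold over some scores, records one (FPR, TPR) point per threshold, and then estimates the area under those points with the trapezoidal rule. The labels, scores, and the helper names roc_points and auc are made up for this example, and the code assumes both classes are present.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per threshold, starting at (0, 0)."""
    pos = sum(labels)          # number of actual positives (label 1)
    neg = len(labels) - pos    # number of actual negatives (label 0)
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everything scoring at or
    # above the threshold is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    # Trapezoidal rule over consecutive (FPR, TPR) points.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

labels = [1, 1, 0, 1, 0, 0]              # hypothetical ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical model scores
points = roc_points(labels, scores)
print(points[-1])   # (1.0, 1.0)
print(auc(points))  # 8/9, roughly 0.889
```

In practice a library routine would normally be used for this; the sketch only shows where the TPR and FPR formulas enter.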

March 30, 2023
Q.8 How can you convert date-strings to timeseries in a series?

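Input: s = pd.Series(['02 Feb 2011', '02-02-2013', '20160104', '2011/01/04', '2014-12-05', '2010-06-06T12:05']) To solve this, we will use the to_datetime() function: pd.to_datetime(s). Because the strings use several different formats, pandas 2.0 and later may need pd.to_datetime(s, format='mixed') so that each element is parsed individually.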

March 30, 2023
Q.7 Can you stack two series horizontally? If so, how?

Yes, we can stack two series horizontally using the concat() function with axis=1: df = pd.concat([s1, s2], axis=1)
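
A small runnable version of the same idea is sketched below. The sample values and the series names "numbers" and "letters" are assumptions added for illustration.

```python
import pandas as pd

# Two series of equal length; the names become the column labels.
s1 = pd.Series([1, 2, 3], name="numbers")
s2 = pd.Series(["a", "b", "c"], name="letters")

# axis=1 places the series side by side as columns of a DataFrame.
df = pd.concat([s1, s2], axis=1)
print(df.shape)          # (3, 2)
print(list(df.columns))  # ['numbers', 'letters']
```

With the default axis=0, concat() would instead stack the two series vertically into one longer series.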

March 30, 2023
Q.6 How are KNN and K-means clustering different?

Firstly, KNN is a supervised learning algorithm, so training it requires labeled data. K-means is an unsupervised learning algorithm that looks for patterns that are intrinsic to the data. The K in KNN is the number of nearest data points considered when making a prediction. In contrast, the K in K-means specifies the number of…
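
To show the role of labels and of K on the KNN side, here is a minimal plain-Python sketch. The points, labels, and the function name knn_predict are hypothetical, and the code is a teaching aid rather than an efficient implementation.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    # Sort the labeled training points by distance to the query point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote among the k nearest neighbors.
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]  # KNN needs these labels
print(knn_predict(points, labels, (2, 2)))  # a
print(knn_predict(points, labels, (9, 9)))  # b
```

A K-means routine, by contrast, would receive only the points, with no labels at all.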

March 30, 2023
Q.5 How to find the positions of numbers that are multiples of 4 from a series?

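To find the multiples of 4, we will use the argwhere() function from NumPy. First, we will create a series of 10 numbers: s1 = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) Then we pass the boolean mask to argwhere(): np.argwhere((s1 % 4 == 0).to_numpy()) Output > [3], [7] Converting the mask to a NumPy array first avoids errors that can occur in some pandas versions when argwhere() is given a Series directly.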

March 30, 2023