Mindblown: a blog about philosophy.
-
Q.2 How will you measure the Euclidean distance between the two arrays in numpy?
Ans. To measure the Euclidean distance between two arrays, we first initialize the arrays and then pass their difference to the linalg.norm() function provided by the numpy library. Here, numpy is imported as np.
a = np.array([1,2,3,4,5])
b = np.array([6,7,8,9,10])
# Solution
e_dist = np.linalg.norm(a-b)
e_dist
11.180339887498949
With data integrity, we can define…
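As a quick cross-check of the linalg.norm() approach, the sketch below computes the same distance a second way, straight from the definition (the square root of the summed squared differences). The variable names dist_norm and dist_manual are illustrative only.

```python
import math

import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

# Route 1: the norm of the difference vector, as in the answer above.
dist_norm = np.linalg.norm(a - b)

# Route 2: the definition written out, sqrt(sum((a_i - b_i)^2)).
dist_manual = math.sqrt(sum((x - y) ** 2 for x, y in zip(a.tolist(), b.tolist())))

print(dist_norm, dist_manual)
```

Both routes should agree up to floating-point rounding; linalg.norm() is simply the shorter way to write it.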
-
Python Data Science Interview
Q.1 What is a lambda expression in Python? Ans. A lambda expression creates an anonymous function. Unlike a conventional function defined with def, a lambda is limited to a single expression, which is why it usually fits on one line. The basic syntax of a lambda function is: lambda arguments: expression. An example of lambda function in Python data science is: x =…
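The excerpt's own example is cut off, so here is a small stand-in rather than the post's original. The names square, pairs, and by_count are made up for illustration.

```python
# A lambda bound to a name: takes x, returns x squared.
square = lambda x: x * x

# The more typical use: an inline key function passed to sorted().
pairs = [("b", 3), ("a", 1), ("c", 2)]
by_count = sorted(pairs, key=lambda pair: pair[1])

print(square(4))
print(by_count)
```

In practice a lambda is most useful in the second form, where defining a named function just to pass it once would be noise.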
-
Q.14 What is SVM? Can you name some kernels used in SVM?
SVM stands for support vector machine. SVMs are used for classification and regression tasks. An SVM finds a separating boundary that discriminates between the two classes of data points. This boundary is known as a hyperplane. Some of the kernels used in SVM are –
-
Q.13 Explain bias, variance tradeoff.
High bias leads to underfitting: the model is oversimplified, and that oversimplification introduces error. Variance, by contrast, arises from excessive complexity in the machine learning algorithm. With high variance, the model also learns noise and other distortions in the training data, which hurts its overall performance on new data. If you increase…
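One way to see the two failure modes side by side is to fit polynomials of increasing degree to the same noisy data. This is only a sketch under assumed settings: the sine target, the noise level, the sample sizes, and the degrees 1, 3, and 9 are all arbitrary choices, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)


def noisy_sine(n):
    """Draw n points from a sine curve with Gaussian noise added."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


x_train, y_train = noisy_sine(20)
x_test, y_test = noisy_sine(200)

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    # Store (training error, test error) for each model complexity.
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in errors.items():
    print(degree, round(train_err, 4), round(test_err, 4))
```

Training error can only go down as the degree grows, because each larger model contains the smaller ones. Test error is the number to watch: a straight line is too simple for a sine wave (bias), while a very flexible fit may start chasing the noise (variance). Which degree wins on test error depends on the data and the seed.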
-
Q.12 How will you create a series from a given list in Pandas?
We will pass the list to the Series() function: ser1 = pd.Series(mylist)
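A runnable version of that one-liner, with a made-up mylist and an optional custom index added for illustration:

```python
import pandas as pd

mylist = ["a", "b", "c"]  # hypothetical input list

# Default: pandas assigns an integer index 0..n-1.
ser1 = pd.Series(mylist)

# Optional: supply your own index labels (these labels are arbitrary).
ser2 = pd.Series(mylist, index=[10, 20, 30])

print(ser1)
print(ser2)
```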
-
Q.11 Why is Naive Bayes referred to as Naive?
Ans. Naive Bayes assumes that the features are independent of each other given the class, and computes its probabilities under that assumption. It is this assumption of feature independence, which rarely holds exactly in real data, that makes Naive Bayes "naive".
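The independence assumption is what lets the classifier multiply per-feature probabilities together. The toy below sketches that step; the spam/ham classes, the words, and every probability in it are invented numbers, not results from any dataset.

```python
# Hypothetical class priors and per-word likelihoods P(word | class).
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}


def posterior(words):
    """Return P(class | words), treating words as independent given the class."""
    scores = {}
    for label in prior:
        score = prior[label]
        for word in words:
            # The "naive" step: multiply as if the features did not interact.
            score *= likelihood[label][word]
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


result = posterior(["free", "meeting"])
print(result)
```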
-
Q.10 How is AUC different from ROC?
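AUC stands for area under the curve. It summarises a curve as a single number and most often refers to the area under the ROC curve, which plots the True Positive Rate against the False Positive Rate. So ROC is the curve, and AUC is a score computed from it. A precision-recall curve is a different plot, of precision against recall, where Precision = TP/(TP + FP) and Recall = TP/(TP + FN).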
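When AUC is taken as the area under the ROC curve, it has a handy equivalent reading: the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The sketch below computes it that way on four invented data points; roc_auc is an illustrative helper, not a library function.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]            # hypothetical ground truth
scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical model scores
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic, so this is for understanding rather than for large datasets.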
-
Q.9 Explain ROC curve.
The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different classification thresholds. We calculate the true positive rate as TPR = TP/(TP + FN). The false positive rate is determined as FPR = FP/(FP + TN), where TP = true positive, TN = true negative, FP = false positive,…
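To make the two formulas concrete, the sketch below sweeps a threshold over a handful of invented scores and records one (FPR, TPR) point per threshold. roc_points is a hypothetical helper, and treating "score >= threshold" as a positive prediction is an assumption.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold."""
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = fp = tn = fn = 0
        for y, s in zip(labels, scores):
            predicted_positive = s >= threshold
            if predicted_positive and y == 1:
                tp += 1
            elif predicted_positive and y == 0:
                fp += 1
            elif y == 1:
                fn += 1
            else:
                tn += 1
        # FPR = FP/(FP + TN), TPR = TP/(TP + FN)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


labels = [0, 0, 1, 1]            # hypothetical ground truth
scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical model scores
points = roc_points(labels, scores)
print(points)
```

Plotting these points with FPR on the x-axis and TPR on the y-axis gives the ROC curve for this tiny example.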
-
Q.8 How can you convert date-strings to timeseries in a series?
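Input: s = pd.Series(['02 Feb 2011', '02-02-2013', '20160104', '2011/01/04', '2014-12-05', '2010-06-06T12:05'])
To solve this, we will use the to_datetime() function: pd.to_datetime(s)
Note that on pandas 2.0 and later, strings in several different formats like these need pd.to_datetime(s, format="mixed").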
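Because the strings do not share one format, how pd.to_datetime(s) behaves depends on the pandas version: newer releases infer a single format and may reject the rest unless format="mixed" is passed. One version-neutral sketch is to parse each string on its own, as below; the name parsed is illustrative.

```python
import pandas as pd

s = pd.Series([
    "02 Feb 2011", "02-02-2013", "20160104",
    "2011/01/04", "2014-12-05", "2010-06-06T12:05",
])

# Parse element by element so one entry's format is not imposed on the others.
parsed = pd.Series([pd.to_datetime(value) for value in s])

print(parsed)
```

Ambiguous strings such as "02-02-2013" are read month-first by default; pass dayfirst=True if your data is day-first.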
-
Q.7 Can you stack two series horizontally? If so, how?
Yes, we can stack two series horizontally using the concat() function with axis=1: df = pd.concat([s1, s2], axis=1)
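A runnable version with two made-up series. The names "left" and "right" are arbitrary; when a series has a name, concat() uses it as the column label.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], name="left")         # hypothetical data
s2 = pd.Series(["a", "b", "c"], name="right")  # hypothetical data

# axis=1 places the series side by side as columns, aligned on their index.
df = pd.concat([s1, s2], axis=1)

print(df)
```

If the two series have different indexes, rows are matched by index label and the gaps are filled with NaN.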