Machine Learning MCQ Questions with Answers

Question 1
Consider the problem of classifying whether a given fruit is a muskmelon (Label: 1) or a mango (Label: 0). The following table shows the weight values collected for 5 muskmelons and 5 mangoes. Which of the following threshold values solves this problem with zero classification error?

Weight (x)   Fruit label (y)
3.5          1
3            1
1            0
0.5          0
3.1          1
3.2          1
1.8          0
1.7          0
2.9          1
5            0

A) 2
B) 2.5
C) 3
D) Doesn't exist
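
One way to check each candidate is to apply a single-threshold rule to all ten samples and count the mistakes. The sketch below assumes the weights and labels pair up in the order listed in the table (3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5); the function name `errors_at` and the rule "predict muskmelon when x > threshold" are illustrative choices, not part of the question.

```python
# Weights and labels as read from the table (assumed pairing, in order).
weights = [3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5]
labels = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0]

def errors_at(threshold):
    """Count mistakes for the rule: predict 1 (muskmelon) if x > threshold."""
    predictions = [1 if x > threshold else 0 for x in weights]
    return sum(p != y for p, y in zip(predictions, labels))

# Try each threshold offered in the options.
for t in (2, 2.5, 3):
    print(t, errors_at(t))
```

Under this reading of the table, the mango weighing 5 lies above every muskmelon, so it is worth checking whether any single cut can avoid misclassifying it.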

Question 2
Assume that a student blindly (i.e., without looking for any specific genre) gathers tons of e-books from her friends over several months and stores them in a single directory. After learning about machine learning algorithms, she decides to group the books into multiple categories automatically, based on the contents of each book. Which of the following algorithms is most suitable for her?

A) Supervised classification algorithms
B) Unsupervised clustering algorithms
C) Supervised regression algorithms
D) No ML algorithm is suitable to solve this problem

Question
Common Data for the questions 3 to 7

A diabetic patient records his blood glucose level every morning before breakfast. He continued to do so for a year. The table below shows a few samples from the dataset, along with the food items he had on the previous nights. He wants to predict his daily glucose level using this data.

Idli (x₁)   Dosa (x₂)   Ice cream (x₃)   Slept by 11 pm (x₄)   Level (mg/dL)
1           1           1                1 (Yes)               113.4
0           4           0.5              1 (Yes)               106.1
1           2           0                -1 (No)               92.2

Question 3
Which of the following machine learning approaches is most suitable to satisfy his objective?
A) Classification
B) Regression
C) Clustering
D) Recommendation

Question 4
The dimension (d in R^d) of the feature space x is?
A) 1
B) 2
C) 3
D) 4

Question 5
The dimension (d in R^d) of the label space y is?
A) 1
B) 2
C) 3
D) 4

Question 6
Suppose the trained model f(x) is given as 5.8x₁ + 3.4x₂ + 20.9x₃ + 1.2x₄ + 79.8. One fine day, he had three idlis, two dosas and one ice cream, and continued working beyond 11 p.m. to submit the Machine Learning assignment. What will be his glucose level the next day as predicted by the model?
A) 120.1
B) 123.5
C) 125.9
D) 129.6
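
The prediction is a weighted sum of the four features plus the intercept, which can be evaluated with a few lines of Python. This is a minimal sketch: the names `coefficients`, `intercept` and `predict` are illustrative, and "worked beyond 11 p.m." is encoded as x₄ = -1, following the "-1 (No)" convention in the table.

```python
# Coefficients and intercept of the trained model given in the question.
coefficients = [5.8, 3.4, 20.9, 1.2]
intercept = 79.8

def predict(x):
    """Evaluate the linear model f(x) = w . x + b."""
    return sum(w * xi for w, xi in zip(coefficients, x)) + intercept

# Three idlis, two dosas, one ice cream, did not sleep by 11 pm (encoded as -1).
x_today = [3, 2, 1, -1]
print(round(predict(x_today), 1))
```

Direct evaluation gives about 123.7 mg/dL, so the result is best compared against the nearest listed option.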

Question 7
After noting the predicted value by the model, he decided to measure the glucose level using a Glucometer. The meter reads 120 mg/dL, then what is the squared error between the prediction and the actual value?
A) 5.5
B) 10.8
C) 12.5
D) 16.8
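
The squared error is simply the square of the difference between the predicted and the measured value. The helper below is a small sketch (the name `squared_error` is illustrative) that takes the prediction as an input rather than fixing it; the example call uses 123.5, option B of Question 6, purely as a sample prediction.

```python
def squared_error(prediction, actual):
    """Squared difference between a predicted and a measured value."""
    return (prediction - actual) ** 2

# 123.5 is used here only as an example prediction; 120 is the meter reading.
print(squared_error(123.5, 120))
```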

Question
Common data for the questions 8 and 9

Consider a function f(w) = 0.1w² + w - 1. Suppose that we use a gradient descent algorithm to find the minimum of the function iteratively.

Question 8
With the initial guess w = 10 and step size η = 0.5, what is the value of w after the third iteration?
A) 5.8
B) 7.6
C) 8.3
D) 9.5
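
The iterations can be traced by repeatedly applying the update w ← w - η f'(w), where f'(w) = 0.2w + 1 is the derivative of the given function. The sketch below assumes this plain (non-stochastic, fixed step size) form of gradient descent; the function names are illustrative.

```python
def gradient(w):
    """Derivative of f(w) = 0.1*w**2 + w - 1."""
    return 0.2 * w + 1

def gradient_descent(w, step_size, iterations):
    """Apply w <- w - step_size * f'(w) and record every iterate."""
    history = [w]
    for _ in range(iterations):
        w = w - step_size * gradient(w)
        history.append(w)
    return history

# Initial guess w = 10, step size 0.5, three iterations.
print(gradient_descent(10, 0.5, 3))
```

The iterates come out at roughly 8.5, 7.15 and 5.935, so the third value should be matched to the closest listed option.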

Question 9
How far is the current weight value (after 3 iterations) from the minimum of the function?
A) 8
B) 10
C) 12
D) 14
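
For a quadratic f(w) = aw² + bw + c with a > 0, the minimum lies where the derivative 2aw + b is zero, i.e. at w = -b / (2a). The sketch below uses that closed form with a = 0.1 and b = 1 from the given function; the helper names are illustrative, and the current weight is passed in as an argument (5.8 in the example call is just a sample input).

```python
A, B = 0.1, 1.0  # coefficients of w**2 and w in f(w) = 0.1*w**2 + w - 1

def minimizer():
    """Location of the minimum, from setting f'(w) = 2*A*w + B to zero."""
    return -B / (2 * A)

def distance_to_minimum(w):
    """Absolute distance between a weight value and the minimizer."""
    return abs(w - minimizer())

print(minimizer())
print(distance_to_minimum(5.8))
```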

Question 10
Consider the following statements
A) Neural networks can separate non-linear data points (i.e., data points which can't be separated by a line).
B) Neural networks use the back-propagation algorithm to iteratively update the weights (parameters) of the network.

Select the correct option about these statements
A) Only A is true
B) Only B is true
C) Both A and B are true
D) Neither A nor B is true

Question
Common Data for questions 11 to 13

The data points (x ∈ R) in the table below are to be grouped into 2 clusters, namely z₁ and z₂. Each data point is randomly assigned to a cluster as shown in the table. What is the mean (average) value of z₁ and z₂?

Weight (x)   Cluster (z)
3.5          z₁
3            z₂
1            z₂
0.5          z₂
3.1          z₁
3.2          z₂
1.8          z₂
1.7          z₂
2.9          z₁
5            z₂

Question 11
The table shows one way of grouping data points into 2 clusters. In how many ways can the samples be grouped into two clusters? (Note: A cluster can be empty, i.e., no samples assigned to it.)
A) 1000
B) 1024
C) 1200
D) 1250
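
Since each of the 10 samples independently receives one of two cluster labels, the assignments can be enumerated directly. The sketch below treats the two clusters as distinguishable (z₁ and z₂ are named) and allows empty clusters, as the note permits; the variable names are illustrative.

```python
from itertools import product

n_samples = 10

# Every sample gets one of two labels; enumerate all label sequences.
assignments = list(product(("z1", "z2"), repeat=n_samples))

print(len(assignments), 2 ** n_samples)
```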

Question 12
The mean value of z₁:
A) 1
B) 2
C) 3
D) 4

Question 13
The mean value of z₂:
A) 1
B) 2.2
C) 3
D) 4
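
The cluster means for Questions 12 and 13 can be computed by averaging the weights that share a cluster label. The sketch assumes the weights and cluster labels pair up in the order listed in the table (3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5); `cluster_mean` is an illustrative helper name.

```python
# Weights and cluster assignments as read from the table (assumed pairing).
weights = [3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5]
clusters = ["z1", "z2", "z2", "z2", "z1", "z2", "z2", "z2", "z1", "z2"]

def cluster_mean(name):
    """Average of all weights assigned to the named cluster."""
    members = [x for x, z in zip(weights, clusters) if z == name]
    return sum(members) / len(members)

print(round(cluster_mean("z1"), 2))
print(round(cluster_mean("z2"), 2))
```

The exact averages are not whole numbers under this reading, so each should be matched to the nearest listed option.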

Question 14
Which of the following algorithms helps reduce the dimension of data points from R^d to R^k, such that k << d?
A) K-means clustering
B) KNN classifier
C) Principal Component Analysis (PCA)
D) Perceptron

Question 15
Consider a column vector x of size 10 x 1. Then the product xx^T will be a?
A) scalar
B) vector
C) matrix
D) not possible
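
The shape of the product follows from the rule that an (m x n) matrix times an (n x p) matrix is (m x p). The sketch below checks this with plain Python lists; the vector entries 1 to 10 and the helpers `transpose` and `matmul` are illustrative, and x^T x is included only for contrast.

```python
# A 10 x 1 column vector stored as a list of 10 one-element rows.
x = [[float(i)] for i in range(1, 11)]

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Standard matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

outer = matmul(x, transpose(x))  # (10 x 1)(1 x 10)
inner = matmul(transpose(x), x)  # (1 x 10)(10 x 1)

print(len(outer), len(outer[0]))
print(len(inner), len(inner[0]))
```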

Question 16
Which of the following are binary classification problems? (More than one may be correct.)
A) Categorize whether a tweet on Twitter is violent or nonviolent
B) Predict whether an upcoming movie will be successful or not
C) Recognize whether a digit in an image is zero or nine
D) Given a speech waveform (i.e., audio), identify whether the speaker is male or female

Question 17

Consider the problem of classifying whether a given fruit is a Muskmelon (Label: 1) or a Mango (Label: 0). The following table shows the weight values collected for 5 Muskmelons and 5 Mangoes. Suppose that two K-NN classifiers, namely C₁ and C₂, are used for classification. Assume that k = 1 for classifier C₁ and k = 3 for classifier C₂. Further, both classifiers use the distance formula d = |x^t - x^i|, where i = 1, 2, 3, ..., 10. The classifiers then classify the given test point x^t = 2.5 as (O₁, O₂), where O₁ is the classification by C₁ and O₂ is the classification by C₂. What is (O₁, O₂)?

Weight (x)   Fruit label (y)
3.5          1
3            0
1            0
0.5          0
3.1          1
3.2          1
1.8          0
1.7          0
2.9          1
5            0

A) 1,1
B) 1,0
C) 0,1
D) 0,0
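
A K-NN prediction can be traced by sorting the training points by their distance to the test point and taking a majority vote among the first k. The sketch below assumes the weights and labels pair up in the order listed in the table (labels 1, 0, 0, 0, 1, 1, 0, 0, 1, 0); `knn_predict` is an illustrative helper, not a library function.

```python
from collections import Counter

# Weights and labels as read from the table for this question (assumed pairing).
weights = [3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5]
labels = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0]

def knn_predict(x_test, k):
    """Majority vote among the k training points closest to x_test."""
    # Sort training points by the distance d = |x_test - x_i|.
    neighbours = sorted(zip(weights, labels), key=lambda p: abs(x_test - p[0]))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

print(knn_predict(2.5, 1), knn_predict(2.5, 3))
```

For the test point 2.5, the three nearest weights are 2.9, 3 and 3.1, so O₁ depends only on the label of 2.9 and O₂ on the majority label among those three.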

Question 18
Suppose we use a decision tree classification algorithm on the data from Question 17, setting the threshold for the feature to x < 2.5 in the root node. What is the information gain after the first split (i.e., level 1 of the tree)? Let log₂(0) be defined as zero.
A) 0.41
B) 1.9
C) 2.3
D) 2.9
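
Information gain is the entropy of the labels before the split minus the size-weighted entropy of the two child nodes. The sketch below assumes the data is the table from Question 17 (this question does not restate it) and treats 0 · log₂(0) as 0, as instructed; the helper names are illustrative.

```python
from math import log2

# Table from Question 17 (assumed to be the data this question refers to).
weights = [3.5, 3, 1, 0.5, 3.1, 3.2, 1.8, 1.7, 2.9, 5]
labels = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0]

def entropy(ys):
    """Binary entropy in bits, skipping zero-probability terms."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)

def information_gain(threshold):
    """Entropy reduction from splitting the data on x < threshold."""
    left = [y for x, y in zip(weights, labels) if x < threshold]
    right = [y for x, y in zip(weights, labels) if x >= threshold]
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children

print(round(information_gain(2.5), 2))
```

With these labels the left branch (x < 2.5) contains only mangoes, so all of the remaining uncertainty sits in the right branch.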

Question 19
Consider the following statements
A. Discriminative models learn a probability mapping from feature x to label y.
B. Generative models learn a joint probability distribution of feature x and label y.
C. Discriminative models do not learn about the probability distribution from which the features are drawn.

Which of these statements are True?
A) A and B
B) only A
C) All the statements: A,B and C
D) B and C

Question 20
Given the learned weight vector w, select the statement that correctly relates the line that separates the data points (i.e., the decision line) to the weight vector.
A) The angle between w and decision line is not 90°
B) It is always parallel to w
C) It can be orientated in any direction with respect to w based on the data points
D) It is always perpendicular to w
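
The geometric relation can be checked numerically for a linear classifier whose decision line is w · x + b = 0: take two points on the line, form the direction vector between them, and compute its dot product with w. The weight vector and bias below are hypothetical values chosen for illustration, not taken from the question.

```python
# Hypothetical weight vector and bias for a 2-D decision line w . x + b = 0.
w = (2.0, 1.0)
b = -4.0

def point_on_line(x1):
    """Return the point (x1, x2) satisfying w[0]*x1 + w[1]*x2 + b = 0."""
    x2 = -(w[0] * x1 + b) / w[1]
    return (x1, x2)

p, q = point_on_line(0.0), point_on_line(3.0)
direction = (q[0] - p[0], q[1] - p[1])  # vector lying along the decision line
dot = w[0] * direction[0] + w[1] * direction[1]

print(direction, dot)
```

A dot product of zero means the two vectors are at a right angle; the same argument holds for any two points on the line because w · p + b = w · q + b = 0 implies w · (q - p) = 0.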

Question 21
The sigmoid function is a probabilistic function, and hence its output always lies in the interval (0, 1). The statement is
A) true
B) false
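
It can help to look at the function itself when judging the statement. The sketch below uses the standard logistic form σ(z) = 1 / (1 + e^(-z)), which is assumed here since the question does not write the formula out; its range follows from the formula alone, because e^(-z) is always positive.

```python
from math import exp

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1 / (1 + exp(-z))

# Evaluate at a few moderate inputs on both sides of zero.
for z in (-10, -1, 0, 1, 10):
    print(z, sigmoid(z))
```

The function is deterministic: the same input always gives the same output, and the bounds come from its algebraic form.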

Question 22
Suppose the Perceptron model is used for a binary classification problem. Suppose further that all the data points in the data set are correctly classified for some w. Then using the Perceptron learning algorithm to update w will no longer modify the elements of w. The statement is
A) Always False
B) Always True
C) Not always True
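
The behaviour can be explored with the usual mistake-driven perceptron rule, w ← w + y·x, applied only to misclassified points. The sketch below assumes that rule, labels in {-1, +1} and no bias term; the toy data and the starting vector `w_good` are hypothetical.

```python
def perceptron_epoch(w, data):
    """One pass of the perceptron rule; returns the new w and the update count."""
    w = list(w)
    updates = 0
    for x, y in data:
        activation = sum(wi * xi for wi, xi in zip(w, x))
        if y * activation <= 0:  # misclassified, or exactly on the boundary
            w = [wi + y * xi for wi, xi in zip(w, x)]
            updates += 1
    return w, updates

# Hypothetical, linearly separable toy data with labels in {-1, +1}.
data = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-3.0, -1.0), -1)]
w_good = [1.0, 1.0]  # classifies every point above correctly

print(perceptron_epoch(w_good, data))
```

Because the update fires only when y · (w · x) ≤ 0, a pass over data that w already classifies correctly performs no updates in this formulation.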

Question 23
Suppose that we are given a dataset for a binary classification problem with m data points. Half of the data points belong to the positive class (i.e., y = 1) and the rest of the data points belong to the negative class (i.e., y = -1). Somehow, we learned that the data points are not linearly separable. Then the statement "Using the perceptron algorithm to classify the data points with zero error is not possible" is?
A) true
B) false