Machine learning multiple-choice questions (MCQs) for SPPU and other university online exams, and for interviews, are collected below. This AI/ML set focuses primarily on hierarchical clustering, so it includes questions on clustering in general and on agglomerative clustering in particular.

These machine learning multiple-choice questions are useful when preparing for interviews with mass recruiters such as TCS, Capgemini, Infosys, and others. Happy learning. We also plan to provide a free download link for a machine learning MCQ PDF.

**Machine Learning mcq sppu**

**Q. 1 Which of the following can be used to create sub-samples using a maximum dissimilarity approach?**

A : minDissim

B : maxDissim

C : inmaxDissim

D : All of the Mentioned

maxDissim

**Q. 2 Movie Recommendation systems are an example of**

1. Classification

2. Clustering

3. Reinforcement Learning

4. Regression

A : 2

B : 2 and 3

C : 1 and 3

D : all of the mentioned

2 and 3

**Q. 3 Sentiment Analysis is an example of:**

1. Regression

2. Classification

3. Clustering

4. Reinforcement Learning

A : 1,2,4

B : 1

C : 1,2

D : 1,2,3

1,2,4

**Q. 4 Can decision trees be used for performing clustering?**

A : TRUE

B : FALSE

a

TRUE

**Q. 5 Which of the following is the most appropriate strategy for data cleaning before performing clustering analysis, given a less than desirable number of data points?**

1. Capping and flooring of variables

2. Removal of outliers

A : 1

B : 2

C : 1 and 2

D : none of the mentioned

1
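
Capping and flooring clamps extreme values to chosen bounds instead of dropping rows, which matters when data points are scarce. A minimal sketch follows; the function name `cap_and_floor`, the bounds, and the sample data are illustrative assumptions, not part of the question source.

```python
def cap_and_floor(values, lower, upper):
    """Clamp each value into [lower, upper] without removing any rows."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return [min(max(v, lower), upper) for v in values]


# Hypothetical data with one extreme value at each end.
data = [-50.0, 3.0, 4.0, 5.0, 6.0, 120.0]
clamped = cap_and_floor(data, lower=0.0, upper=10.0)
print(clamped)  # [0.0, 3.0, 4.0, 5.0, 6.0, 10.0]
```

All six rows survive, which is the point of preferring this over outlier removal on a small dataset. In practice the bounds are often taken from percentiles of the data rather than fixed by hand.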

**Machine Learning mcq questions and answers**

**Q. 6 What is the minimum no. of variables/ features required to perform clustering?**

A : 0

B : 1

C : 2

D : 3

1

**Q. 7 Are two runs of K-Means clustering expected to produce the same clustering results?**

A : TRUE

B : FALSE

b

FALSE

**Q. 8 Is it possible that the assignment of observations to clusters does not change between successive iterations of K-Means?**

A : yes

B : no

C : can't say

D : none of the mentioned

yes

**Q. 9 Which of the following can act as possible termination conditions in K-Means?**

1. For a fixed number of iterations.

2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.

3. Centroids do not change between successive iterations.

4. Terminate when RSS falls below a threshold.

A : 1, 3 and 4

B : 1, 2 and 3

C : 1, 2 and 4

D : All of the above

All of the above
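
The four termination conditions can be sketched in one loop. This is a minimal illustration on 1-D data, not a reference implementation; the function name `kmeans_1d`, its parameters, and the sample points are assumptions made for the example, and RSS is taken to mean the residual sum of squared distances to the assigned centroids.

```python
def kmeans_1d(points, centroids, max_iter=100, rss_threshold=0.0):
    """Lloyd's algorithm on 1-D data, returning (centroids, labels, reason)."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):  # condition 1: fixed number of iterations
        new_labels = [
            min(range(len(centroids)), key=lambda k: (p - centroids[k]) ** 2)
            for p in points
        ]
        if new_labels == labels:  # condition 2: assignments unchanged
            return centroids, labels, "assignments unchanged"
        labels = new_labels
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(
                sum(members) / len(members) if members else centroids[k]
            )
        rss = sum((p - new_centroids[l]) ** 2 for p, l in zip(points, labels))
        if new_centroids == centroids:  # condition 3: centroids unchanged
            return centroids, labels, "centroids unchanged"
        centroids = new_centroids
        if rss < rss_threshold:  # condition 4: RSS below a threshold
            return centroids, labels, "rss below threshold"
    return centroids, labels, "max iterations reached"


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
print(kmeans_1d(points, [0.0, 5.0]))
# ([2.0, 11.0], [0, 0, 0, 1, 1, 1], 'assignments unchanged')
```

Which condition fires first depends on the data and the settings: with `max_iter=1` the same call stops on the iteration limit, and with `rss_threshold=5.0` it stops on the RSS check.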

**Q. 10 Which of the following statement(s) correctly represent a real neuron?**

A : A neuron has a single input and a single output only.

B : A neuron has multiple inputs but a single output only

C : A neuron has a single input but multiple outputs

D : A neuron has multiple inputs and multiple outputs

E : All of the above statements are valid

e

All of the above statements are valid

**Q. 11 “Convolutional Neural Networks can perform various types of transformation (rotations or scaling) in an input”. Is this statement true or false?**

A : True

B : False

b

False

**Q. 12 If you increase the number of hidden layers in a Multi Layer Perceptron, the classification error of test data always decreases. True or False?**

A : True

B : False

b

False

**Q. 14 For an image recognition problem (recognizing a cat in a photo), which architecture of neural network would be better suited to solve the problem?**

A : Multi Layer Perceptron

B : Convolutional Neural Network

C : Recurrent Neural network

D : Perceptron

Convolutional Neural Network

**Q. 15 In which of the following applications can we use deep learning to solve the problem?**

A : Protein structure prediction

B : Prediction of chemical reactions

C : Detection of exotic particles

D : All of these

All of these

**Q. 16 Which of the following is (are) application(s) of deep learning?**

1. Video captioning

2. Visual question answering

3. Video summarization

4. All of the above

A : 1,2

B : 2,3

C : 1,3

D : all of these

all of these

**Q. 17 The number of nodes in the input layer is 10 and in the hidden layer is 6 (in an MLP). The maximum number of connections from the input layer to the hidden layer is**

A : 72

B : 60

C : More than 60

D : 120

60
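
The count follows from full connectivity: every input node connects to every hidden node, so the maximum is the product of the two layer sizes. A small sketch, where `dense_connections` and its optional `include_bias` flag are hypothetical helpers added for illustration:

```python
def dense_connections(n_in, n_out, include_bias=False):
    """Count weights between two fully connected layers."""
    weights = n_in * n_out
    # Bias terms are not node-to-node connections, so they are optional here.
    return weights + n_out if include_bias else weights


print(dense_connections(10, 6))  # 60
print(dense_connections(10, 5))  # 50
```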

**Q. 19 The goal of clustering a set of data is to**

A : divide them into groups of data that are near each other

B : choose the best data from the set

C : predict the class of data

D : determine the nearest neighbors of each of the data

divide them into groups of data that are near each other

**Q. 20 _____ is a clustering procedure where all objects start out in one giant cluster. Clusters are formed by dividing this cluster into smaller and smaller clusters.**

A : Non-hierarchical clustering

B : Divisive clustering

C : Agglomerative clustering

D : K-means clustering

Divisive clustering

**Q. 21 Which of the following is not a promise of artificial neural networks?**

A : It can explain result

B : It can survive the failure of some nodes

C : It has inherent parallelism

D : It can handle noise

It can explain result

**Q. 1 Which of the following gives non-linearity to a neural network?**

A : Stochastic Gradient Descent

B : Rectified Linear Unit

C : Convolution function

D : None of the above

Rectified Linear Unit
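
One way to see why ReLU counts as a non-linearity is to test additivity: a linear map satisfies f(a + b) = f(a) + f(b), while ReLU does not. The sketch below uses a hypothetical one-weight linear function and hand-picked inputs.

```python
def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def linear(x, w=2.0):
    """A linear map with a single illustrative weight."""
    return w * x


a, b = 3.0, -5.0
linear_additive = linear(a + b) == linear(a) + linear(b)  # -4.0 == 6.0 + -10.0
relu_additive = relu(a + b) == relu(a) + relu(b)          # 0.0 == 3.0 + 0.0
print(linear_additive, relu_additive)  # True False
```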

**Q. 2 You are building a neural network where it gets input from the previous layer as well as from itself. Which of the following architectures has feedback connections?**

A : Recurrent Neural network

B : Convolutional Neural Network

C : Restricted Boltzmann Machine

D : None of these

Recurrent Neural network

**Q. 3 Suppose a convolutional neural network is trained on ImageNet dataset (Object recognition dataset). This trained model is then given a completely white image as an input. The output probabilities for this input would be equal for all classes. True or False?**

A : TRUE

B : False

b

False

**Q. 4 Which of the following clustering algorithms suffer from the problem of convergence at local optima?**

1. K- Means clustering algorithm

2. Agglomerative clustering algorithm

3. Expectation-Maximization clustering algorithm

4. Diverse clustering algorithm

A : 1 only

B : 2 and 3

C : 2 and 4

D : 1 and 3

1 and 3

**Q. 5 Which of the following algorithms is most sensitive to outliers?**

A : K-means clustering algorithm

B : K-medians clustering algorithm

C : K-modes clustering algorithm

D : K-medoids clustering algorithm

K-means clustering algorithm
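
The sensitivity comes from the cluster centre being a mean, which a single extreme value can drag a long way, whereas a median barely moves. A minimal sketch with made-up numbers, comparing only the two centre statistics rather than the full algorithms:

```python
from statistics import mean, median

# Hypothetical 1-D cluster, then the same cluster with one outlier added.
cluster = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = cluster + [100.0]

mean_shift = mean(with_outlier) - mean(cluster)
median_shift = median(with_outlier) - median(cluster)
print(median_shift)  # 0.5
print(mean_shift > median_shift)  # True
```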

**Q. 6 How can Clustering (Unsupervised Learning) be used to improve the accuracy of Linear Regression model (Supervised Learning):**

1. Creating different models for different cluster groups.

2. Creating an input feature for cluster ids as an ordinal variable.

3. Creating an input feature for cluster centroids as a continuous variable.

4. Creating an input feature for cluster size as a continuous variable.

A : 1 and 2

B : 1 and 4

C : 2 and 4

D : All of the above

All of the above

**Q. 8 What are the factors to select the depth of neural network?**

1. Type of neural network (e.g. MLP, CNN etc.)

2. Input data

3. Computation power, i.e. hardware capabilities and software capabilities

4. Learning rate

5. The output function to map

A : 1, 2, 4, 5

B : 2, 3, 4, 5

C : 1, 3, 4, 5

D : All of these

All of these

**Q. 9 I am working with a fully connected architecture with one hidden layer of 3 neurons and one output neuron to solve a binary classification problem. The input and output are structured as follows. Input dataset: [ [1,0,1,0] , [1,0,1,1] , [0,1,0,1] ]. Output: [ [1] , [1] , [0] ]. To train the model, I have initialized all weights for the hidden and output layers to 1. Will the model be able to learn the pattern in the data?**

A : yes

B : no

b

no
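
The usual argument is symmetry: if every hidden neuron starts with the same weights, each one computes the same activation and receives the same gradient, so they never differentiate. The sketch below illustrates this on the data from the question. Sigmoid activations, squared-error loss, per-sample updates, the absence of biases, and the learning rate are all assumptions made for the example, since the question does not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


X = [[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]]
Y = [1, 1, 0]

n_hidden = 3
W1 = [[1.0] * 4 for _ in range(n_hidden)]  # hidden weights, all set to 1
W2 = [1.0] * n_hidden                      # output weights, all set to 1


def train_epoch(W1, W2, lr=0.5):
    """One epoch of per-sample gradient descent, updating W1 and W2 in place."""
    for x, y in zip(X, Y):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        out = sigmoid(sum(w * hj for w, hj in zip(W2, h)))
        d_out = (out - y) * out * (1.0 - out)
        d_hidden = [d_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
        for j in range(n_hidden):
            W2[j] -= lr * d_out * h[j]
            for i in range(4):
                W1[j][i] -= lr * d_hidden[j] * x[i]


for _ in range(50):
    train_epoch(W1, W2)

# The weights have moved, but all three hidden neurons are still identical.
print(W1[0] == W1[1] == W1[2])  # True
```

Under these assumptions the hidden layer behaves like a single neuron repeated three times, which is why random initialization is normally used to break the symmetry.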

**Q. 10 The k-means algorithm…**

A : always converges to a clustering that minimizes the mean-square vector-representative distance

B : can converge to different final clustering, depending on initial choice of representatives

C : is typically done by hand, using paper and pencil

D : should only be attempted by trained professionals

can converge to different final clustering, depending on initial choice of representatives
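
The dependence on the initial representatives can be shown with a tiny 1-D example. This is a sketch only: `lloyd_1d`, `rss`, the six data points, and both sets of starting centroids are hypothetical choices made to expose a poor local optimum.

```python
def lloyd_1d(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on 1-D data."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        updated = []
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            updated.append(sum(members) / len(members) if members else centroids[k])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


def rss(points, centroids, labels):
    """Residual sum of squares for a finished clustering."""
    return sum((p - centroids[l]) ** 2 for p, l in zip(points, labels))


points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
good = lloyd_1d(points, [0.0, 10.0, 20.0])
bad = lloyd_1d(points, [0.0, 1.0, 15.0])
print(good[1], rss(points, *good))  # [0, 0, 1, 1, 2, 2] 1.5
print(bad[1], rss(points, *bad))    # [0, 1, 2, 2, 2, 2] 101.0
```

Both runs converge, but the second one settles on a much worse clustering. Running the algorithm from several starting points and keeping the lowest-RSS result is a common mitigation.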

**Q. 11 Which of the following is required by K-means clustering?**

A : defined distance metric

B : number of clusters

C : initial guess as to cluster centroids

D : All of the above

All of the above

**Q. 12 For which of the following tasks might clustering be a suitable approach?**

1. Given sales data from a large number of products in a supermarket, estimate future sales for each of these products.

2. Given a database of information about your users, automatically group them into different market segments.

3. From the user’s usage patterns on a website, identify different user groups.

4. Given historical weather records, predict if tomorrow’s weather will be sunny or rainy.

A : 1,2

B : 2,3

C : 3,4

D : All of the above

2,3

**Q. 13 What is true about single linkage hierarchical clustering?**

A : we merge in each step the two clusters, whose two closest members have the smallest distance.

B : we merge in the members of the clusters in each step, which provide the smallest maximum pairwise distance.

C : the distance between two clusters is defined as the average distance between each point in one cluster to every point in the other cluster.

D : none of the above

we merge in each step the two clusters, whose two closest members have the smallest distance.

**Q. 14 What is true about complete linkage hierarchical clustering?**

A : we merge in each step the two clusters, whose two closest members have the smallest distance.

B : we merge in the members of the clusters in each step, which provide the smallest maximum pairwise distance.

C : the distance between two clusters is defined as the average distance between each point in one cluster to every point in the other cluster.

D : none of the above

we merge in the members of the clusters in each step, which provide the smallest maximum pairwise distance.

**Q. 15 What is true about average linkage hierarchical clustering?**

A : we merge in each step the two clusters, whose two closest members have the smallest distance.

B : we merge in the members of the clusters in each step, which provide the smallest maximum pairwise distance.

C : the distance between two clusters is defined as the average distance between each point in one cluster to every point in the other cluster.

D : none of the above

the distance between two clusters is defined as the average distance between each point in one cluster to every point in the other cluster.
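
The three linkage criteria from questions 13 to 15 differ only in how they reduce the pairwise distances between two clusters to a single number. A minimal sketch on 1-D points, with function names and sample clusters chosen for illustration:

```python
from itertools import product


def pairwise_distances(a, b):
    """All distances between a point in cluster a and a point in cluster b."""
    return [abs(x - y) for x, y in product(a, b)]


def single_linkage(a, b):
    return min(pairwise_distances(a, b))  # closest pair of members


def complete_linkage(a, b):
    return max(pairwise_distances(a, b))  # farthest pair of members


def average_linkage(a, b):
    d = pairwise_distances(a, b)
    return sum(d) / len(d)  # mean over every cross-cluster pair


a, b = [1.0, 2.0], [5.0, 9.0]
print(single_linkage(a, b), complete_linkage(a, b), average_linkage(a, b))
# 3.0 8.0 5.5
```

At each step, a hierarchical algorithm merges the pair of clusters for which the chosen criterion is smallest.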

**Q. 16 Divisive clustering is where**

A : the initial state is a single cluster with all samples and the process proceeds by splitting the intermediate cluster until all elements are separated

B : process starts from the bottom (each initial cluster is made up of a single element) and proceeds by merging the clusters until a stop criterion is reached.

C : requires prior knowledge of no. of clusters you want to divide your data into.

D : None of the above

the initial state is a single cluster with all samples and the process proceeds by splitting the intermediate cluster until all elements are separated

**Q. 17 Agglomerative clustering is**

A : the initial state is a single cluster with all samples and the process proceeds by splitting the intermediate cluster until all elements are separated

B : process starts from the bottom (each initial cluster is made up of a single element) and proceeds by merging the clusters until a stop criterion is reached.

C : requires prior knowledge of no. of clusters you want to divide your data into.

D : None of the above

process starts from the bottom (each initial cluster is made up of a single element) and proceeds by merging the clusters until a stop criterion is reached.
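
The bottom-up process can be sketched in a few lines. The example assumes single linkage and uses a target cluster count as the stop criterion; both are choices made for the illustration, as are the function name `agglomerate` and the sample points.

```python
def agglomerate(points, n_clusters):
    """Bottom-up single-linkage clustering of 1-D points."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[p] for p in points]   # start: every point is its own cluster
    while len(clusters) > n_clusters:  # stop criterion: target cluster count
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


print(agglomerate([1.0, 2.0, 6.0, 7.0, 20.0], 3))
# [[1.0, 2.0], [6.0, 7.0], [20.0]]
```

Recording the order and distance of each merge, rather than stopping at a fixed count, is what yields the dendrogram.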

**Q. 18 Frequent visitors to some book sites often see lists of suggested titles based on their previous purchases at the site. Websites making book recommendations may be using all the following algorithms EXCEPT**

A : collaborative filtering

B : instrumental filtering

C : content-based filtering

D : rule-based filtering

instrumental filtering

**Q. 19 Which of the following is a valid type of recommendation?**

A : expert filtering

B : smart filtering

C : reliable filtering

D : collaborative filtering

collaborative filtering

**Q. 1 In a neural network, knowing the weight and bias of each neuron is the most important step. If you can somehow get the correct value of weight and bias for each neuron, you can approximate any function. What would be the best way to approach this?**

A : Assign random values and pray to God they are correct

B : Search every possible combination of weights and biases till you get the best value

C : Iteratively check, after assigning a value, how far you are from the best values, and slightly change the assigned values to make them better

D : None of these

Iteratively check, after assigning a value, how far you are from the best values, and slightly change the assigned values to make them better

**Q. 2 What is the sequence of the following tasks in a perceptron?**

1. Initialize weights of perceptron randomly

2. Go to the next batch of dataset

3. If the prediction does not match the output, change the weights

4. For a sample input, compute an output

A : 1, 2, 3, 4

B : 4, 3, 2, 1

C : 3, 1, 2, 4

D : 1, 4, 3, 2

1, 4, 3, 2
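
The order 1, 4, 3, 2 maps directly onto a training loop. The sketch below is one possible rendering using the classic perceptron update rule; the function names, the seeded random generator, the learning rate, and the AND-gate data are assumptions for illustration.

```python
import random


def predict(weights, bias, x):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0, seed=0):
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    # Step 1: initialize the weights randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:  # Step 2: go to the next sample
            output = predict(weights, bias, x)  # Step 4: compute an output
            if output != target:  # Step 3: on a mismatch, change the weights
                error = target - output
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, bias


# Hypothetical toy data: logical AND, which is linearly separable.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(predictions)
```

Here each "batch" is a single sample. Because AND is linearly separable, the loop stops once a full pass produces no mistakes.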

**Q. 5 In which neural net architecture does weight sharing occur?**

A : Convolutional neural Network

B : Recurrent Neural Network

C : Fully Connected Neural Network

D : Both A and B

Both A and B

**Q. 6 What could be the possible reason(s) for producing two different dendrograms using agglomerative clustering algorithm for the same dataset?**

A : Proximity function used

B : Number of data points used

C : Number of variables used

D : all of the above

all of the above

**Q. 8 Which of the following statements is true when you use 1×1 convolutions in a CNN?**

A : It can help in dimensionality reduction

B : It can be used for feature pooling

C : It suffers less overfitting due to small kernel size

D : All of the above

All of the above

**Q. 9 The number of nodes in the input layer is 10 and in the hidden layer is 5. The maximum number of connections from the input layer to the hidden layer is**

A : 50

B : less than 50

C : More than 50

D : It is an arbitrary value

50

**Q. 10 Which of the following functions can be used as an activation function in the output layer if we wish to predict the probabilities of n classes (p1, p2, ..., pn) such that the sum of p over all n classes equals 1?**

A : Softmax

B : ReLu

C : Sigmoid

D : Tanh

Softmax
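
Softmax exponentiates each output score and divides by the total, which guarantees positive values that sum to 1. A minimal sketch; the max-subtraction step is a common numerical-stability trick added here, and the input scores are made up.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(abs(sum(probs) - 1.0) < 1e-12)  # True
```

Sigmoid, by contrast, squashes each output independently, so its outputs are not constrained to sum to 1 across classes.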

**Q. 11 Which of the following statements are true?**

1. Graphs, time-series data, text, and multimedia data are all examples of data types on which cluster analysis can be performed.

2. Agglomerative clustering is an example of a hierarchical and distance-based clustering method.

3. When dealing with high-dimensional data, we sometimes consider only a subset of the dimensions when performing cluster analysis.

4. We can only visualize the clustering results when the data is 2-dimensional.

A : 1,2,3

B : 2,3,4

C : 1,2,4

D : 1,2,3,4

1,2,3

**Q. 12 Which of the following is finally produced by Hierarchical Clustering?**

A : final estimate of cluster centroids

B : tree showing how close things are to each other

C : assignment of each point to clusters

D : all of the mentioned

tree showing how close things are to each other

**Q. 13 Which of the following clustering types requires a merging approach?**

A : Partitional

B : Hierarchical

C : Naive Bayes

D : None of the mentioned

Hierarchical

**Q. 15 In which of the following cases will K-Means clustering fail to give good results?**

1. Data points with outliers

2. Data points with different densities

3. Data points with round shapes

4. Data points with non-convex shapes

A : 1 and 2

B : 2 and 3

C : 2 and 4

D : 1, 2 and 4

1, 2 and 4

**Q. 16 How can Clustering (Unsupervised Learning) be used to improve the accuracy of Linear Regression model (Supervised Learning):**

1. Creating different models for different cluster groups.

2. Creating an input feature for cluster ids as an ordinal variable.

3. Creating an input feature for cluster centroids as a continuous variable.

4. Creating an input feature for cluster size as a continuous variable.

A : 1 and 4

B : 3 only

C : 2 and 4

D : All of the above

All of the above

**Q. 17 Why is an RNN (Recurrent Neural Network) used for machine translation, say translating English to French? (Check all that apply.)**

a. It can be trained as a supervised learning problem.

b. It is strictly more powerful than a Convolutional Neural Network (CNN).

c. It is applicable when the input/output is a sequence (e.g., a sequence of words).

d. RNNs represent the recurrent process of Idea->Code->Experiment->Idea->….

A : a and c

B : a,b and c

C : a and d

D : a,b and d

a and c

**Q. 18 Which of the following is not an application of an autoencoder?**

A : predicting next word in a sentence

B : detecting anomalies in a signal

C : removing noise from an image, audio or scanned document

D : lowering the dimensions for better visualizations

predicting next word in a sentence

**Q. 19 Which of the following is true about Naive Bayes?**

A : Assumes that all the features in a dataset are equally important

B : Assumes that all the features in a dataset are independent

C : Both A and B

D : None of the above options

Both A and B

**Q. 20 In which of the following cases will K-means clustering fail to give good results?**

1. Data points with outliers

2. Data points with different densities

3. Data points with nonconvex shapes

A : 1 and 2

B : 2 and 3

C : 1, 2, and 3

D : 1 and 3

1, 2, and 3

**Q. 21 Bias is**

A : A class of learning algorithm that tries to find an optimum classification of a set of examples using the probabilistic theory

B : Any mechanism employed by a learning system to constrain the search space of a hypothesis

C : An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation

D : None of the above

Any mechanism employed by a learning system to constrain the search space of a hypothesis