

Birla Institute of Technology & Science, Pilani
Work-Integrated Learning Programmes Division
First Semester 2017-2018

Mid-Semester Test
(EC-2 Regular/ Make-up)

Course No.        :  IS ZC464
Course Title      :  MACHINE LEARNING
Nature of Exam    :  Closed Book
Weightage         :  30%
Duration          :  2 Hours
Date of Exam      :  24/09/2017 (FN)
Note:
1.       Please follow all the Instructions to Candidates given on the cover page of the answer book.
2.       All parts of a question should be answered consecutively. Each answer should start from a fresh page. 
3.       Assumptions made, if any, should be stated clearly at the beginning of your answer.

Q.1.         Let the data D consist of just n coin flips, in which α1 is the number of heads and α0 is the number of tails [n = α1 + α0]. Assume that the flips are independent and identically distributed (i.i.d.).

Let X be a binary random variable which represents a coin flip:
X = 1, if the coin flips to heads
X = 0, if the coin flips to tails
Let θ refer to the true probability of heads (P(X = 1) = θ).

a)      Estimate θ by Maximum Likelihood Estimation (MLE).  [3]
b)      If a Beta distribution is used as the prior, show that the Maximum a Posteriori (MAP) estimate of θ is

        θ_MAP = (α1 + β1 − 1) / (α1 + α0 + β1 + β0 − 2)  [3]

Beta distribution:  P(θ) = (1 / B(β1, β0)) · θ^(β1 − 1) · (1 − θ)^(β0 − 1),
where B(β1, β0) is just a normalizing constant.
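
A minimal Python sketch of both estimators, assuming the Beta(β1, β0) prior defined above; the counts and hyperparameters below are illustrative, not taken from the question:

    # MLE and MAP estimates of theta for n i.i.d. coin flips.
    # alpha1/alpha0 (head/tail counts) and beta1/beta0 (Beta prior
    # hyperparameters) are illustrative values.

    def theta_mle(alpha1, alpha0):
        # MLE: the observed fraction of heads
        return alpha1 / (alpha1 + alpha0)

    def theta_map(alpha1, alpha0, beta1, beta0):
        # MAP under Beta(beta1, beta0): the prior acts like (beta1 - 1)
        # extra heads and (beta0 - 1) extra tails
        return (alpha1 + beta1 - 1) / (alpha1 + alpha0 + beta1 + beta0 - 2)

    a1, a0 = 7, 3                        # 7 heads, 3 tails (illustrative)
    b1, b0 = 2, 2                        # weak prior centred on theta = 0.5
    print(theta_mle(a1, a0))             # 0.7
    print(theta_map(a1, a0, b1, b0))     # 8/12 = 0.666...

Note that with a uniform prior (β1 = β0 = 1) the MAP estimate reduces to the MLE.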

Q.2.        Consider a training data set {(xn, tn)}, where n = 1, …, N.  A polynomial function of the form

        y(x, w) = w0 + w1·x + w2·x^2 + … + wM·x^M

is used to fit the data.

(a)    Write the sum-of-squared-errors function without using vector notation (see the sketch after this question).  [1]
(b)   What happens if N = 10 and M = 15? Discuss.  [1]
(c)    What do you understand by a linearly separable data set?  [1]
(d)   Explain the following terms in not more than three lines each:
(i)                 Good generalization.  [0.5]
(ii)               Hypothesis set.  [0.5]
(iii)             Supervised learning.  [0.5]
(iv)             Regularization.  [0.5]
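
For parts (a) and (b), a minimal sketch (using NumPy, with synthetic data) of the sum-of-squared-errors for a degree-M polynomial fit:

    import numpy as np

    def sse(w, x, t):
        # E(w) = sum_n (y(x_n, w) - t_n)^2, with w stored lowest degree first
        y = np.polyval(w[::-1], x)
        return np.sum((y - t) ** 2)

    rng = np.random.default_rng(0)
    N = 10
    x = np.linspace(0, 1, N)
    t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(N)   # synthetic targets

    for M in (3, 9):
        # least-squares fit of a degree-M polynomial (coefficients lowest first)
        w = np.polynomial.polynomial.polyfit(x, t, M)
        print(M, sse(w, x, t))

With M + 1 > N (e.g. M = 15 for N = 10) the fit has more coefficients than data points, so many weight vectors drive the training error to (numerically) zero: a classic overfitting regime.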
    
    





Q.3.        Consider the data set given in the following table:
Outlook     Temperature   Humidity   PlayTennis
Overcast    Cool          Normal     Yes
Overcast    Hot           High       Yes
Overcast    Hot           High       Yes
Sunny       Cool          Normal     Yes
Overcast    Cool          Normal     No
Sunny       Hot           High       No
Sunny       Hot           High       No
(a)    Estimate all the parameters of a Naïve Bayes classifier from the data set given in the above table (see the sketch after this question).  [3]
(b)   Using the estimated parameters, classify the following instance:
< Outlook = Sunny, Temperature = Hot, Humidity = Normal >  [1]
(c)    Discuss the difference between generative and discriminative classifiers. Give an example of each.  [2]
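
A minimal sketch of parts (a) and (b): relative-frequency (maximum-likelihood) estimates of the Naïve Bayes parameters from the table, then a score for each class on the given instance:

    from collections import Counter, defaultdict

    data = [
        ("Overcast", "Cool", "Normal", "Yes"),
        ("Overcast", "Hot",  "High",   "Yes"),
        ("Overcast", "Hot",  "High",   "Yes"),
        ("Sunny",    "Cool", "Normal", "Yes"),
        ("Overcast", "Cool", "Normal", "No"),
        ("Sunny",    "Hot",  "High",   "No"),
        ("Sunny",    "Hot",  "High",   "No"),
    ]

    prior = Counter(row[-1] for row in data)     # class counts
    cond = defaultdict(Counter)                  # (attribute, class) value counts
    for *features, label in data:
        for i, value in enumerate(features):
            cond[(i, label)][value] += 1

    def score(instance, label):
        # P(label) * prod_i P(x_i | label), relative-frequency estimates
        p = prior[label] / len(data)
        for i, value in enumerate(instance):
            p *= cond[(i, label)][value] / prior[label]
        return p

    x = ("Sunny", "Hot", "Normal")
    print({c: score(x, c) for c in prior})       # pick the class with the larger score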

Q.4.        Let there be three hypotheses h1, h2, h3 in the hypothesis space. Suppose that the posterior probabilities of the three hypotheses given the data set D are as follows:
P(h1|D) = 0.4,             P(h2|D) = 0.3,              P(h3|D) = 0.3 

Suppose a new instance x is encountered, which is classified positive by h1, but negative by h2 and h3.
(a)    Which hypothesis is the MAP hypothesis? Explain.  [1]
(b)   Classify the new instance x using the Bayes optimal classifier (see the sketch after this question).  [3]
(c)    Write the Gibbs algorithm.  [2]
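
For part (b), the Bayes optimal classifier sums the posterior mass of the hypotheses voting for each label; a minimal sketch with the values given above:

    posterior = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
    vote = {"h1": "+", "h2": "-", "h3": "-"}      # how each hypothesis labels x

    mass = {"+": 0.0, "-": 0.0}
    for h, p in posterior.items():
        mass[vote[h]] += p

    print(mass)                       # {'+': 0.4, '-': 0.6}
    print(max(mass, key=mass.get))    # Bayes optimal label for x

Note that the MAP hypothesis of part (a) and the Bayes optimal classifier can disagree, which is the point of the question.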
                                                                                                                    

Q.5.        Consider a collection S containing positive and negative examples of some target function. Assume it has two attributes, Humidity = {High, Normal} and Wind = {Weak, Strong}. What is the information gain of each attribute on the given data?
You may use the notation Gain(S, A) for the information gain of an attribute A.

More precisely, you need to calculate Gain(S, Humidity) = ?  and Gain(S, Wind) = ?
[3 + 3 = 6]
Which attribute is the best classifier?  [1]
 



           
[Figure: the collection S of training examples with their Humidity and Wind values, where + represents the positive examples and − represents the negative examples.]
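
Since the original figure is not reproduced here, the counts in this sketch are illustrative; only the entropy and information-gain computations carry over:

    from math import log2

    def entropy(pos, neg):
        # H(S) = -p+ log2 p+ - p- log2 p-, with 0 * log2(0) taken as 0
        total = pos + neg
        return -sum((k / total) * log2(k / total) for k in (pos, neg) if k)

    def gain(pos, neg, splits):
        # splits: one (pos_v, neg_v) pair per value v of the attribute
        total = pos + neg
        remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
        return entropy(pos, neg) - remainder

    # Illustrative: 9 positive / 5 negative overall, binary attribute
    print(gain(9, 5, [(6, 2), (3, 3)]))   # ~0.048 bits

The attribute with the larger gain is the better classifier at the root.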

************
