Question
3. Suppose we run an online news aggregation platform which observes a person's habits of reading online news articles in order to recommend other articles the user might want to read. Suppose that we have characterised each online article by whether it mentions a celebrity, whether it features sports, whether it features politics, and whether it features technology. Suppose we are given the examples below about whether a specific user reads various online articles. We want to use this data set to train a model that predicts which article this person would like to read based on the mentioned features.

(a) Suppose you would like to use decision tree learning for this task. Apply the TDIDT algorithm to learn a decision tree. Stop extending a branch of the decision tree when all remaining training examples have the same target feature. Demonstrate how you compute the information gain for the features at each node. Draw the resulting decision tree. (7 marks)

(b) Suppose you would like to use naive Bayesian learning for this task. Apply the naive Bayesian learning algorithm to approximate P(Reads = true), P(X | Reads = true), and P(X | Reads = false).

Explanation / Answer
To calculate information gain:

Entropy = -p log2(p) - q log2(q), where p is the proportion of examples in one target class at a node and q = 1 - p.

Information gain = Entropy(parent) - weighted average entropy of the child nodes produced by the split.
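A minimal Python sketch of these two formulas (the names `entropy` and `information_gain` are ours for illustration, not from the question):

```python
from math import log2

def entropy(p: float) -> float:
    """Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p); by convention H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def information_gain(parent_entropy: float, branches: list[tuple[int, float]]) -> float:
    """Parent entropy minus the size-weighted average entropy of the child branches.

    branches: (branch size, branch entropy) pairs, one per feature value.
    """
    n = sum(size for size, _ in branches)
    weighted = sum(size / n * h for size, h in branches)
    return parent_entropy - weighted
```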
Steps to calculate entropy for a split:

Entropy of the parent node (a 6/13 vs 7/13 split of the target Reads):
-(6/13) log2(6/13) - (7/13) log2(7/13) = 0.995727
Entropy of each branch, weighted entropy, and information gain for each candidate root feature:

Feature      Entropy (TRUE)   Entropy (FALSE)   Weighted entropy   Information gain
Celebrity    0.863121         0.918296          0.888586           0.107141
Sports       1.000000         0.985228          0.992046           0.003681
Politics     0.918296         0.863121          0.888586           0.107141
Technology   0.970951         1.000000          0.988827           0.006900

Celebrity and Politics tie for the highest information gain (0.107141), so TDIDT can pick either as the root split; each branch is then extended recursively in the same way until all remaining examples share the same target value.
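As a cross-check, a short self-contained Python sketch that reproduces the table row by row. The branch sizes, and per-branch class counts such as 2/7, are inferred from the tabulated entropies and the 13-example total; since H(p) = H(1-p), the exact positive counts per branch are an assumption rather than something stated in the answer.

```python
from math import log2

def entropy(p: float) -> float:
    """Binary entropy H(p); by convention H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

parent = entropy(6 / 13)  # 0.995727, the parent-node entropy computed above

# (branch size, branch entropy) for the TRUE and FALSE values of each feature.
# Sizes and class counts are back-computed from the table's entropies (assumption).
splits = {
    "Celebrity":  [(7, entropy(2 / 7)), (6, entropy(2 / 6))],
    "Sports":     [(6, entropy(3 / 6)), (7, entropy(3 / 7))],
    "Politics":   [(6, entropy(2 / 6)), (7, entropy(2 / 7))],
    "Technology": [(5, entropy(2 / 5)), (8, entropy(4 / 8))],
}

for feature, branches in splits.items():
    n = sum(size for size, _ in branches)
    weighted = sum(size / n * h for size, h in branches)
    print(f"{feature:10s} weighted entropy = {weighted:.6f}  gain = {parent - weighted:.6f}")
```

Running this prints weighted entropies of 0.888586 (Celebrity and Politics), 0.992046 (Sports), and 0.988827 (Technology), matching the table.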