Question

(20 pts) For each of the following machine learning scenarios, several possible input variables are listed. For each variable, briefly justify (~1 sentence) whether you think it would be:

a. useful and practical (i.e. can reasonably be collected and you expect it would make the model more accurate)

b. useful but impractical (i.e. if it could be collected, it would improve the model, but it is impossible, or prohibitively expensive, to collect)

c. not useful (i.e. would not improve the model, even if it could be collected)

Scenarios:

1. Predicting a student's grade on the second midterm: grade on the first midterm; grade in CS2; amount of time studying; number of characters on the student's cheat sheet; amount of coffee consumed in the last week

2. Predicting whether it will rain in Rochester tomorrow: whether it rained in Rochester today; whether it rained in Cleveland today; all 100-square-meter areas in the US where it rained yesterday; the wind speed in Rochester yesterday; the day of the week; the month of the year

3. Predicting the score of RIT's next hockey game: the average heights of the players on RIT's team and the opponent's team; the temperature outside at game time; the opponent's win/loss record this season; the number of goals scored per minute of each player on both teams so far this season; the number of goals scored per kilometer skated by each player so far this season

4. Predicting whether you will like a restaurant: the opinions of the last hundred people to eat there; the number of stars in the Yelp review of the restaurant; the type of food; the number of insects in the kitchen

Explanation / Answer

1. grade on the first midterm - a: useful for assessing the student's ability in the same course, and therefore for predicting the second midterm score; grade in CS2 - b: partially helpful for predicting the grade, but the score in a single other subject cannot be relied on completely; amount of time studying - a: study time is a proxy for how well the student knows the material, so more study time is expected to go with a higher grade; number of characters on the student's cheat sheet - c: this is not expected to have any effect on the student's grade; amount of coffee consumed in the last week - c: coffee may keep a student awake at night, but that does not mean the student is spending that time studying.

2. whether it rained in Rochester today - a: weather tends to persist over short periods, so rain today says something about the near future; whether it rained in Cleveland today - a: clouds travel from one location to another, so rain in a nearby city can be taken into account; all 100-square-meter areas in the US where it rained yesterday - a: rainfall recorded at this resolution shows which areas are more likely to see rain than others; the wind speed in Rochester yesterday - a: wind speed affects the movement and direction of clouds; the day of the week - c: the day of the week has no effect on rain or weather conditions; the month of the year - a: rain is more likely during the rainy season than in other seasons.
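One way to make the "useful" versus "not useful" distinction in scenario 2 concrete is to compare how often it rains the next day for each value of a candidate variable. The sketch below is only an illustration: the function name rain_rate_by_value and the toy records are hypothetical and invented for this example, not real weather data.

```python
from collections import defaultdict

def rain_rate_by_value(records):
    """Return {feature value: fraction of records with rain the next day}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [rainy next days, total days]
    for value, rained_next_day in records:
        counts[value][0] += int(rained_next_day)
        counts[value][1] += 1
    return {value: rainy / total for value, (rainy, total) in counts.items()}

# Hypothetical toy records, made up for illustration only.
# Each pair is (feature value, did it rain the next day?).
rained_today = [(True, True), (True, True), (True, False), (False, False),
                (False, False), (False, True), (True, True), (False, False)]
day_of_week = [("Mon", True), ("Tue", True), ("Mon", False), ("Tue", False),
               ("Mon", False), ("Tue", True), ("Mon", True), ("Tue", False)]

print(rain_rate_by_value(rained_today))  # rates differ between True and False
print(rain_rate_by_value(day_of_week))   # rates are the same for Mon and Tue
```

In this made-up data the next-day rain rate is 0.75 after a rainy day and 0.25 after a dry one, while it is 0.5 for both weekdays, which is the pattern one would expect from an informative variable and an uninformative one respectively.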

3. the average heights of the players on RIT's team and the opponent's team - c: height does not matter in hockey; the temperature outside at game time - b: higher temperatures could tire the players sooner and reduce scoring, but if the game is played indoors the outside temperature is not directly related to the score; the opponent's win/loss record this season - a: the opponent's record indicates its strength and therefore the likely outcome of the game; the number of goals scored per minute of each player on both teams so far this season - a: if RIT's players score at a higher rate than the opponent's, RIT is more likely to win; the number of goals scored per kilometer skated by each player so far this season - a: goals scored relative to distance skated reflect a player's ability.

4. the opinions of the last hundred people to eat there - a: if their opinions are good, you are more likely to enjoy the restaurant as well; the number of stars in the Yelp review of the restaurant - a: a high star rating means the restaurant is liked by more people; the type of food - a: most people choose restaurants by the type of food they serve; the number of insects in the kitchen - a: insects in the kitchen suggest poor hygiene and lower food quality, which makes for a bad restaurant.