Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

. DastlBe tlE purposé of logistic regression as opposed to linear regression. is

ID: 2924650 • Letter: #

Question

. DastlBe tlE purposé of logistic regression as opposed to linear regression. is responsible for controlling the production output to 3The Quality manager for ABC Corporation reduce the chance that defective products are sent to customers. The manager is of automated machines that will scan each finished item and mark each as acceptable or defective. Previous test runs with the machines vs manual examination have given results shown in the following Classification Confusion Matrix. considering the purchase Predicted (according to machine) Actual (from manual exam) Acceptable Defective Acceptable 1543 522 Defective 98 892 a. Calculate the overall error rate, the Class 0 error rate, and the Class 1 error rate errof b. Describe the potential costs of a Class 0 error vs a Class 1 error for this scenario. c. Which type of error would be more harmful to the company? rares class Class 4. Using the Training Set Red or Blue that is posted with this assignment perform a k-3 Nearest Neighbor classification for the following new observations: Voter X Income 1 15, Education-12 Voter Y Income = 185, Education-18 The values of variables Income and Education for each observation in the training set and for Create a scatterplot of Normalized Income vs Normalized Education for the training set data. Determine the 3 points on your scatterplot that would be nearest to the new observation (should According to the Vote Classification (Red or Blue) of the nearest neighbors, determine what The following steps must be performed (watch the podcast): each new observation must be normalized (calculate the z-scores for each- see pg 50) be able to do this visually.) classification the new observation would be. Use a cutoff of 0.5, which means if the nearest neighbors m not all the same class, the majority wins. See example explanation on pg 455

Explanation / Answer

a) overall error rate is calculates as

total misclassifications / Total obervations

misclassifications are 98+522

total observations are 98+522+1543+892

so error rate = (98+522)/(98+522+1543+892 ) = 0.209 = 21% approx

class 0 error rate is

how many acceptable were classified as defective

98/(1543+98) = 0.059 ~ 6% approc

class 1 error rate is

how many defective were classified as acceptable

522/(522+892) = 0.369 ~ 37% approx

b)

cost matrix can be creating missclassification costs for wrong assignments

eg : if a acceptable product is classified as defective , it would cost must to the company , so lets keep it one

however , if a defective product is classified as acceptable , it would tarnish the image of the company and would impact continued business, so more weightage should be given such as 5.

c) classifying defective as acceptable would be more harmful for the company as this generates bad quality name for the company. when a product is defective it should be predicted and ruled out from the delivery process to ensure that the number of defetive items shipped to customers is less