
Do in R - Find predicted value using analysis of this data set


Question

Do in R - Find predicted value using analysis of this data set

Please do this in R.

Consumer Research is an independent agency that conducts research on consumer attitudes and behaviors for a variety of firms. In this study, a client asked for an investigation of the consumer characteristics that can be used to predict the amount charged by credit card users. Data from 2002 were collected on annual income (in $1000s), household size, and annual credit card charges for a sample of consumers.

Predict the annual credit card charge for a 3-person household with an annual income of $50,000. Also find the confidence interval and the prediction interval, if you can.

Consumer data set

data Consumer;
input income housesize amtcharged;
datalines;
55.00   3.00    4116.00
31.00   2.00    3159.00
32.00   4.00    5100.00
51.00   5.00    4742.00
31.00   2.00    1864.00
55.00   2.00    4070.00
37.00   1.00    2731.00
40.00   2.00    3348.00
66.00   4.00    4764.00
51.00   3.00    4110.00
25.00   3.00    4208.00
48.00   4.00    4219.00
27.00   1.00    2477.00
33.00   2.00    2514.00
65.00   3.00    4214.00
63.00   4.00    4965.00
42.00   6.00    4412.00
21.00   2.00    2448.00
44.00   1.00    2995.00
37.00   5.00    4171.00
62.00   6.00    5678.00
21.00   3.00    3623.00
55.00   7.00    5301.00
42.00   2.00    3020.00
41.00   7.00    4828.00
54.00   6.00    5573.00
30.00   1.00    2583.00
48.00   2.00    3866.00
34.00   5.00    3586.00
67.00   4.00    5037.00
50.00   2.00    3605.00
67.00   5.00    5345.00
55.00   6.00    5370.00
52.00   2.00    3890.00
62.00   3.00    4705.00
64.00   2.00    4157.00
22.00   3.00    3579.00
29.00   4.00    3890.00
39.00   2.00    2972.00
35.00   1.00    3121.00
39.00   4.00    4183.00
54.00   3.00    3730.00
23.00   6.00    4127.00
27.00   2.00    2921.00
26.00   7.00    4603.00
61.00   2.00    4273.00
30.00   2.00    3067.00
22.00   4.00    3074.00
46.00   5.00    4820.00
66.00   4.00    5149.00
60.00   4.00    5002.00
32.00   3.00    3100.00
;
proc print;
run;

Explanation / Answer

# Multiple Linear Regression

# Importing the dataset
# (assumes chegg.csv holds the question's data with columns income, housesize, amtcharged)
dataset = read.csv('chegg.csv')
dataset$income = dataset$income * 1000  # income is recorded in thousands of dollars

summary(dataset)

# Splitting the dataset into the Training set and Test set
# install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$amtcharged, SplitRatio = 0.8)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)

# Feature Scaling is skipped: lm() gives the same predictions with or without it,
# and leaving amtcharged unscaled keeps every prediction in dollars.

# Fitting Multiple Linear Regression to the Training set
regressor = lm(formula = amtcharged ~ .,
               data = training_set)

# Predicting the Test set results
y_pred = predict(regressor, newdata = test_set[-3])

plot(y_pred, test_set$amtcharged)

# Predict the annual credit card charge for a 3-person household
# with an annual income of $50,000, along with the confidence interval
# and the prediction interval.
newdata = data.frame(income = 50000, housesize = 3)
predVal = predict(regressor, newdata = newdata)
# 95% confidence interval for the mean charge of such households
confInt = predict(regressor, newdata = newdata, interval = 'confidence')
# 95% prediction interval for a single such household
predInt = predict(regressor, newdata = newdata, interval = 'prediction')
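For anyone without the chegg.csv file, the same question can be worked directly from the datalines in the question. The following is only a sketch under a few assumptions: the object names (consumer, fit, new_household, conf_int, pred_int) are illustrative, the model is fitted on all 52 rows rather than on an 80% training split, and income is left in thousands of dollars, so $50,000 is entered as 50.

```r
# Sketch only: object names are illustrative, not taken from the original answer.
# The rows are copied from the question's datalines; income stays in $1000s.
consumer <- read.table(header = TRUE, text = "
income housesize amtcharged
55 3 4116
31 2 3159
32 4 5100
51 5 4742
31 2 1864
55 2 4070
37 1 2731
40 2 3348
66 4 4764
51 3 4110
25 3 4208
48 4 4219
27 1 2477
33 2 2514
65 3 4214
63 4 4965
42 6 4412
21 2 2448
44 1 2995
37 5 4171
62 6 5678
21 3 3623
55 7 5301
42 2 3020
41 7 4828
54 6 5573
30 1 2583
48 2 3866
34 5 3586
67 4 5037
50 2 3605
67 5 5345
55 6 5370
52 2 3890
62 3 4705
64 2 4157
22 3 3579
29 4 3890
39 2 2972
35 1 3121
39 4 4183
54 3 3730
23 6 4127
27 2 2921
26 7 4603
61 2 4273
30 2 3067
22 4 3074
46 5 4820
66 4 5149
60 4 5002
32 3 3100
")

# Fit amtcharged on both predictors using all 52 observations.
fit <- lm(amtcharged ~ income + housesize, data = consumer)

# A 3-person household with an annual income of $50,000 (income = 50).
new_household <- data.frame(income = 50, housesize = 3)

# 95% confidence interval for the mean charge of such households.
conf_int <- predict(fit, newdata = new_household,
                    interval = "confidence", level = 0.95)

# 95% prediction interval for the charge of one individual household.
pred_int <- predict(fit, newdata = new_household,
                    interval = "prediction", level = 0.95)

print(conf_int)  # columns: fit, lwr, upr
print(pred_int)  # same fit, wider lwr/upr
```

Both calls return the same point prediction in the fit column. The prediction interval is always the wider of the two, because it adds the variability of a single household's charge to the uncertainty in the estimated mean.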
