You have been given data on Electricity usage. You have data on the bill and a n
ID: 3061610 • Letter: Y
Question
You have been given data on Electricity usage. You have data on the bill and a number of pieces of data thought to explain/predict the bill. I want you to tell me what I need to know in order to make a good explanation/prediction of electricity bills. Using the data set “Electric Bill Data”
1. Perform a multiple regression analysis without the Meter, Heating, Cooling dummy variables. Use the 5% level of significance.
2. Fully explain the results. This means discuss/interpret each term and its significance, explain what the coefficients mean, the R-square, and F-value.
3.Check for autocorrelation.
4. Check for multicollinearity.
Time Year Month Bill $ Temp Size New EMeter NewHeat NewCool 1 1991 Jan 162.1 29.1 4 0 0 0 2 1991 Feb 256.9 31.5 4 0 0 0 3 1991 Mar 151.15 41.9 4 0 0 0 4 1991 Apr 118.76 53.4 4 0 0 0 5 1991 May 100.71 63.7 4 0 0 0 6 1991 Jun 83.97 72.9 4 0 0 0 7 1991 Jul 99.4 76.8 4 0 0 0 8 1991 Aug 103.64 75 4 0 0 0 9 1991 Sep 71.59 68.4 4 0 0 0 10 1991 Oct 94.92 56.8 4 0 0 0 11 1991 Nov 78.83 43.3 4 0 0 0 12 1991 Dec 182.85 32.5 4 0 0 0 13 1992 Jan 165.23 29.1 4 0 0 0 14 1992 Feb 197.94 31.5 4 0 0 0 15 1992 Mar 146.8 41.9 4 0 0 0 16 1992 Apr 127.17 53.4 4 0 0 0 17 1992 May 112.4 63.7 4 0 0 0 18 1992 Jun 70.35 72.9 4 0 0 0 19 1992 Jul 72.98 76.8 4 0 0 0 20 1992 Aug 82.8 75 4 0 0 0 21 1992 Sep 70.33 68.4 4 0 0 0 22 1992 Oct 76.13 56.8 4 0 0 0 23 1992 Nov 75.54 43.3 4 0 0 0 24 1993 Dec 130.32 32.5 4 0 0 0 25 1993 Jan 180.08 29.1 4 0 0 0 26 1993 Feb 222.99 31.5 4 0 0 0 27 1993 Mar 214.45 41.9 4 0 0 0 28 1993 Apr 160.98 53.4 4 0 0 0 29 1993 May 99.34 63.7 4 0 0 0 30 1993 Jun 75.31 72.9 4 0 0 0 31 1993 Jul 134.99 76.8 4 0 0 0 32 1993 Aug 100.22 75 4 0 0 0 33 1993 Sep 90.79 68.4 4 0 0 0 34 1993 Oct 64.62 56.8 4 0 0 0 35 1993 Nov 72.79 43.3 4 0 0 0 36 1993 Dec 135.56 32.5 4 0 0 0 37 1994 Jan 106.63 29.1 4 0 0 0 38 1994 Feb 158.59 31.5 4 0 0 0 39 1994 Mar 183.06 41.9 4 0 0 0 40 1994 Apr 115.95 53.4 4 0 0 0 41 1994 May 82.27 63.7 4 0 0 0 42 1994 Jun 69.33 72.9 4 0 0 0 43 1994 Jul 91.38 76.8 4 0 0 0 44 1994 Aug 80.28 75 4 0 0 0 45 1994 Sep 70 68.4 4 0 0 0 46 1994 Oct 56.47 56.8 4 0 0 0 47 1994 Nov 56.47 43.3 4 0 0 0 48 1994 Dec 71.63 32.5 4 0 0 0 49 1995 Jan 129.06 29.1 4 0 0 0 50 1995 Feb 154.32 31.5 4 0 0 0 51 1995 Mar 174.8 41.9 4 0 0 0 52 1995 Apr 113.25 53.4 4 0 0 0 53 1995 May 94.23 63.7 4 0 0 0 54 1995 Jun 69.79 72.9 4 0 0 0 55 1995 Jul 89.06 76.8 4 0 0 0 56 1995 Aug 96.29 75 4 0 0 0 57 1995 Sep 107.3 68.4 4 0 0 0 58 1995 Oct 77.9 56.8 4 0 0 0 59 1995 Nov 74.67 43.3 4 0 0 0 60 1995 Dec 150.11 32.5 4 0 0 0 61 1996 Jan 218.67 29.1 4 0 0 0 62 1996 Feb 256.02 31.5 4 0 0 0 63 1996 Mar 222.21 41.9 4 0 0 0 64 1996 Apr 189.44 53.4 4 0 0 0 65 1996 May 107.82 63.7 4 0 0 0 66 1996 Jun 77.76 72.9 4 0 0 0 67 1996 Jul 70.31 76.8 4 0 0 0 68 1996 Aug 80.85 75 4 0 0 0 69 1996 Sep 56.05 68.4 4 0 0 0 70 1996 Oct 49.53 56.8 4 0 0 0 71 1996 Nov 54.01 43.3 4 0 0 0 72 1996 Dec 107.94 32.5 4 0 0 0 73 1997 Jan 155.99 29.1 4 0 1 0 74 1997 Feb 198.82 31.5 4 0 1 0 75 1997 Mar 106.51 41.9 4 0 1 0 76 1997 Apr 81.45 53.4 4 0 1 0 77 1997 May 75.95 63.7 4 0 1 0 78 1997 Jun 60.83 72.9 4 0 1 0 79 1997 Jul 58.2 76.8 4 0 1 0 80 1997 Aug 65.61 75 4 0 1 0 81 1997 Sep 57.78 68.4 3 0 1 0 82 1997 Oct 49.4 56.8 3 0 1 0 83 1997 Nov 52.86 43.3 3 0 1 0 84 1997 Dec 78.13 32.5 3 0 1 0 85 1998 Jan 100.43 29.1 3 0 1 0 86 1998 Feb 102.64 31.5 3 0 1 0 87 1998 Mar 66.77 41.9 3 0 1 0 88 1998 Apr 65.68 53.4 3 0 1 0 89 1998 May 37.03 63.7 4 0 1 0 90 1998 Jun 37.88 72.9 4 0 1 0 91 1998 Jul 42.44 76.8 4 0 1 0 92 1998 Aug 46.18 75 4 0 1 0 93 1998 Sep 42.87 68.4 3 0 1 0 94 1998 Oct 41.99 56.8 3 0 1 1 95 1998 Nov 34.49 43.3 3 0 1 1 96 1998 Dec 40.97 32.5 3 0 1 1 97 1999 Jan 62.5 29.1 3 0 1 1 98 1999 Feb 93.9 31.5 3 0 1 1 99 1999 Mar 45.36 41.9 3 0 1 1 100 1999 Apr 44.64 53.4 3 0 1 1 101 1999 May 28.23 63.7 4 0 1 1 102 1999 Jun 22.84 72.9 4 0 1 1 103 1999 Jul 27.96 76.8 4 0 1 1 104 1999 Aug 0 75 4 0 1 1 105 1999 Sep 153.32 68.4 3 1 1 1 106 1999 Oct 86.03 56.8 3 1 1 1 107 1999 Nov 80.96 43.3 3 1 1 1 108 1999 Dec 92.15 32.5 3 1 1 1 109 2000 Jan 126.86 29.1 3 1 1 1 110 2000 Feb 155.45 31.5 3 1 1 1 111 2000 Mar 121.19 41.9 3 1 1 1 112 2000 Apr 190.62 53.4 3 1 1 1 113 2000 May 115.7 63.7 4 1 1 1 114 2000 Jun 81.35 72.9 3 1 1 1 115 2000 Jul 91.23 76.8 3 1 1 1 116 2000 Aug 86.28 75 3 1 1 1 117 2000 Sep 83.97 68.4 2 1 1 1 118 2000 Oct 77.65 56.8 2 1 1 1 119 2000 Nov 72.3 43.3 2 1 1 1 120 2000 Dec 89.49 32.5 2 1 1 1Explanation / Answer
You need to know about how to use multiple linear regression to predict a dependent variable (Bill in this case) as a function of other explanatory variables (Temp, Size, etc.). You can do it using R or Python programming, or even in Microsoft Excel.
First of all, you can do some data exploration by plotting some charts to understand the data. For example, scatter plot of Bill versus Temp columns to see if there is any trend. You can also do box and whisker plot for bill amount versus other categorical variables such as Size, NewEMeter, NewHeat, NewCool. Then you can build a regression model. You can do it from scratch for which you need the knowledge of "ordinary least squares" and "gradient descent". But it would be easier to use R programming (e.g. lm() function in R) or Data Analysis capability in Excel.
To check for autocorrelation, you can do a time series plot for the bill amount (year-month on x-axis and Bill on y-axis) for a visual inspection. There is acf() function in R to compute correlations for different levels of time lag.
Multicollinearity can happen if some of explanatory variables are corrleated among themselves. For a basic check, you can do a pairwise correlation matrix for the explanatory variables.
If you are a beginner and would like to learn in depth about this, there is a great book called "introduction to statistical learning" which is available for free online. Hope this helps.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.