Question
2) Movies.xlsx has gross revenues, length, viewer rating on a 10-point scale, budget, year and Motion Picture Association of America (MPAA) rating for 35 movies. We want to predict gross revenues. The MPAA rating is text with three categories (R, PG-13 and PG; there are no G or NC-17 movies in this sample), so two indicator variables have been created to represent these data.
c) Run a series of four simple linear regressions. In each of them use gross revenues as the dependent variable. The independent variable should be, in turn, length, viewer rating, budget and year. Evaluate these four regression models and determine which single independent variable is the best predictor of gross revenues. Explain your reasoning. Are there any of these variables that show no real relationship to gross revenues?
Title                               Rating Gross ($M)  Length (minutes)  Viewer Rating  Budget ($M)  Year  Rate-R  Rate-PG13
Aliens                              R      81.843      137               8.2            18.5         1986  1       0
Armageddon                          PG-13  194.125     144               6.7            140          1998  0       1
As Good As It Gets                  PG-13  147.54      138               8.1            50           1997  0       1
Braveheart                          R      75.6        177               8.3            72           1995  1       0
Chasing Amy                         R      12.006      105               7.9            0.25         1997  1       0
Contact                             PG     100.853     153               8.3            90           1997  0       0
Dante's Peak                        PG-13  67.155      112               6.7            104          1997  0       1
Deep Impact                         PG-13  140.424     120               6.4            75           1998  0       1
Executive Decision                  R      68.75       129               7.3            55           1996  1       0
Forrest Gump                        PG-13  329.691     142               7.7            55           1994  0       1
Ghost                               PG-13  217.631     128               7.1            22           1990  0       1
Good Will Hunting                   R      138.339     126               8.5            10           1997  1       0
Grease                              PG     181.28      110               7.3            6            1978  0       0
Halloween                           R      47          93                7.7            0.325        1978  1       0
Hard Rain                           R      19.819      95                5.2            70           1998  1       0
I Know What You Did Last Summer     R      72.219      100               6.5            17           1997  1       0
Independence Day                    PG-13  306.124     142               6.6            75           1996  0       1
Indiana Jones and the Last Crusade  PG-13  197.171     127               7.8            39           1989  0       1
Jaws                                PG     260         124               7.8            12           1975  0       0
Men in Black                        PG-13  250.147     98                7.4            90           1997  0       1
Multiplicity                        PG-13  20.1        117               6.8            45           1996  0       1
Pulp Fiction                        R      107.93      154               8.3            8            1994  1       0
Raiders of the Lost Ark             PG     242.374     115               8.3            20           1981  0       0
Saving Private Ryan                 R      178.091     170               9.1            70           1998  1       0
Schindler's List                    R      96.067      197               8.6            25           1993  1       0
Scream                              R      103.001     111               7.7            15           1996  1       0
Speed 2: Cruise Control             PG-13  48.068      121               4.3            110          1997  0       1
Terminator                          R      36.9        108               7.7            6.4          1984  1       0
The American President              PG-13  65          114               7.6            62           1995  0       1
The Fifth Element                   PG-13  63.54       126               7.8            90           1997  0       1
The Game                            R      48.265      128               7.6            50           1997  1       0
The Man in the Iron Mask            PG-13  56.876      132               6.5            35           1998  0       1
Titanic                             PG-13  600.743     195               8.4            200          1997  0       1
True Lies                           R      146.261     144               7.2            100          1994  1       0
Volcano                             PG-13  47.474      102               5.8            90           1997  0       1
Source: http://people.sc.fsu.edu/~jburkardt/datasets/triola/triola.html
Dollar values are uninflated.
Explanation / Answer
Regression Analysis: Gross ($M) versus Length (minutes)
The regression equation is
Gross ($M) = - 121 + 1.99 Length (minutes)
Predictor Coef SE Coef T P
Constant -121.04 93.09 -1.30 0.202
Length (minutes) 1.9861 0.7050 2.82 0.008
S = 106.524 R-Sq = 19.4% R-Sq(adj) = 16.9%
Analysis of Variance
Source DF SS MS F P
Regression 1 90056 90056 7.94 0.008
Residual Error 33 374460 11347
Total 34 464517
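The summary line (S, R-Sq, R-Sq(adj)) and the F statistic all follow from the Analysis of Variance table, so they can be checked by hand. A minimal sketch for the Length model, using only the sums of squares and degrees of freedom printed above (the variable names are just illustrative):

```python
import math

# Values copied from the Analysis of Variance table for Length (minutes).
ss_regression = 90056.0
ss_residual = 374460.0
ss_total = 464517.0
df_regression, df_residual, df_total = 1, 33, 34

# R-Sq: share of the total variation in gross explained by the model.
r_sq = ss_regression / ss_total
# R-Sq(adj): same idea, but penalised for the degrees of freedom used.
r_sq_adj = 1 - (ss_residual / df_residual) / (ss_total / df_total)
# S: standard error of the regression, the square root of MS Residual Error.
ms_residual = ss_residual / df_residual
s = math.sqrt(ms_residual)
# F: MS Regression divided by MS Residual Error.
f_stat = (ss_regression / df_regression) / ms_residual

print(f"R-Sq      = {r_sq:.1%}")
print(f"R-Sq(adj) = {r_sq_adj:.1%}")
print(f"S         = {s:.3f}")
print(f"F         = {f_stat:.2f}")
```

With a single predictor, F is the square of the slope's t statistic (2.82 squared is about 7.95, matching F = 7.94 up to rounding), which is why the slope and the regression share the same p-value of 0.008.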
Regression Analysis: Gross ($M) versus Viewer Rating
The regression equation is
Gross ($M) = - 103 + 32.3 Viewer Rating
Predictor Coef SE Coef T P
Constant -103.2 146.5 -0.70 0.486
Viewer Rating 32.33 19.60 1.65 0.109
S = 114.038 R-Sq = 7.6% R-Sq(adj) = 4.8%
Analysis of Variance
Source DF SS MS F P
Regression 1 35361 35361 2.72 0.109
Residual Error 33 429156 13005
Total 34 464517
Regression Analysis: Gross ($M) versus Budget ($M)
The regression equation is
Gross ($M) = 74.7 + 1.12 Budget ($M)
Predictor Coef SE Coef T P
Constant 74.69 29.12 2.57 0.015
Budget ($M) 1.1176 0.4135 2.70 0.011
S = 107.353 R-Sq = 18.1% R-Sq(adj) = 15.6%
Analysis of Variance
Source DF SS MS F P
Regression 1 84206 84206 7.31 0.011
Residual Error 33 380311 11525
Total 34 464517
Regression Analysis: Gross ($M) versus Year
The regression equation is
Gross ($M) = 4157 - 2.02 Year
Predictor Coef SE Coef T P
Constant 4157 6204 0.67 0.507
Year -2.017 3.113 -0.65 0.521
S = 117.895 R-Sq = 1.3% R-Sq(adj) = 0.0%
Analysis of Variance
Source DF SS MS F P
Regression 1 5839 5839 0.42 0.521
Residual Error 33 458677 13899
Total 34 464517
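The four Minitab outputs above can be cross-checked by refitting each model from the data table in the question. The following is a rough sketch in plain Python (standard library only), not the method used to produce the output above. The names `MOVIES`, `PREDICTORS` and `simple_ols` are made up for this illustration, and the data are typed in from the table rather than read from Movies.xlsx. It reports the intercept, slope, R-Sq and the slope's t statistic; p-values are left out because they need a t distribution, which the standard library does not provide.

```python
import math

# Columns: gross ($M), length (minutes), viewer rating, budget ($M), year.
# Rows are in the same order as the data table in the question.
MOVIES = [
    (81.843, 137, 8.2, 18.5, 1986), (194.125, 144, 6.7, 140, 1998),
    (147.54, 138, 8.1, 50, 1997), (75.6, 177, 8.3, 72, 1995),
    (12.006, 105, 7.9, 0.25, 1997), (100.853, 153, 8.3, 90, 1997),
    (67.155, 112, 6.7, 104, 1997), (140.424, 120, 6.4, 75, 1998),
    (68.75, 129, 7.3, 55, 1996), (329.691, 142, 7.7, 55, 1994),
    (217.631, 128, 7.1, 22, 1990), (138.339, 126, 8.5, 10, 1997),
    (181.28, 110, 7.3, 6, 1978), (47, 93, 7.7, 0.325, 1978),
    (19.819, 95, 5.2, 70, 1998), (72.219, 100, 6.5, 17, 1997),
    (306.124, 142, 6.6, 75, 1996), (197.171, 127, 7.8, 39, 1989),
    (260, 124, 7.8, 12, 1975), (250.147, 98, 7.4, 90, 1997),
    (20.1, 117, 6.8, 45, 1996), (107.93, 154, 8.3, 8, 1994),
    (242.374, 115, 8.3, 20, 1981), (178.091, 170, 9.1, 70, 1998),
    (96.067, 197, 8.6, 25, 1993), (103.001, 111, 7.7, 15, 1996),
    (48.068, 121, 4.3, 110, 1997), (36.9, 108, 7.7, 6.4, 1984),
    (65, 114, 7.6, 62, 1995), (63.54, 126, 7.8, 90, 1997),
    (48.265, 128, 7.6, 50, 1997), (56.876, 132, 6.5, 35, 1998),
    (600.743, 195, 8.4, 200, 1997), (146.261, 144, 7.2, 100, 1994),
    (47.474, 102, 5.8, 90, 1997),
]
PREDICTORS = ["Length (minutes)", "Viewer Rating", "Budget ($M)", "Year"]


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return b0, b1, R-Sq and t for b1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope uses n - 2 residual degrees of freedom.
    se_b1 = math.sqrt(ss_resid / (n - 2) / sxx)
    t_b1 = b1 / se_b1 if se_b1 > 0 else math.inf
    return {"b0": b0, "b1": b1, "r_sq": 1 - ss_resid / ss_total, "t": t_b1}


gross = [row[0] for row in MOVIES]
results = {
    name: simple_ols([row[i + 1] for row in MOVIES], gross)
    for i, name in enumerate(PREDICTORS)
}
for name, fit in results.items():
    print(f"{name:18s} b0={fit['b0']:9.2f} b1={fit['b1']:8.4f} "
          f"R-Sq={fit['r_sq']:6.1%} t={fit['t']:6.2f}")
```

The printed values should be compared with the Coef, T and R-Sq entries in the Minitab output. Any sizeable mismatch would suggest a data-entry difference rather than a different method, since both are ordinary least squares.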
Comparing the four simple linear regressions, Length (minutes) is the best single predictor of gross revenues: it has the highest R-Sq (19.4%) and R-Sq(adj) (16.9%), the smallest standard error (S = 106.524) and the smallest p-value (0.008). Budget ($M) is a close second (R-Sq = 18.1%, p = 0.011) and is also significant at the usual 5% level. Even the best model explains less than 20% of the variability in gross revenues, so no single variable predicts gross well on its own.
Year shows no real relationship with gross revenues (R-Sq = 1.3%, R-Sq(adj) = 0.0%, p = 0.521). Viewer Rating is also not statistically significant at the 5% level (R-Sq = 7.6%, p = 0.109), so its positive slope could easily be due to chance.
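The comparison can be written out as a short ranking. A minimal sketch, using only the summary statistics reported in the four outputs above; the 5% significance level (`ALPHA`) is an assumption, since the question does not state one, and the dictionary layout is just for illustration:

```python
# Summary statistics copied from the four regression outputs above.
REPORTED = {
    "Length (minutes)": {"r_sq": 19.4, "r_sq_adj": 16.9, "s": 106.524, "p": 0.008},
    "Viewer Rating": {"r_sq": 7.6, "r_sq_adj": 4.8, "s": 114.038, "p": 0.109},
    "Budget ($M)": {"r_sq": 18.1, "r_sq_adj": 15.6, "s": 107.353, "p": 0.011},
    "Year": {"r_sq": 1.3, "r_sq_adj": 0.0, "s": 117.895, "p": 0.521},
}
ALPHA = 0.05  # assumed significance level; not given in the question

# Rank predictors from highest to lowest R-Sq.
ranked = sorted(REPORTED, key=lambda name: REPORTED[name]["r_sq"], reverse=True)
best = ranked[0]
# Predictors whose slope p-value exceeds ALPHA show no significant relationship.
not_significant = [name for name in ranked if REPORTED[name]["p"] > ALPHA]

print("Ranking by R-Sq:", ranked)
print("Best single predictor:", best)
print("Not significant at 5%:", not_significant)
```

Because every model here has exactly one predictor and the same 35 observations, ranking by R-Sq, by R-Sq(adj), by S or by p-value gives the same order.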