
Calculating correlation coefficient and p-value


Question

Calculating the correlation coefficient and p-value. To calculate the precise correlation between year and votes, we will use a scipy stats function.

    from scipy import stats  # movieData is assumed to be loaded earlier in the notebook

    rho, pvalue = stats.pearsonr(movieData['year'], movieData['votes'])
    print("The correlation coefficient is", rho)
    print("The p-value is:", pvalue)

Output:

    The correlation coefficient is 0.427076783601
    The p-value is: 1.12436337738e-312

P-values are probabilities, and thus they are always values between 0 and 1. The technical definition of a p-value is the probability that a pair of variables are THIS strongly correlated by chance. In other words, if you were to randomly generate this dataset, and you re-rolled that dataset N (a large number) of times, then you would expect to get a correlation coefficient AT LEAST AS STRONG AS THE ONE YOU ACTUALLY MEASURED N × p times, just because of chance.

A p-value close to 1 means you will almost always randomly arrive at such a relationship, so that is probably not a real, or "statistically significant", relationship. Conversely, a small p-value means there is a very tiny chance that this relationship happened by chance, so it's probably "real", or "significant".

N.B. Python may express the value of very large or very small numbers in scientific notation. If you see a number like 1.234e-5, it means 1.234 times 10 to the power of -5, which is 0.00001234.

ANSWER THE FOLLOWING QUESTIONS (double click this cell to edit):

What does the correlation coefficient you calculated above tell you about the two variables? Is this consistent with your estimate by eye, from part 1?

ANSWER HERE:

Judging by the p-value, is this relationship likely to have occurred randomly?
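The note about scientific notation can be checked directly in Python. The snippet below is only a small illustration; the variable names are made up for this example, and the two numbers are the ones quoted in the question.

```python
# Scientific notation is just another way of writing a float literal.
small = float("1.234e-5")

# Format with 8 digits after the decimal point to see the plain form.
as_fixed = f"{small:.8f}"
print(as_fixed)

# The p-value reported in the question, parsed from its printed form.
p = float("1.12436337738e-312")

# It is tiny, but still a valid probability strictly greater than zero.
print(p > 0.0, p < 0.05)
```

Formatting with a fixed number of decimals is one way to confirm what the exponent means; for a value as small as the reported p-value, the fixed form would need more than 300 leading zeros, which is why Python prints it with an exponent.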

Explanation / Answer

Q. What does this correlation coefficient you calculated above tell you about the two variables?

Answer: The correlation coefficient calculated in this case is 0.427076783601. The correlation coefficient measures the strength and direction of the linear relationship between two variables. The closer it is to 1, the stronger the positive linear relationship. The closer it is to -1, the stronger the negative linear relationship. When it is 0 or close to 0, there is little or no linear dependence between the two variables.

In this case it lies between 0 and 1 (about 0.43), so the two variables are positively correlated, but the correlation is only moderate, not very strong.
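To get a feel for what different values of the coefficient look like, here is a minimal sketch using synthetic data. Everything in it (the random data, the helper r_for, the noise scales) is an assumption made up for illustration and has nothing to do with the movie dataset; only stats.pearsonr is the function used in the question.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = rng.normal(size=2000)       # synthetic data, not the movie dataset

def r_for(noise_scale, sign=1.0):
    """Pearson r between x and (sign * x + Gaussian noise of the given scale)."""
    y = sign * x + noise_scale * rng.normal(size=x.size)
    r, _ = stats.pearsonr(x, y)
    return r

r_strong = r_for(0.1)               # little noise: r close to +1
r_moderate = r_for(2.0)             # a lot of noise: r roughly in the 0.4 to 0.5 range
r_negative = r_for(0.1, sign=-1.0)  # inverted relationship: r close to -1
r_none, _ = stats.pearsonr(x, rng.normal(size=x.size))  # unrelated data: r near 0

print("strong positive:", round(r_strong, 3))
print("moderate positive:", round(r_moderate, 3))
print("strong negative:", round(r_negative, 3))
print("no relationship:", round(r_none, 3))
```

The moderate case is the one that resembles the 0.43 found for year and votes: the upward trend is there, but individual points scatter widely around it.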

Q. Judging by the p-value, is this relationship likely to have occurred randomly?

Answer: The p-value in this case is 1.12436337738e-312. Since the p-value is extremely small (far below common thresholds such as 0.05), it is very unlikely that a correlation this strong arose by chance alone. The relationship is therefore statistically significant, even though the correlation itself is only moderate.
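The "re-rolling" description of the p-value in the question can be sketched as a permutation test: shuffle one variable to destroy any real relationship, recompute the correlation, and count how often the shuffled correlation is at least as strong as the observed one. This is a rough illustration under stated assumptions, not what stats.pearsonr does internally (it uses an analytic formula). The year and votes arrays below are hypothetical stand-in data, not the movieData from the question.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Hypothetical stand-in data with a moderate upward trend plus noise.
n = 200
year = rng.integers(1950, 2015, size=n).astype(float)
votes = 50.0 * (year - 1950) + rng.normal(scale=2000.0, size=n)

r_observed, p_analytic = stats.pearsonr(year, votes)

# "Re-roll" the dataset: shuffling votes breaks any link with year.
n_shuffles = 2000
count = 0
for _ in range(n_shuffles):
    shuffled = rng.permutation(votes)
    r_shuffled, _ = stats.pearsonr(year, shuffled)
    if abs(r_shuffled) >= abs(r_observed):
        count += 1

# Add-one correction so the estimate is never exactly zero.
p_permutation = (count + 1) / (n_shuffles + 1)

print("observed r:", round(r_observed, 3))
print("analytic p-value:", p_analytic)
print("permutation p-value:", p_permutation)
```

With only 2000 shuffles the permutation estimate cannot go below about 1/2001, so it can only show that the p-value is small, not reproduce a value like 1e-312. The point of the sketch is the counting logic: a small p-value means that shuffled, relationship-free data almost never produces a correlation as strong as the observed one.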
