Question 6 (3 points) A professor is interested in finding out whether the avera

Question

Question 6 (3 points)

A professor is interested in finding out whether the average score in the second exam is the same as the average score in the first exam. Suppose two samples are collected for the two exams and saved in the ExamScores.csv file. The variables are called Exam1 and Exam2 respectively. Which of the following Python lines can be used to perform a hypothesis test to investigate whether there is sufficient evidence to conclude that the average score in the second exam is not equal to the average score in the first exam?

Question 6 options:

import scipy.stats as st
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
exam1_scores = scores[['Exam1']]
exam2_scores = scores[['Exam2']]
null_value = 0
alternative = 'not-equal'
print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False, null_value, alternative))

import scipy.stats as st
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
exam1_scores = scores[['Exam1']]
exam2_scores = scores[['Exam2']]
print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False))

import scipy.stats as st
scores = pd.read_csv('ExamScores.csv')
exam1_scores = scores[['Exam1']]
exam2_scores = scores[['Exam2']]
print(st.ttest_ind(exam1_scores, exam2_scores, equal_var=False))

import scipy.stats as st
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
print(st.ttest_ind(scores))

Question 7 (3 points)

Commute times from town A to town B are obtained for two different highways. The sample size obtained for the first highway is 40, and it is found that the average commute time is 5.35 hours with a standard deviation of 3.1 hours. The sample size obtained for the second highway is 50, and it is found that the average commute time is 4.95 hours with a standard deviation of 5.8 hours. Which of the following Python lines along with this summary data can be used to perform a hypothesis test to conclude whether or not the population means are different?

Question 7 options:

from scipy.stats import ttest_ind_from_stats as ttest_ind
n1 = 40
mean1 = 5.35
stdev1 = 3.1
n2 = 50
mean2 = 4.95
stdev2 = 5.8
print(ttest_ind(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False))

from scipy.stats import ttest_ind_from_stats as ttest
n1 = 40
mean1 = 5.35
stdev1 = 3.1
n2 = 50
mean2 = 4.95
stdev2 = 5.8
print(ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False))

from scipy.stats import ttest_ind_from_stats as ttest
n1 = 40
mean1 = 4.95
stdev1 = 5.8
n2 = 50
mean2 = 5.35
stdev2 = 3.1
print(ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False))

from scipy.stats import ttest_ind_from_stats as ttest
n1 = 40
n2 = 50
print(ttest(n1, n2, equal_var=False))

Question 8 (3 points)

A professor is interested in finding out whether a higher proportion of students score above 90 in the second exam than in the first exam. Suppose two samples are collected for the two exams and saved in the ExamScores.csv file. The variables are called Exam1 and Exam2 respectively. Which of the following Python lines can be used to perform a hypothesis test to investigate whether there is sufficient evidence to conclude that the proportion of students scoring more than 90 is higher in exam 2 than in exam 1?

Question 8 options:

from statsmodels.stats.proportion import proportions_ztest
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
proportions_ztest(scores)
#Divide the output probability value by 2 to get 1 tailed probability value

from statsmodels.stats.proportion import prop_1samp_ztest
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
x1 = scores[['Exam1']].count()[0]
x2 = scores[['Exam2']].count()[0]
n1 = (scores[['Exam1']] > 90).values.sum()
n2 = (scores[['Exam2']] > 90).values.sum()
counts = [x1, x2]
n = [n1, n2]
null_value = 0.50
alternative = 'larger'
prop_1samp_ztest(counts, n, null_value, alternative)

from statsmodels.stats.proportion import proportions_ztest
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
n1 = scores[['Exam1']].count()[0]
n2 = scores[['Exam2']].count()[0]
x1 = (scores[['Exam1']] > 90).values.sum()
x2 = (scores[['Exam2']] > 90).values.sum()
counts = [x1, x2]
n = [n1, n2]
proportions_ztest(counts, n)
#Divide the output probability value by 2 to get 1 tailed probability value

from statsmodels.stats.proportion import proportions_ztest
import pandas as pd
scores = pd.read_csv('ExamScores.csv')
x1 = scores[['Exam1']].count()[0]
x2 = scores[['Exam2']].count()[0]
n1 = (scores[['Exam1']] > 90).values.sum()
n2 = (scores[['Exam2']] > 90).values.sum()
counts = [x1, x2]
n = [n1, n2]
proportions_ztest(counts, n)
#Divide the output probability value by 2 to get 1 tailed probability value

Question 9 (3 points)

Which of the following Python functions is used to perform a hypothesis test for the difference in two population means using data from a sample (i.e., using actual sample data and not using summary data)?

Question 9 options:

ttest_ind(data1, data2, equal_var=False)

means_1samp_ttest(mean, std_dev, n, null_value, alternative)

ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False)

ttest_rel(data1, data2)

Question 10 (3 points)

Which of the following Python functions is used to perform a paired t-test?

Question 10 options:

ttest_ind(data1, data2, equal_var=False)

ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False)

ttest_rel(data1, data2)

means_1samp_ttest(mean, std_dev, n, null_value, alternative)

Explanation / Answer

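Question 6:
Ans:
b)

The second option is the only one that runs and performs the test. ttest_ind is two-sided by default, which matches the "not equal" alternative, and equal_var=False selects Welch's t-test. Option a) is a syntax error because positional arguments follow the keyword argument equal_var=False. Option c) never imports pandas, so pd is undefined. Option d) passes a single DataFrame where ttest_ind needs two samples.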
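
For reference, a minimal sketch of the two-sided Welch test that Question 6 describes. ExamScores.csv is not available here, so the scores below are made-up values held in an in-memory DataFrame. The manual statistic is included only as a cross-check and is not part of the quiz options.

```python
import math

import pandas as pd
import scipy.stats as st

# Hypothetical scores standing in for ExamScores.csv (not the real data).
scores = pd.DataFrame({
    "Exam1": [72, 85, 90, 66, 78, 88, 95, 70],
    "Exam2": [75, 80, 94, 71, 82, 91, 89, 77],
})

# Single brackets select a 1-D Series, so the result holds plain scalars.
# The quiz options use scores[['Exam1']], a one-column DataFrame, which
# typically yields length-1 arrays instead.
exam1_scores = scores["Exam1"]
exam2_scores = scores["Exam2"]

# equal_var=False requests Welch's t-test; the test is two-sided by default.
result = st.ttest_ind(exam1_scores, exam2_scores, equal_var=False)
print(result.statistic, result.pvalue)

# Cross-check: Welch's statistic is the mean difference over its standard error.
se = math.sqrt(exam1_scores.var(ddof=1) / len(exam1_scores)
               + exam2_scores.var(ddof=1) / len(exam2_scores))
manual_t = (exam1_scores.mean() - exam2_scores.mean()) / se
```

A small p-value would be evidence that the two exam means differ; the sign of the statistic only reflects which sample was passed first.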

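Question 7:
Ans:
b)

from scipy.stats import ttest_ind_from_stats as ttest
n1 = 40
mean1 = 5.35
stdev1 = 3.1
n2 = 50
mean2 = 4.95
stdev2 = 5.8
print(ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False))

ttest_ind_from_stats needs the mean, standard deviation, and sample size of each sample, so option d), which passes only the two sample sizes, raises a TypeError. Option a) makes the same call under a different alias and produces the same result. Option c) attaches each highway's mean and standard deviation to the other highway's sample size.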
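
As a check on the summary-data test in Question 7, the sketch below calls ttest_ind_from_stats with the numbers quoted in the question and recomputes the Welch statistic by hand. The hand calculation and the variable manual_t are additions for illustration only.

```python
import math

from scipy.stats import ttest_ind_from_stats

# Summary statistics quoted in Question 7.
n1, mean1, stdev1 = 40, 5.35, 3.1   # first highway
n2, mean2, stdev2 = 50, 4.95, 5.8   # second highway

# Argument order is (mean, std, nobs) for sample 1, then for sample 2.
result = ttest_ind_from_stats(mean1, stdev1, n1, mean2, stdev2, n2,
                              equal_var=False)
print(result.statistic, result.pvalue)

# Cross-check of the Welch statistic: mean difference over its standard error.
se = math.sqrt(stdev1**2 / n1 + stdev2**2 / n2)
manual_t = (mean1 - mean2) / se
```

With these numbers the statistic is about 0.42, which is well short of significance at the usual levels, so the data give no evidence that the mean commute times differ.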

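Question 8:
Ans:
c)

The third option is the only one that passes the number of scores above 90 as the counts and the number of recorded scores as the sample sizes. Option d) swaps the two. Option b) also swaps them and imports a function whose name indicates a one-sample test. Option a) passes the raw DataFrame. Halving the two-sided p-value gives the one-tailed value only when the sample proportion for exam 2 is in fact the larger one.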
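
To make the one-tailed step in Question 8 concrete, here is a minimal standard-library sketch of a pooled two-proportion z-test. It is assumed to mirror the default pooled calculation in proportions_ztest, which is worth checking against the statsmodels documentation. The helper two_proportion_ztest and the counts are hypothetical and do not come from ExamScores.csv.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 == p2.

    Returns the z statistic for p1 - p2 and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, two_sided


# Hypothetical counts: students scoring above 90 out of all students per exam.
x1, n1 = 12, 50   # Exam 1
x2, n2 = 21, 50   # Exam 2

z, p_two_sided = two_proportion_ztest(x1, n1, x2, n2)

# One-tailed p-value for Ha: p2 > p1, i.e. p1 - p2 < 0 (lower tail of z).
p_one_sided = NormalDist().cdf(z)
print(z, p_two_sided, p_one_sided)
```

Here z is negative, so the lower-tail p-value equals half the two-sided one. Had exam 1 shown the larger proportion, halving would understate the one-tailed p-value. proportions_ztest also accepts an alternative argument ('two-sided', 'smaller', 'larger') that avoids the manual halving.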

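Question 9:
Ans:
a)

ttest_ind(data1, data2, equal_var=False)

ttest_ind works on the raw observations of two independent samples. Option c), ttest(mean1, stdev1, n1, mean2, stdev2, n2, equal_var=False), is the summary-data version (ttest_ind_from_stats under an alias), which the question rules out.

Question 10:
Ans:
c)

ttest_rel(data1, data2)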
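
The functions named in Questions 9 and 10 can be compared side by side on raw data. The sketch below uses made-up scores for the same eight students on two exams; the data and variable names are illustrative assumptions.

```python
import math

import scipy.stats as st

# Hypothetical scores for the same eight students on two exams.
data1 = [72, 85, 90, 66, 78, 88, 95, 70]
data2 = [75, 80, 94, 71, 82, 91, 89, 77]

# Independent samples (Welch): treats the two lists as unrelated groups.
independent = st.ttest_ind(data1, data2, equal_var=False)

# Paired samples: tests whether the mean within-student difference is zero.
paired = st.ttest_rel(data1, data2)

# A paired t-test is a one-sample t-test on the differences.
differences = [a - b for a, b in zip(data1, data2)]
one_sample = st.ttest_1samp(differences, 0)

print(independent.statistic, independent.pvalue)
print(paired.statistic, paired.pvalue)
```

ttest_rel requires both samples to have the same length and matching order, since each pair is compared element by element; ttest_ind has no such requirement.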
