You are given three data files (data1.txt, data2.txt, and data3.txt) consisting
Question
You are given three data files (data1.txt, data2.txt, and data3.txt), each consisting of integers (one integer per line). Write Python code that reads each of these three files, computes the sample mean and standard deviation for each data set, and writes them to a file, output.txt. For this problem, you should import simple_ds.py and use its mean function rather than including the code directly in your script. Next, combine the data from the three files, print the combined data on the screen, and draw a histogram of it. The screen output must be in a reader-friendly format. Use matplotlib (as demonstrated in hist4.py), not MATLAB, to plot the histogram of the combined data. Save the plot in .png format. Submission for this problem: Python code + screen capture of cmd + histogram (.png)
Data 1
1
2
3
4
5
8
9
9
10
0
12
1
1
1
2
0
2
3
3
Data 2
4
5
1
2
7
8
8
9
0
1
2
3
4
5
Data 3
1
2
3
1
1
6
10
23
11
11
24
55
hist4.py
# -------------------------------------
# hist4.py - reads a file consisting of
# integers (1 integer on each line) and
# counts the number of occurrences of
# each value (a histogram) and generates a
# histogram plot using matplotlib.
#
# 2015-09-02 - jeff smith
#
# $Id: $
# -------------------------------------
import matplotlib.pyplot as plt
# read the values
vals = [int(i.rstrip()) for i in open('data.txt','r') if i.rstrip()]
# dictionary to hold the counts
hist = {}
# loop through each value from the minimum to the maximum
for i in range(min(vals),max(vals)+1):
    # key = integer, value = count
    hist[i] = vals.count(i)
# get and sort the keys (sorted() works on both Python 2 and Python 3)
skeys = sorted(hist.keys())
# display
for key in skeys:
    print("{:3d} : {}".format(key, hist[key]))
# Graphical version
plt.figure(1, figsize=(5,3))
plt.yticks(fontsize=8)
plt.xticks(fontsize=8)
# raw counts are plotted by default; the old normed=False keyword
# has been removed from current matplotlib releases
plt.hist(vals, bins=15)
plt.title('Observed Counts',fontsize=10)
plt.show()
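The counting loop in hist4.py calls vals.count(i) once per value, which rescans the whole list each time. A minimal sketch of the same tally using collections.Counter from the standard library is shown here. The helper name count_values is made up for illustration, and Data 2 from the question is typed in directly so the sketch runs without any files.

```python
from collections import Counter


def count_values(vals):
    """Return {value: count} for every integer from min(vals) to max(vals).

    Hypothetical helper, not part of hist4.py. Values that never occur
    in vals get a count of 0, matching the behaviour of the original loop.
    """
    counts = Counter(vals)  # one pass over the data
    return {i: counts[i] for i in range(min(vals), max(vals) + 1)}


# Data 2 from the question, entered by hand for this sketch
data2 = [4, 5, 1, 2, 7, 8, 8, 9, 0, 1, 2, 3, 4, 5]

hist = count_values(data2)
for key in sorted(hist):
    print("{:3d} : {}".format(key, hist[key]))
```

For lists as short as these the difference is negligible, so this is a readability choice more than a performance one.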
Explanation / Answer
#code to be copied (written for Python 3)
import math
import matplotlib.pyplot as plt
import simple_ds # course module; assuming simple_ds.mean calculates mean of list of values

#naming all the files
filenames = ["data1.txt", "data2.txt", "data3.txt"]
outname = "output.txt"

#reads one integer per line, skipping blank lines
def read_ints(filename):
    with open(filename, "r") as f:
        return [int(line) for line in f if line.strip()]

#sample standard deviation (divides by n - 1)
#computed here directly because only mean is assumed to exist in simple_ds
def sample_std(vals, m):
    return math.sqrt(sum((v - m) ** 2 for v in vals) / float(len(vals) - 1))

#to store all the integers from the three files
combined = []

#writing mean and standard deviation of each data set to output.txt
with open(outname, "w") as out:
    for name in filenames:
        vals = read_ints(name)
        m = simple_ds.mean(vals)
        s = sample_std(vals, m)
        out.write("{}: mean = {:.3f}, std dev = {:.3f}\n".format(name, m, s))
        combined.extend(vals)

#printing the combined data in a reader friendly format
print("Combined data ({} values):".format(len(combined)))
print(", ".join(str(v) for v in combined))

#hist to store counts of each value in the combined data
hist = {}
for i in range(min(combined), max(combined) + 1):
    hist[i] = combined.count(i)
print("Value : Count")
for key in sorted(hist.keys()):
    print("{:3d} : {}".format(key, hist[key]))

#Graphical Output
#plt.hist takes the raw values, not the counts
plt.figure(1, figsize=(5, 3))
plt.yticks(fontsize=8)
plt.xticks(fontsize=8)
plt.hist(combined, bins=15)
plt.title('Histogram', fontsize=10)
#save before show(), otherwise the saved figure may be blank
plt.savefig("histogram.png")
plt.show()
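One way to sanity-check the figures written to output.txt is to compare them against Python's standard statistics module (available from Python 3.4). The sketch below does this for Data 1 from the question, with the values typed in by hand so that it does not depend on simple_ds.py or on the data files. The function name manual_sample_std is made up for illustration.

```python
import math
import statistics

# Data 1 from the question, entered by hand for this check
data1 = [1, 2, 3, 4, 5, 8, 9, 9, 10, 0, 12, 1, 1, 1, 2, 0, 2, 3, 3]


def manual_sample_std(vals):
    """Sample standard deviation written out by hand (divides by n - 1)."""
    m = sum(vals) / float(len(vals))
    return math.sqrt(sum((v - m) ** 2 for v in vals) / float(len(vals) - 1))


mean1 = statistics.mean(data1)
std1 = statistics.stdev(data1)  # stdev is the sample (n - 1) version

print("mean = {:.3f}".format(mean1))
print("std dev (statistics) = {:.3f}".format(std1))
print("std dev (by hand)    = {:.3f}".format(manual_sample_std(data1)))
```

If the two standard deviations disagree, the most likely cause is dividing by n instead of n - 1; statistics.pstdev is the population version that divides by n.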