Question
PERSPECTIVE: We need practice in using files and strings and we don't have much time. Thus this is a short assignment.
PROBLEM BACKGROUND: Assume you're working on a contract where a company is building a mailing list (or, rather, an e-mailing list) by analyzing e-mail messages. Your task is to write a Python program that reads a file (stored in the current working directory) called mail.dat and outputs to a file called addresses.dat, one per line, every e-mail address contained inside the file. (You can see now why this assignment is titled "The Spammer's Delight Problem".) For this assignment, if a whitespace delimited string of characters has an embedded commercial at sign (@) inside it (that is, interior to the string), we shall consider it an e-mail address. However, any trailing commas must be trimmed from addresses. Thus the string "abc@mtsu.edu," must appear in the output file as "abc@mtsu.edu" with the trailing comma removed. Only commas at the end of a string are considered trailing; do not remove non-trailing commas. Do not worry about any other punctuation characters; the only editing your program must do is to remove trailing commas.
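The address rule above can be sketched as a pair of small helpers. This is only one reading of the rule, not the assignment's reference solution: the names address_from_token and addresses_in_line are hypothetical, and "interior" is assumed to mean that the @ is neither the first nor the last character of the whitespace-delimited string, checked before any commas are trimmed.

```python
def address_from_token(token):
    """Return the cleaned address if token counts as one under the rule, else None."""
    # Assumption: "interior" means the @ is neither the first nor the last character.
    if '@' not in token[1:-1]:
        return None
    # rstrip(',') removes every comma at the end and leaves interior commas alone.
    return token.rstrip(',')


def addresses_in_line(line):
    """Collect every address in one line of text, in order of appearance."""
    found = []
    for token in line.split():  # split() with no argument splits on any whitespace
        address = address_from_token(token)
        if address is not None:
            found.append(address)
    return found
```

Under these assumptions, a line may yield several addresses, tabs separate strings just as spaces do, and a string such as "a,b@c.d," keeps its interior comma while losing the trailing one.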
IMPLEMENTATION NOTES: Use as your source file name "spammer.py". Don't forget to head your source file with the standard title line and global comments we use for all programming assignments. Be sure to have code to close() both files that you opened.
INPUT/OUTPUT: You should create your own data file(s) to initially test your program. For grading, link the file $PUB/mail.dat into your work area and use it. The command
ln -s $PUB/mail.dat mail.dat
will do that. (Be sure to link not copy the file into your work area.) For your output file, use the filename addresses.dat. Do not write anything to standard output (i.e., do not use any print() function calls in your final product). The first 10 lines of your addresses.dat file should look like:
Explanation / Answer
Python Executable Code :-
f = open('mail.dat', 'r')           # open the input file mail.dat in read mode
ff = open('addresses.dat', 'w')     # open the output file addresses.dat in write mode
for line in f:                      # iterate through every line in mail.dat
    for word in line.split():       # split() with no argument splits on any whitespace (spaces, tabs, newlines)
        # The @ must be interior to the string, i.e. not its first or last character.
        # A line may contain several addresses, so every word is checked.
        if '@' in word[1:-1]:
            email = word.rstrip(',')    # remove trailing commas only; non-trailing commas are kept
            ff.write(email + '\n')      # write one address per line to addresses.dat
ff.close()                          # close addresses.dat
f.close()                           # close mail.dat
Input File :-
Anonymous1 Anonymous2 Anonymous3 virat.kohli@xyz.com Anoymous4 Anonymous5
Anonymous10 Anonymous11 Anonymous12 mandeep_singh@abc.com, Anonymous13 Anonymous14
Anonymous20 Anonymous21 brendonmecculam@xyz.com Anonymous22
Anonymous30 Anonymous31 Anonymous32 Anonymous33 Anonymous34 vohramanan@abc.com
Anonymous40 Anonymous41 Anonymous42 Anonymous43 karunnair304@xyz.com, Anonymous44
Output File :-
virat.kohli@xyz.com
mandeep_singh@abc.com
brendonmecculam@xyz.com
vohramanan@abc.com
karunnair304@xyz.com
Let me know in comments if you are facing any problems while running this program for your actual dataset.