Write a program that shall eliminate duplicate words from a text file
Question
Write a program that shall eliminate duplicate words from a text file. The program shall:
1. Ask the user to enter the name of an existing text file; if the file does not exist or cannot be opened for any other reason, the program shall print an error message and terminate.
2. Read the content of the file and eliminate all duplicate words, except for the first occurrence. For the purpose of this exercise, a word is any sequence of characters that does not contain any spaces, and two words are duplicates if they differ at most in character case (e.g., "Mary" and "mArY" are duplicates).
3. Write the text without duplicates into the file nodups-XXX, where XXX is the name of the [...] the program shall print an error message and terminate.
4. Test the program by removing duplicates from Othello: http://www.gutenberg.org/cache/epub/2267/pg2267-images.html
Explanation / Answer
Program to eliminate duplicate words from a text file:
1. Checking whether the file named by the user exists and can be opened.
    import sys

    filename = input("Enter file name: ")
    try:
        # Actually opening the file catches every failure, not only a missing file
        f = open(filename, 'r')
    except OSError:
        print("File does not exist or there is some problem opening the file.")
        sys.exit(1)
    print("file exists")
    f.close()
2. Removing duplicates:
    import sys

    filename = input("Enter file name to remove duplicates: ")
    li = []        # every word of the file, in order, as written
    li1 = []       # first occurrence of each word
    seen = set()   # lowercase forms of the words already kept
    try:
        f = open(filename, 'r')
    except OSError:
        print("File does not exist or there is some problem opening the file.")
        sys.exit(1)
    with f:
        for line in f:
            for word in line.split():
                li.append(word)        # adding all words to a list
    for word in li:
        key = word.lower()             # compare words case-insensitively
        if key not in seen:
            seen.add(key)
            li1.append(word)           # keep the first occurrence as written
    print(li1)
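The same idea can be wrapped in a function that also preserves the line breaks of the input. This is a minimal sketch, assuming the first occurrence of each word should be kept exactly as written; the name `remove_duplicate_words` is hypothetical and not part of the original answer.

```python
def remove_duplicate_words(text):
    """Return text with every repeated word removed, ignoring case.

    The first occurrence of each word is kept as written and line
    breaks are preserved. (Hypothetical helper for illustration.)
    """
    seen = set()          # lowercase forms of the words kept so far
    out_lines = []
    for line in text.splitlines():
        kept = []
        for word in line.split():
            key = word.lower()
            if key not in seen:
                seen.add(key)
                kept.append(word)
        out_lines.append(" ".join(kept))
    return "\n".join(out_lines)


sample = "Mary had a little lamb\nmArY HAD a big lamb"
print(remove_duplicate_words(sample))
```

On the second line of the sample, only "big" is new, so every other word there is dropped.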
3. Writing the list of non-duplicate words to a text file:
    import os
    import sys

    filename = input("Enter file name to remove duplicates: ")
    li1 = []       # first occurrence of each word, as written
    seen = set()   # lowercase forms of the words already kept
    try:
        with open(filename, 'r') as f:
            for line in f:
                for word in line.split():
                    key = word.lower()     # compare words case-insensitively
                    if key not in seen:
                        seen.add(key)
                        li1.append(word)
    except OSError:
        print("File does not exist or there is some problem opening the file.")
        sys.exit(1)

    # The exercise asks for nodups-XXX, where XXX comes from the input file name
    nondups_filename = 'nodups-' + os.path.basename(filename)
    try:
        with open(nondups_filename, 'w') as out:
            for item in li1:
                out.write("%s " % item)
    except OSError:
        print("File not created or there is some problem opening the file.")
        sys.exit(1)
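The exercise asks for an output file named nodups-XXX. One way to derive that name from the user's input is sketched below. The helper name `nodups_path` is hypothetical, and placing the output in the same directory as the input is an assumption, since the exercise only specifies the prefix.

```python
import os


def nodups_path(path):
    """Build the output path 'nodups-<name>' next to the input file.

    Hypothetical helper: writing the output into the input file's
    directory is an assumption, not something the exercise states.
    """
    directory, name = os.path.split(path)
    return os.path.join(directory, "nodups-" + name)


print(nodups_path("othello.txt"))
```

Splitting the path first avoids producing a name such as `nodups-books/othello.txt` when the user enters a path that contains a directory.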
4. I copied the text of Othello by William Shakespeare into a text file, saved it in the same folder as the script, and then tested the program on it. The run took a long time because the file is large.
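Long run times on a large input commonly come from two sources: printing inside the loop, and testing membership against a list, which scans the whole list for every word. A set performs the same check with a hash lookup. The sketch below compares the two approaches on synthetic data. The function names and the word list are made up for illustration, and timings depend on the machine, so none are shown.

```python
import timeit


def dedupe_with_list(words):
    """Keep first occurrences; membership is tested against a list."""
    kept = []
    for w in words:
        if w not in kept:       # linear scan of everything kept so far
            kept.append(w)
    return kept


def dedupe_with_set(words):
    """Keep first occurrences; membership is tested against a set."""
    seen, kept = set(), []
    for w in words:
        if w not in seen:       # hash lookup
            seen.add(w)
            kept.append(w)
    return kept


# Synthetic input: 10,000 words drawn from 2,000 distinct ones
words = ["w%d" % (i % 2000) for i in range(10000)]

print("list:", timeit.timeit(lambda: dedupe_with_list(words), number=1))
print("set: ", timeit.timeit(lambda: dedupe_with_set(words), number=1))
```

Both functions return the same list; only the cost of the membership test differs, and the gap widens as the number of distinct words grows.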