Problem 1 (Text Analysis) Create a Python file called analysis.py that will perf


Question

Problem 1 (Text Analysis)

Create a Python file called analysis.py that will perform text analysis on files. For this question, assume that each space in the document separates one word from the next, so any use of the term 'word' means a string that occurs between two spaces (or, in two special cases, between the start of the file and a space, or between a space and the end of the file). You can also assume there is no punctuation or other symbols present in the files, only words separated by spaces. If you want to see examples of the type of text, look in the testfile .txt files included on cuLearn.

You must implement and test the following functions inside of your analysis.py file:

1) load(str) - Takes a single string argument, representing a filename. The program must open the file and parse the text inside. This function should initialize the variables (e.g., lists, dictionaries, other variables) you need to solve the remainder of the problem. This way, the file contents can be parsed once and the functions below can be executed many times without re-reading the file, which is a slow process. This function should also remove any information stored from a previous file when it is called (i.e., you start from nothing every time load is called).

2) commonword(list) - Takes a single list-type argument which contains string values. The function should operate as follows:
   a. If the list is empty or none of the words specified in the list occur in the text that has been loaded, the function should return None.
   b. Otherwise, the function should return the word contained in the list that occurs most often in the loaded text, or any one of the most common in the case of a tie.

3) commonletter(list) - Takes a single list-type argument which contains single-character strings (i.e., letters/characters). The function should operate as follows:
   a. If the list is empty or none of the letters specified in the list occur in the text that has been loaded, the function should return None.
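The selection rule shared by commonword and commonletter can be illustrated on in-memory text before any file handling is involved. The following is only a minimal sketch under assumptions: the helper name most_common_of, the sample sentence, and the use of collections.Counter are illustrative choices, not part of the assignment. Ties are resolved in favour of the candidate that appears first in the list, which the question permits ("any one of the most common").

```python
from collections import Counter


def most_common_of(candidates, counts):
    """Return the candidate with the highest count, or None if none occur.

    `candidates` is a list of strings; `counts` maps string -> occurrences.
    (Hypothetical helper, used here only to illustrate the rule.)
    """
    best, best_count = None, 0
    for item in candidates:
        n = counts.get(item, 0)
        # Strictly greater: the first of several tied candidates is kept
        if n > best_count:
            best, best_count = item, n
    return best


text = "the cat and the dog and the bird"
words = text.split(" ")                   # words are separated by single spaces
word_counts = Counter(words)              # word -> number of occurrences
letter_counts = Counter("".join(words))   # letter -> occurrences, spaces excluded

print(most_common_of(["cat", "the", "and"], word_counts))  # the
print(most_common_of(["x", "q"], letter_counts))           # None
print(most_common_of([], word_counts))                     # None
```

Starting from None rather than an empty string means the empty-list case and the no-match case both fall out of the same loop without a separate branch.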

Explanation / Answer

NOTE: Due to lack of time, I was not able to implement the function commonpair(). Sorry for that.

Code:
#!/usr/bin/env python3

# Program to analyze text files

# Main function which triggers all other functions
def main():
   # Asking user to enter the filename as input
   filename = input("Enter the filename: ")
   load(filename)
   #print("\nGiven file after transforming to dictionary is " + str(word_dict))
   #print("\nGiven file after transforming to char dictionary is " + str(char_dict))

   # sample list to test the commonword function
   # (named words, not list, so the built-in list type is not shadowed)
   words = ['how', 'dict', 'main', 'to', 'are']
   cword = commonword(words)
   print("\nMost common word is: " + str(cword))

   # sample list of characters to test the commonletter function
   letters = ['a', 'b', 'c', 'd', 'e', 'f']
   cletter = commonletter(letters)
   print("\nMost common letter is: " + str(cletter))

   # counting total words and unique word count
   word_cnt = countall()
   uniq_cnt = countunique()
   print("\nTotal word count is: " + str(word_cnt))
   print("\nUnique word count is: " + str(uniq_cnt))

# function to return word count
def countall():
   if len(word_list) > 0:
       return len(word_list)
   else:
       return None

# function to return unique word count
def countunique():
   if len(word_dict) > 0:
       return len(word_dict)
   else:
       return None

# function to return most common letter of the list from loaded text
def commonletter(letters):
   # None means no letter from the list has been found in the text yet
   mcomm_char = None
   mcomm_freq = 0
   for char in letters:
       # checking if char fetched from list is available in char_dict and is it the most common character
       if char in char_dict and char_dict[char] > mcomm_freq:
           mcomm_freq = char_dict[char]
           mcomm_char = char
   # stays None when the list is empty or none of its letters occur in the text
   return mcomm_char

# function to return most common word of the list from loaded text
def commonword(words):
   # None means no word from the list has been found in the text yet
   mcomm_word = None
   mcomm_freq = 0
   for word in words:
       # checking if word fetched from the list is available in word_dict and is it the most common word
       if word in word_dict and word_dict[word] > mcomm_freq:
           mcomm_freq = word_dict[word]
           mcomm_word = word
   # stays None when the list is empty or none of its words occur in the text
   return mcomm_word

# function to load contents from file into a word list and two count dictionaries
def load(filename):
   # (re)initializing word_dict, word_list, char_dict as these are used across the functions;
   # resetting them here discards anything stored from a previously loaded file
   global word_dict, word_list, char_dict
   word_dict = {}
   char_dict = {}
   word_list = []

   # opening the file in read mode; the with statement closes it automatically
   with open(filename, "r") as fp:
       # iterating through each line in file
       for line in fp:
           # split() with no arguments splits on whitespace and drops the trailing newline
           for word in line.split():
               # fetching each word and storing in list
               word_list.append(word)
               # counting the word in the dictionary of words
               word_dict[word] = word_dict.get(word, 0) + 1
               # fetching each character from word and counting it in the dictionary of characters
               for char in word:
                   char_dict[char] = char_dict.get(char, 0) + 1

if __name__ == '__main__':
   main()

Execution and output:
Unix Terminal> cat testfile
hello how are you
why are things
Unix Terminal> python analysis.py
Enter the filename: testfile

Most common word is: are

Most common letter is: e

Total word count is: 7

Unique word count is: 6
Unix Terminal>
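
The numbers in the run above can be cross-checked independently of analysis.py. The following is a rough sketch, assuming the two-line testfile shown in the transcript; it writes that text to a temporary file and recounts with collections.Counter, which is a substitute technique and not the approach used in the answer's code. The variable names are illustrative.

```python
import os
import tempfile
from collections import Counter

# Contents of the testfile shown in the transcript
sample = "hello how are you\nwhy are things\n"

# Write the sample to a temporary file so the check needs no external files
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    with open(path) as fp:
        words = fp.read().split()   # split on whitespace, newlines included
finally:
    os.remove(path)                 # clean up the temporary file

word_counts = Counter(words)
letter_counts = Counter("".join(words))

total_words = len(words)
unique_words = len(word_counts)

print(total_words)           # 7
print(unique_words)          # 6
print(word_counts["are"])    # 2
print(letter_counts["e"])    # 3
```

These values agree with the transcript: 'are' is the only word from the sample list that appears in the text, and 'e' occurs more often than any other letter from 'a' to 'f'.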
