
PYTHON HELP!! :( Create text analyzer that processes text and provides informati


Question

PYTHON HELP!! :(

Create text analyzer that processes text and
provides information about its word contents.
One thing it should be able to support is the ability to
create a report that shows a count of how many times each word
occurs in the text. The report should be sorted, with a
primary sort of word length, and a secondary ASCII sort.
The code should be production quality and written as if
it were part of a utility library.

Extra credit given for solutions that do not rely on Java Collections.
Example:
Input:
The quick brown fox jumped over the lazy brown dog’s back
Output:
1 The
1 fox
1 the
1 back
1 lazy
1 over
2 brown
1 dog’s
1 quick
1 jumped

Currently I have:

from collections import Counter

# Read the file (The quick brown fox jumped over the lazy brown dog’s back) and split it into words
with open("text.txt", "r") as file:
    wordcount = Counter(file.read().split())

# Print each word with its count
for item in wordcount.items():
    print("{} {}".format(*item))

#I don't know how to sort by word size, nor how to get rid of the weird symbols next to "dog's" . Please help me :(

Explanation / Answer

1. To sort by word size, you can do the following:

    a) First extract the length of each word and store it in some object. To get a word's length, use the len() function.

Ex.: len("Hello World")

This returns 11.

    b) Then create a list of word tuples (pairing each word with its length). For example:

word_tuples = [
    ('The', 3),
    ('quick', 5),
    ...
]
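Rather than typing the tuples by hand, the list can be built from the text itself. The following is a minimal sketch, assuming the text is already in a string; the variable name text is only illustrative, and dict.fromkeys() is used here simply to drop the repeated "brown" while keeping first-seen order.

```python
text = "The quick brown fox jumped over the lazy brown dog's back"

# dict.fromkeys() keeps one entry per distinct word, in first-seen order.
distinct_words = dict.fromkeys(text.split())

# Pair each distinct word with its length.
word_tuples = [(word, len(word)) for word in distinct_words]

print(word_tuples[:3])
```

Note that "The" and "the" count as different words here, which matches the example output in the question.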

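After this you can use the sorted() function to sort the list based on a key. For example, to sort by length only:

sorted(word_tuples, key=lambda pair: pair[1])

To get the secondary ASCII sort as well, use a tuple key of (length, word), so that words of equal length are ordered by their characters:

sorted(word_tuples, key=lambda pair: (pair[1], pair[0]))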
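Putting the counting and the two-level sort together, the whole report could look roughly like the sketch below. The function name word_report is an assumption made for illustration, and a plain dict is used for counting instead of collections.Counter, which is one possible reading of the "no collections" extra credit. Python compares strings by code point, which is the same as ASCII order for ASCII characters.

```python
def word_report(text):
    """Return report lines of the form "count word".

    Lines are sorted by word length first, then by code point order.
    """
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1

    # Tuple key: primary sort on length, secondary sort on the word itself.
    ordered = sorted(counts, key=lambda word: (len(word), word))
    return ["{} {}".format(counts[word], word) for word in ordered]


sample = "The quick brown fox jumped over the lazy brown dog’s back"
for line in word_report(sample):
    print(line)
```

Returning a list of lines rather than printing inside the function keeps it usable from other code, which fits the "utility library" requirement in the question.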

2. Getting rid of unwanted symbols: To remove leading and trailing whitespace, use the strip() method. To replace a symbol, use the replace() method. Both return a new string and leave the original unchanged, so assign the result. For example:

s = " Hello World "
s = s.strip()    # "Hello World"

s = "@How are you?@"
s = s.replace("@", "")    # "How are you?"

The first strips leading and trailing whitespace, while the second replaces each "@" symbol with "".
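The weird symbols next to "dog’s" are most likely the curly apostrophe (’, U+2019) being decoded with the wrong encoding when the file is read. This is a guess from the symptoms, not something confirmed in the question. A minimal sketch, assuming text.txt is saved as UTF-8; the helper names read_words and normalize_apostrophes are hypothetical:

```python
def read_words(path, encoding="utf-8"):
    """Read a text file and return its whitespace-separated words.

    Assumes the file is UTF-8 encoded; pass another encoding if it is not.
    """
    with open(path, "r", encoding=encoding) as handle:
        return handle.read().split()


def normalize_apostrophes(word):
    """Replace the curly apostrophe (U+2019) with a plain ASCII one."""
    return word.replace("\u2019", "'")


print(normalize_apostrophes("dog\u2019s"))
```

If the symbols still appear after setting the encoding, the file was probably saved in a different encoding, and that name should be passed instead.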