I am aware that there are many solutions to this kind of question. However, none of them seems to have helped with my case. This is the code I’m referring to:
from nltk.book import text4

def length_frequency(length):
    '''
    Parameter: length as an integer
    '''
    # finds words in given length
    counter = 0
    word_in_length = {}
    for word in text4:
        if len(word) == length and word not in word_in_length:
            word_in_length[word] = text4.count(word)
    for key in word_in_length:
        if word_in_length[key] > counter:
            max = word_in_length[key]
            counter = max
            max_word = key
    print(f'The most frequent word with {length} characters is "{max_word}".\nIt occurs {counter} times.')

length_frequency(7)
Output:
The most frequent word with 7 characters is "country".
It occurs 312 times.
When I run this code in PyCharm, it works without problems. However, when I run it from the command line, it raises this error:
Traceback (most recent call last):
  File "program5.py", line 67, in <module>
    main()
  File "program5.py", line 60, in main
    length_frequency(input_length)
  File "program5.py", line 35, in length_frequency
    print(f'The most frequent word with {length} characters is "{max_word[0]}".\nIt occurs {counter} times.')
UnboundLocalError: local variable 'max_word' referenced before assignment
Of course, for the command line call I import sys and use sys.argv as an argument for length. I have tried adding global max_word at the beginning of the function, but it does not work. I have not assigned any variable like max_word before this function.
Answer
Add some error checking to the function to help you debug:
def length_frequency(length: int) -> None:
    '''
    Parameter: length as an integer
    '''
    assert isinstance(length, int), f"{repr(length)} is not an int!"
    word_counts = {word: text4.count(word) for word in set(text4) if len(word) == length}
    assert word_counts, f"No words in corpus with length {length}!"
    max_word = max(word_counts.keys(), key=word_counts.get)
    print(f"The most frequent word with {length} characters is {max_word}")
(I simplified the implementation a bit to make it easier for me to follow; I’m pretty sure it does the same thing with less confusion, apart from no longer printing the count.)
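For context on what the first assert is meant to catch: a likely cause, assuming the script passes sys.argv[1] through unchanged, is that length arrives as a str. Comparing len(word) with a string is always False, so the loop never assigns max_word. Here is a minimal sketch of that failure, using a small hypothetical word list in place of text4 and a hypothetical function name:

```python
# Hypothetical stand-in for text4 so the sketch runs without NLTK data.
words = ["country", "people", "country", "freedom", "nation"]


def most_frequent_with_length(length):
    """Mirror the question's loop structure on the stand-in word list."""
    counter = 0
    counts = {}
    for word in words:
        # With a str such as "7", int == str is always False: nothing matches.
        if len(word) == length and word not in counts:
            counts[word] = words.count(word)
    for key in counts:
        if counts[key] > counter:
            counter = counts[key]
            max_word = key
    # If counts is empty, max_word was never assigned and this line raises.
    return max_word, counter


print(most_frequent_with_length(7))  # int argument: works

try:
    most_frequent_with_length("7")  # str argument, as it would come from sys.argv
except UnboundLocalError as exc:
    print(f"UnboundLocalError: {exc}")
```

If this is what happens in your script, it would also explain why global max_word did not help: the name is never assigned anywhere, in any scope.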
Note that adding type annotations also means that if you had a line of code like, say:
length_frequency(sys.argv[1])
and you were to run mypy, it would tell you about the error, no assert required:
test.py:19: error: Argument 1 to "length_frequency" has incompatible type "str"; expected "int"
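The mypy error above points at a str being passed where an int is expected. Here is a minimal sketch of converting the argument before the call, assuming the length is the first command-line argument. parse_length is a hypothetical helper, not part of the original program:

```python
from typing import Sequence


def parse_length(argv: Sequence[str]) -> int:
    """Convert the first command-line argument to a positive int.

    Hypothetical helper; argv is expected to look like sys.argv.
    """
    if len(argv) < 2:
        raise SystemExit(f"usage: {argv[0]} LENGTH")
    try:
        length = int(argv[1])
    except ValueError:
        # Exit with a readable message instead of a traceback.
        raise SystemExit(f"{argv[1]!r} is not an integer") from None
    if length < 1:
        raise SystemExit("LENGTH must be at least 1")
    return length


# Example with an explicit argument list instead of the real sys.argv.
print(parse_length(["program5.py", "7"]))
```

In the script itself this would presumably be used as length_frequency(parse_length(sys.argv)), so the function only ever receives an int.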