I want to take every word from a text file, and count the word frequency in a dictionary.
Example: 'this is the textfile, and it is used to take words and count'
d = {'this': 1, 'is': 2, 'the': 1, ...}
I haven’t gotten very far, and I can’t see how to complete it. My code so far:
    import sys

    argv = sys.argv[1]
    data = open(argv)
    words = data.read()
    data.close()

    wordfreq = {}
    for i in words:
        # there should be a counter and somehow it must fill the dict.
Answer
If you don’t want to use collections.Counter, you can write your own function:
    import sys

    filename = sys.argv[1]

    # Read the whole file and split it into words on whitespace.
    with open(filename) as fp:
        words = fp.read().split()

    # Characters to strip from both ends of each word (and so on; extend as needed).
    unwanted_chars = ".,-_"

    wordfreq = {}
    for raw_word in words:
        word = raw_word.strip(unwanted_chars)
        if word not in wordfreq:
            wordfreq[word] = 0
        wordfreq[word] += 1
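For comparison, here is a minimal sketch of the collections.Counter route mentioned above. The function name count_words, the constant UNWANTED_CHARS, and the choice to skip tokens that are pure punctuation are assumptions made for illustration, not part of the original answer.

```python
from collections import Counter

# Assumed for illustration: a small set of characters to strip from word ends.
UNWANTED_CHARS = ".,-_"


def count_words(text):
    """Return a Counter mapping each stripped word to its frequency."""
    stripped = (raw.strip(UNWANTED_CHARS) for raw in text.split())
    # Skip tokens that consisted only of unwanted characters.
    return Counter(word for word in stripped if word)


sample = "this is the textfile, and it is used to take words and count"
wordfreq = count_words(sample)
print(wordfreq["is"], wordfreq["and"], wordfreq["textfile"])
```

A Counter behaves like a dict, so wordfreq["is"] works as usual, and wordfreq.most_common(3) returns the three most frequent words with their counts.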
For finer control over what counts as a word, look at regular expressions.
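A rough sketch of the regular-expression approach might look like the following. The pattern is an assumption: it treats any run of ASCII letters, digits, or straight apostrophes as a word, which may be too narrow or too broad for a given text. The names WORD_PATTERN and count_words_regex are hypothetical.

```python
import re

# Assumed definition of a "word": a run of ASCII letters, digits, or apostrophes.
WORD_PATTERN = re.compile(r"[A-Za-z0-9']+")


def count_words_regex(text):
    """Count words found by WORD_PATTERN in a plain dict."""
    wordfreq = {}
    for word in WORD_PATTERN.findall(text):
        # dict.get supplies 0 the first time a word is seen.
        wordfreq[word] = wordfreq.get(word, 0) + 1
    return wordfreq


sample = "this is the textfile, and it is used to take words and count"
freq = count_words_regex(sample)
print(freq["is"], freq["textfile"])
```

Unlike str.strip, which only trims the ends of each whitespace-separated token, the pattern also splits on punctuation inside a token, so "well-known" becomes two words under this assumption.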