I have a file containing multiple words, and in some of them the same letter appears more than once. I've already learned how to catch these words.
Now I would like repeats of the letter "a" not to be counted by the script.
My file.txt:
    abac
    test
    testtest
    dog
    cat
    one
    doog
    helo
    hello
    abaa
    abba
My code:
    li = []
    for string in open("test.txt", 'r', encoding='utf-8'):
        count = 0
        for qq in range(0, len(string)):
            if count == 1:
                break
            for zz in range(qq + 1, len(string)):
                if string[qq] == string[zz]:
                    count = 1
                    if count == 1:
                        li.append(string.replace("\n", ""))
                        break
    print(li)
Result:
['test', 'testtest', 'doog', 'hello', 'abaa', 'abba']
I am trying to make it so that "a" is the only letter allowed to repeat in a word. If only "a" is repeated, the word should be skipped; if another letter is repeated (with or without "a"), the word should be extracted.
For example, I expect the word "abaa" not to appear in the result, because in this word only "a" is repeated and no other letter is. The word "abba" should still be extracted, because "b" is repeated as well.
Answer
If you don't want to catch a repeated "a", then just if it out!
    if string[qq] == string[zz] and string[qq] and string[qq] != "a":
        count = 1
    # the rest of the loop stays as before
    print(li)
But if you don’t mind, your program could be improved.
Firstly, the "and string[qq]" clause has no effect: string[qq] is always a one-character string, and any non-empty string evaluates to True.
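A quick way to convince yourself of this is the small check below. It is only an illustration; the sample word is taken from the question and the variable name all_truthy is made up for this sketch.

```python
# Indexing a string always yields a one-character string,
# and every non-empty string is truthy (even whitespace or a newline).
sample = "abba\n"
all_truthy = all(bool(sample[i]) for i in range(len(sample)))
print(all_truthy)

# Only the empty string is falsy, and indexing never produces it.
print(bool(""), bool("a"), bool("\n"))
```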
Secondly, your count (unless you plan to extend the program to allow a different number of repeats) could be a boolean:
    letter_repeated = False
    if (...):
        letter_repeated = True
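Putting the two suggestions together, the nested loops with a boolean flag might look roughly like the sketch below. The function name has_repeated_letter and the ignored parameter are my own, and the words are given as a list rather than read from the file so that the snippet runs on its own.

```python
def has_repeated_letter(word, ignored="a"):
    """Return True if some letter other than `ignored` occurs more than once."""
    letter_repeated = False
    for qq in range(len(word)):
        if word[qq] == ignored:
            continue  # repeats of the ignored letter do not count
        for zz in range(qq + 1, len(word)):
            if word[qq] == word[zz]:
                letter_repeated = True
                break
        if letter_repeated:
            break
    return letter_repeated


# Words from the question, one per entry instead of one per line of a file.
words = ["abac", "test", "testtest", "dog", "cat", "one",
         "doog", "helo", "hello", "abaa", "abba"]
li = [w for w in words if has_repeated_letter(w)]
print(li)  # ['test', 'testtest', 'doog', 'hello', 'abba']
```

Here "abaa" is skipped because only "a" repeats in it, while "abba" is kept because "b" repeats too.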
And as a bonus, Python has a Counter class (in the collections module) which does most of what you want:
    from collections import Counter

    li = []
    max_count = 1
    for string in open("test.txt", "r", encoding="utf-8"):
        c = Counter(string)
        # you can modify that counter by e.g. removing "a"
        if c.most_common(1)[0][1] > max_count:
            li.append(string.replace("\n", ""))
    print(li)
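As a sketch of what "removing a" from the counter could look like: the helper below drops the ignored letter from the Counter before checking the counts. The name repeats_besides and its parameters are hypothetical, and the words come from an in-memory list instead of the file so the example is self-contained.

```python
from collections import Counter


def repeats_besides(word, ignored="a", max_count=1):
    """Return True if any letter except `ignored` occurs more than max_count times."""
    counts = Counter(word)
    counts.pop(ignored, None)  # drop the ignored letter; no error if it is absent
    return any(n > max_count for n in counts.values())


# Words from the question, one per entry instead of one per line of a file.
words = ["abac", "test", "testtest", "dog", "cat", "one",
         "doog", "helo", "hello", "abaa", "abba"]
li = [w for w in words if repeats_besides(w)]
print(li)  # ['test', 'testtest', 'doog', 'hello', 'abba']
```

Using any() over the remaining counts also avoids indexing into most_common(), which would fail on a word made up only of the ignored letter once that letter has been removed.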