I'm just learning Python, and I'm having some problems reading a .txt file that I created. My objective: I have a text file containing lists of strings. I'm trying to read it, process it, and save every word into a new list.
example2.txt file:
[one, two, THREE, one, two, ten, eight,cat, dog, bird, fish] [Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]
My output:
['one, two, THREE, one, two, ten, eight, cat, dog, bird, fish]\n']
['Alonso, Alicia, Bob, Lynn], [red, blue, green, pink, cyan']
What I was expecting was something like this:
['one','two','THREE','one','two','ten','eight','cat','dog','bird','fish','Alonso','Alicia','Bob','Lynn','red','blue','green','pink','cyan']
My code in Python. This is what I tried; you can ignore the comments:
import re

# Creating a variable to store later the contents of the file
list_String = []

# Reading the file
file = open("D:direxample2", "r")
for line in file:
    print(re.split('^[\s].', line.strip(' ][')))
    #list_String.append(line.strip('[]').strip("\n").split(","))
    #list_String = re.split(r'[^St.]', line)
    #print(line.split(r"S"))
    #print(line)
#print(list_String)
file.close()
I have also been reading the documentation for the re module, but I don't know if it is just me or if it is hard to understand.
I tried experimenting with what I read, but I’m still not getting what I wanted.
I even tried this:
print(line.strip('][').strip('\n').strip(']').split(","))
Output:
['one', ' two', ' THREE', ' one', ' two', ' ten', ' eight', 'cat', ' dog', ' bird', ' fish']
['Alonso', ' Alicia', ' Bob', ' Lynn] ', ' [red', ' blue', ' green', ' pink', ' cyan']
As you can see, it kind of works. However, between Lynn and red, the brackets and the comma are not removed.
Thank you for the time and help
Answer
You might find that calling re.findall with the pattern \w+ works here:
import re

inp = "[one, two, THREE, one, two, ten, eight,cat, dog, bird, fish] [Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]"
words = re.findall(r'\w+', inp)
print(words)
This prints:
['one', 'two', 'THREE', 'one', 'two', 'ten', 'eight', 'cat', 'dog', 'bird', 'fish', 'Alonso', 'Alicia', 'Bob', 'Lynn', 'red', 'blue', 'green', 'pink', 'cyan']
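To apply the same idea to the file itself, something like the following minimal sketch should work. It assumes the file holds the text from the question on two lines and is UTF-8 encoded. The helper name read_words and the temporary file are made up for illustration, so swap in your real path. It also shows why the strip attempt left "] , [" behind: str.strip only removes characters from the two ends of a string, never from the middle.

```python
import re
import tempfile
from pathlib import Path

# Sample contents from the question; the two-line layout is an assumption.
SAMPLE = (
    "[one, two, THREE, one, two, ten, eight,cat, dog, bird, fish]\n"
    "[Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]\n"
)


def read_words(path):
    """Return every run of word characters found in the file at `path`."""
    # \w+ matches letters, digits and underscores, so brackets, commas,
    # spaces and newlines all act as separators.
    text = Path(path).read_text(encoding="utf-8")
    return re.findall(r"\w+", text)


# Write the sample to a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "example2.txt"
    sample_path.write_text(SAMPLE, encoding="utf-8")
    list_string = read_words(sample_path)

print(list_string)
print(len(list_string))  # 20

# str.strip only trims the ends, so the middle of the string is untouched.
stripped = "Lynn] , [red".strip("][")
print(stripped)  # Lynn] , [red
```

Reading the whole file at once avoids handling each line separately. One caveat: because \w+ treats anything that is not a letter, digit or underscore as a separator, entries containing hyphens, apostrophes or spaces would be split into several items.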