I started with this data in a text file, data.txt:
text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989
And I changed it into a list like this:
['text:', '8473', 'second:', 'second', 'text', '3rd:', '23-54-65-87', 'txt:', '583', 'sec:', 'sec', 'text', '3)', '5343-436654-98989']
I then removed every item containing a colon (or a closing parenthesis). The next step in the program below manually merges ‘second’ with ‘text’ and then ‘sec’ with ‘text’. That won't work for a file with an unknown number of records in the above format, so I want to do this in a loop that produces the following result (note that ‘second text’ and ‘sec text’ are now one item each):
['8473', 'second text', '23-54-65-87', '583', 'sec text', '5343-436654-98989']
I can only find ways to merge every pair of items, though. I can't find a way to merge an item with the next one only once every 3 items, which is what I want.
Here is the program so far:
file = 'data.txt'
corrected = []
one = []
two = []
three = []
full = [one, two, three]

with open(file, 'r') as f:
    contents = f.read()

list = contents.split()
print(list)

for item in list:
    if ":" not in item:
        if ")" not in item:
            corrected.append(item)

# --- manual merge (the part I want to replace with a loop) ---
corrected[1] = corrected[1] + ' ' + corrected[2]
del corrected[2]
corrected[4] = corrected[4] + ' ' + corrected[5]
del corrected[5]
print(f"{corrected}\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n")
# --- end of manual merge ---

for item in corrected[::3]:
    one.append(item)
for item in corrected[1::3]:
    two.append(item)
for item in corrected[2::3]:
    three.append(item)

index = 1
for item in full:
    print(f"{index}:{item}")
    index += 1

Current and desired resulting output:
1:['8473', '583']
2:['second text', 'sec text']
3:['23-54-65-87', '5343-436654-98989']
Answer
Sounds to me like you just want to discard the first word of every row:
text = '''text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989'''

# Split each non-empty row once on whitespace and keep everything after the first word.
result = [row.split(maxsplit=1)[1] for row in text.split('\n') if row]
print(result)
# ['8473', 'second text', '23-54-65-87', '583', 'sec text', '5343-436654-98989']
If the text is read from a file, then the split on '\n' is implicit, because iterating over a file yields one row at a time:
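with open('data.txt', 'r') as f:
    # Rows read from a file keep their trailing newline, so strip each row first;
    # this also lets blank lines be skipped instead of raising an IndexError.
    result = [row.strip().split(maxsplit=1)[1] for row in f if row.strip()]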
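To get from there to the three grouped lists in the question, something along these lines might work. This is only a sketch, assuming every record is exactly three rows and every row starts with a single label word; `parse_rows` and `group_columns` are hypothetical helper names, not anything from the original program.

```python
def parse_rows(lines):
    """Drop the leading label from each row and keep the rest of the row."""
    values = []
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2:  # skip blank rows and rows with a label but no value
            values.append(parts[1])
    return values


def group_columns(values, width=3):
    """Collect every width-th value into its own list (assumes fixed-size records)."""
    return [values[i::width] for i in range(width)]


# Sample data copied from the question; a file object could be passed to parse_rows instead.
sample = """text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989"""

values = parse_rows(sample.splitlines())
columns = group_columns(values)
for index, column in enumerate(columns, start=1):
    print(f"{index}:{column}")
```

Because each row is split only once, multi-word values such as 'second text' never get broken apart, so no merge step is needed afterwards.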