I first read this data from a text file:
text in data.txt:
text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989
And I changed it into a list like this:
['text:', '8473', 'second:', 'second', 'text', '3rd:', '23-54-65-87', 'txt:', '583', 'sec:', 'sec', 'text', '3)', '5343-436654-98989']
I then removed every item containing a colon or a closing parenthesis. My next step in the program below is to manually merge ‘second’ and ‘text’, and then ‘sec’ and ‘text’. This won't work in a program with an undetermined amount of data in the above format, so I want to do this in a loop that produces the following result (note that ‘second text’ and ‘sec text’ are now one item each):
['8473', 'second text', '23-54-65-87', '583', 'sec text', '5343-436654-98989']
However, I can only find ways to merge every pair of items; I can’t find a way to merge an item with the next one once every three items, which is what I want.
Here is the program so far:
file = 'data.txt'
corrected = []
one = []
two = []
three = []
full = [one, two, three]
with open(file, 'r') as f:
    contents = f.read()
list = contents.split()
print(list)
for item in list:
    if ":" not in item:
        if ")" not in item:
            corrected.append(item)
# manual merge: this is the part I want to replace with a loop
corrected[1] = corrected[1] + ' ' + corrected[2]
del corrected[2]
corrected[4] = corrected[4] + ' ' + corrected[5]
del corrected[5]
print(f"{corrected}nnnnnnnnnnnnnnnnnnnnn")
for item in corrected[::3]:
    one.append(item)
for item in corrected[1::3]:
    two.append(item)
for item in corrected[2::3]:
    three.append(item)
index = 1
for item in full:
    print(f"{index}:{item}")
    index += 1
Current and desired output:
1:['8473', '583']
2:['second text', 'sec text']
3:['23-54-65-87', '5343-436654-98989']
Answer
Sounds to me like you just want to discard the first word of every row:
text='''text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989'''
result = [row.split(maxsplit=1)[1] for row in text.split('\n') if row]
print(result)
# ['8473', 'second text', '23-54-65-87', '583', 'sec text', '5343-436654-98989']
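To illustrate why this works: with maxsplit=1, split() cuts only at the first run of whitespace, so everything after the label stays together as one item. A minimal sketch using one row from the sample data (the names label and value are just for illustration):

```python
row = "second: second text"

# Split once, at the first whitespace: the label goes left, the rest stays whole.
label, value = row.split(maxsplit=1)

print(label)  # second:
print(value)  # second text
```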
If the text is read from a file, then the split on '\n' is implicit, because iterating over a file object yields one line at a time:
with open('data.txt', 'r') as f:
    # each row keeps its trailing newline, so strip it; also skip blank rows
    result = [row.split(maxsplit=1)[1].strip() for row in f if row.strip()]
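From there, the three lists in your desired output can be built with slicing instead of three separate loops. This is only a sketch, and it assumes every record is exactly three rows long, as in your sample. The sample text is inlined here so the snippet runs on its own, and the name values is my own choice:

```python
# Sample rows from the question; in practice these would be read from data.txt.
text = '''text: 8473
second: second text
3rd: 23-54-65-87
txt: 583
sec: sec text
3) 5343-436654-98989'''

# Drop the first word (the label) of every non-blank row.
values = [row.split(maxsplit=1)[1].strip()
          for row in text.splitlines() if row.strip()]

# Assumption: records are exactly three rows each, so every third value
# belongs to the same group.
full = [values[i::3] for i in range(3)]

for index, group in enumerate(full, start=1):
    print(f"{index}:{group}")
# 1:['8473', '583']
# 2:['second text', 'sec text']
# 3:['23-54-65-87', '5343-436654-98989']
```

If a record can have a different number of rows, the step of 3 would need to change accordingly.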