Remove text lines and strip lines with a condition in Python

I have a text file in this format:

000000.png 712,143,810,307,0
000001.png 599,156,629,189,3 387,181,423,203,1 676,163,688,193,5
000002.png 657,190,700,223,1
000003.png 614,181,727,284,1
000004.png 280,185,344,215,1 365,184,406,205,1

Each line holds an image name followed by one or more blocks of the form [number1,number2,number3,number4,number5]. I want to strip every block whose last number is not 1 or 5, and then remove any line that has no blocks left.

The above text file should look like this in the end:

000001.png 387,181,423,203,1 676,163,688,193,5
000002.png 657,190,700,223,1
000003.png 614,181,727,284,1
000004.png 280,185,344,215,1 365,184,406,205,1

My code:

import os

with open("data.txt", "r") as input:
    with open("newdata.txt", "w") as output:
        # iterate all lines from file
        for line in input:
            # if substring contain in a line then don't write it
            if ",0" or ",2" or ",3" or ",4" or ",6" not in line.strip("\n"):
                output.write(line)

I tried something like this, and it obviously didn’t work.

Answer

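No regex is needed. Your condition is always true: Python reads ",0" or ",2" or ... as a chain of "or" operands, and the first one, the non-empty string ",0", is already truthy, so every line gets written. Instead, split each line into groups and check the last number of each group: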

with open("data.txt", "r") as input:        # Read all data lines.
    data = input.readlines()
with open("newdata.txt", "w") as output:    # Create output file.
    for line in data:                       # Iterate over data lines.
        line_elements = line.split()        # Split line by spaces.
        line_updated = [line_elements[0]]   # Initialize fixed line (without undesired patterns) with image's name.
        for i in line_elements[1:]:         # Iterate over groups of numbers in current line.
            tmp = i.split(',')              # Split current group by commas.
            if len(tmp) == 5 and (tmp[-1] == '1' or tmp[-1] == '5'):
                line_updated.append(i)      # If the pattern is Ok, append group to fixed line.
        if len(line_updated) > 1:           # If the fixed line is valid, write it to output file.
            output.write(f"{' '.join(line_updated)}\n")
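
If you want to check the filtering logic without touching any files, the same idea can be pulled into a small function. This is only a sketch: the names filter_line, keep and sample are made up for illustration, and it assumes every group is five comma-separated values with the class as the last one. Unlike the loop above, it also skips blank lines instead of failing on them.

```python
def filter_line(line, keep=("1", "5")):
    """Return the line with only the allowed groups, or None if none remain.

    `filter_line` and `keep` are hypothetical names used for this sketch.
    """
    parts = line.split()
    if not parts:                      # Blank line: nothing to keep.
        return None
    name, groups = parts[0], parts[1:]
    kept = []
    for group in groups:
        fields = group.split(",")
        # Keep a group only if it has 5 fields and its last field is allowed.
        if len(fields) == 5 and fields[-1] in keep:
            kept.append(group)
    if not kept:                       # No valid group left: drop the line.
        return None
    return " ".join([name] + kept)


# Sample data copied from the question.
sample = [
    "000000.png 712,143,810,307,0",
    "000001.png 599,156,629,189,3 387,181,423,203,1 676,163,688,193,5",
    "000002.png 657,190,700,223,1",
    "000003.png 614,181,727,284,1",
    "000004.png 280,185,344,215,1 365,184,406,205,1",
]

filtered = [out for out in map(filter_line, sample) if out is not None]
for out in filtered:
    print(out)
```

To use it on the real files, call filter_line on each line read from data.txt and write the result plus a newline to newdata.txt whenever it is not None.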
User contributions licensed under: CC BY-SA