
How can I delete repeated numbers from a file?

I need to delete numbers that are repeated more than once, and I already know which numbers are repeated. Should I keep two files: one with the actual data ("data.txt") and another with the repeated numbers ("columns.txt")? The second file is only there to look up the repeated numbers quickly, because the data file is too large to edit by hand. How can I write a loop that searches one file for the numbers listed in the other and, when it finds one, deletes the repeats but keeps a single copy? The code I tried looks like this:

infile = "/home/user/Desktop/data.txt"
outfile = "/home/user/Desktop/new.txt"
numbers="/home/user/Desktop/columns.txt"
with open(infile) as fin, open(outfile, "w+") as ft:
    for line in fin:
        for number in numbers:
                line = line.replace(number, "")
        ft.write(line)

But there is still a problem: the loop deletes every occurrence of the repeated numbers, and I need to keep one of them rather than delete them all.

data.txt 

53.74270106
60.45828786
50.08396881
119.2588545
119.2588545
119.2588545
119.2588545
119.2588545
119.2588545
119.2588545
8.391147123
3.998351513

It should look like this:

53.74270106
60.45828786
50.08396881
119.2588545
8.391147123
3.998351513

I need to delete a number only when it repeats consecutively.


Answer

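If you are on Python 3.7+ (or CPython 3.6), this solution will work for you: dict keys keep their insertion order, so dict.fromkeys drops duplicates while preserving the order of first appearance. You don't need a separate list of the repeated numbers; Python finds them for you. Note that this removes every repeated value in the file, not only consecutive repeats.

On older versions, change dict.fromkeys to collections.OrderedDict.fromkeys.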

with open('data.txt') as file:
    lines = dict.fromkeys(line.strip() for line in file)

with open('out.txt', 'w') as file:
    file.write('\n'.join(lines))

Output

53.74270106
60.45828786
50.08396881
119.2588545
8.391147123
3.998351513
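
The question asks for a number to be dropped only when it repeats consecutively, while dict.fromkeys removes a repeated value wherever it appears. If that distinction matters for your data, a minimal sketch using itertools.groupby could look like the following. The function names drop_consecutive_duplicates and dedupe_file are made up for illustration, and the file paths are whatever you pass in. It streams the input line by line, so a very large file does not have to fit in memory.

```python
from itertools import groupby


def drop_consecutive_duplicates(lines):
    """Yield each stripped line once per run of identical neighbouring lines."""
    # groupby starts a new group whenever the value changes, so each
    # consecutive run of equal lines yields exactly one key.
    for value, _run in groupby(line.strip() for line in lines):
        yield value


def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, collapsing consecutive repeated lines."""
    # Hypothetical helper: iterating the file object reads one line at a time.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for value in drop_consecutive_duplicates(src):
            dst.write(value + "\n")


if __name__ == "__main__":
    sample = ["53.7\n", "119.2\n", "119.2\n", "8.3\n", "119.2\n"]
    # The last 119.2 is kept because it is not adjacent to the earlier run.
    print(list(drop_consecutive_duplicates(sample)))
```

For the data.txt shown in the question both approaches give the same result, since the repeated value only occurs in one run. They differ when the same number shows up again later in the file: groupby keeps the later occurrence, dict.fromkeys drops it.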
User contributions licensed under: CC BY-SA