I am trying to reformat a txt file which looks like this:
byr:1983 iyr:2017
pid:796082981 cid:129 eyr:2030
ecl:oth hgt:182cm

iyr:2019
cid:314
eyr:2039 hcl:#cfa07d hgt:171cm ecl:#0180ce byr:2006 pid:8204115568

byr:1991 eyr:2022 hcl:#341e13 iyr:2016 pid:729933757 hgt:167cm ecl:gry

hcl:231d64 cid:124 ecl:gmt eyr:2039
hgt:189in
pid:#9c3ea1
and so on (1000+ lines), into this structure:
byr:value
iyr:value
eyr:value
hgt:value
hcl:value
ecl:value
pid:value
cid:value

byr:value
iyr:value
eyr:value
hgt:value
hcl:value
ecl:value
pid:value
cid:value
The order of byr, iyr, etc. doesn't matter, but every “set” of key:value pairs has to be separated by a blank line. My main problem is writing a piece of code that reformats the file properly when there is more than one key:value element on a line. I managed to make some progress, but it is still not as it should be. The following code:
result_file = open('testresult.txt', 'w')
#list_of_lines = [] testing purpose
with open('input.txt', 'r') as f:
    for line in f:
        if line == "\n":
            #list_of_lines.append('\n') testing
            result_file.writelines('\n')
        else:
            for i in line.split(' '):
                if i[-1] == "\n":
                    result_file.write(i)
                else:
                    result_file.write(i + '\n')
                #print(i) testing purpose
produces the result below:
byr:1983
iyr:2017
pid:796082981
cid:129
eyr:2030
ecl:oth
hgt:182cm
iyr:2019
cid:314
eyr:2039
hcl:#cfa07d
hgt:171cm
ecl:#0180ce
byr:2006
pid:8204115568
byr:1991
eyr:2022
hcl:#341e13
iyr:2016
pid:729933757
hgt:167cm
ecl:gry
and as you can see it doesn't work properly. For example, there should be no blank line between the first occurrence of byr and the first occurrence of hgt, and so on. It seemed to me that the last if statement
if i[-1] == "\n":
    result_file.write(i)
else:
    result_file.write(i + '\n')
would protect me from such a situation, but I don't understand why it doesn't behave as I predicted. Please help. Thanks in advance <3
Answer
Try this:
# Read every line of the input file.
with open("file.txt", "r") as f:
    lines = f.readlines()
print(lines)

# Split each line on spaces so every key:value pair becomes its own item.
splited_lines = []
for line in lines:
    splited_lines.extend(line.split(" "))
print("splitted_lines")
print(splited_lines)

# Notice that the last item of each line in splited_lines still ends with '\n'.
# That might be causing your more-than-one-newline problem,
# so let's remove it.
cleaned_lines = [splited.strip("\n") for splited in splited_lines]
print("Removed \\n")
print(cleaned_lines)

# Write one item per line; an empty item (from a blank input line) becomes a blank line.
with open("output.txt", "w") as f:
    for line in cleaned_lines:
        f.write(line + "\n")
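As a side illustration of where the newline-bearing items come from, it may help to compare split(" ") with split() called without arguments. The two sample strings below are made up for this sketch; in particular, the trailing space in the second one is only an assumption about what the real input might contain.

```python
# Sample lines, made up for illustration (not read from the real input file).
plain = "byr:1983 iyr:2017\n"
trailing_space = "ecl:oth hgt:182cm \n"  # assumed: a space before the newline

# split(" ") cuts only at single spaces, so the newline stays in the last item.
print(plain.split(" "))           # ['byr:1983', 'iyr:2017\n']
# With a trailing space, the last item is a bare newline.
print(trailing_space.split(" "))  # ['ecl:oth', 'hgt:182cm', '\n']

# split() with no arguments cuts at any run of whitespace and drops empty items.
print(plain.split())              # ['byr:1983', 'iyr:2017']
print(trailing_space.split())     # ['ecl:oth', 'hgt:182cm']
```

If the real file does have trailing spaces, a bare '\n' item like the one shown would be written out as an extra blank line, which could explain blank lines appearing inside a record. That is a guess about the input, not something the question confirms.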
With this in file.txt:
byr:1983 iyr:2017
pid:796082981 cid:129 eyr:2030
ecl:oth hgt:182cm
iyr:2019
cid:314
eyr:2039 hcl:#cfa07d hgt:171cm ecl:#0180ce byr:2006 pid:8204115568
byr:1991 eyr:2022 hcl:#341e13 iyr:2016 pid:729933757 hgt:167cm ecl:gry
hcl:231d64 cid:124 ecl:gmt eyr:2039
hgt:189in
pid:#9c3ea1
Running the script with that file.txt gives me this in output.txt:
byr:1983
iyr:2017
pid:796082981
cid:129
eyr:2030
ecl:oth
hgt:182cm
iyr:2019
cid:314
eyr:2039
hcl:#cfa07d
hgt:171cm
ecl:#0180ce
byr:2006
pid:8204115568
byr:1991
eyr:2022
hcl:#341e13
iyr:2016
pid:729933757
hgt:167cm
ecl:gry
hcl:231d64
cid:124
ecl:gmt
eyr:2039
hgt:189in
pid:#9c3ea1
Hope this is what you needed.
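If stray whitespace in the input turns out to be the culprit, a variant that groups the pairs record by record may be more robust. The following is only a sketch under that assumption: the function name reformat and the sample string are hypothetical and not part of the original code. It treats any blank (or whitespace-only) line as a record separator and uses split() without arguments.

```python
def reformat(text):
    """Put every key:value pair on its own line, with one blank line between records."""
    records = []
    current = []
    for line in text.splitlines():
        fields = line.split()      # tolerates repeated or trailing whitespace
        if fields:
            current.extend(fields)
        elif current:              # a blank line closes the current record
            records.append(current)
            current = []
    if current:                    # the last record may not be followed by a blank line
        records.append(current)
    return "\n\n".join("\n".join(record) for record in records) + "\n"


# Hypothetical sample with a trailing space on the second line.
sample = "byr:1983 iyr:2017\npid:796082981 cid:129 eyr:2030 \n\niyr:2019\ncid:314\n"
print(reformat(sample), end="")
```

To use it on files, one option is to read the whole input with f.read(), pass the string to reformat, and write the returned string to the output file.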