Parsing text file containing unique pattern using Python



How can I parse a text file whose lines contain the pattern "KEYWORD: Out:" and write the result to an output file using Python?

input.txt

DEBUG 2020-11:11:17.401 KEYWORD: Out:0xaaaf0000 In:0x80000000.1110ffff.
DEBUG 2020-11:11:17.401 KEYWORD: Out:0xaaaf00cc In:0x80000000.1110ffaa.

output.txt

0xaaaf0000:1110ffff
0xaaaf00cc:1110ffaa

Answer

You could use a regex:

import re 

txt='''
DEBUG 2020-11:11:17.401 KEYWORD: Out:0xaaaf0000 In:0x80000000.1110ffff.
DEBUG 2020-11:11:17.401 KEYWORD: Out:0xaaaf00cc In:0x80000000.1110ffaa.'''

pat=r'KEYWORD: Out:(0x[a-f0-9]+)[ \t]+In:0x[a-f0-9]+\.([a-f0-9]+)'

>>> print('\n'.join([m[0]+':'+m[1] for m in re.findall(pat, txt)]))
0xaaaf0000:1110ffff
0xaaaf00cc:1110ffaa

If you want to do this line-by-line from a file:

import re

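# [ \t]+ matches the spaces or tabs between the fields; \. matches the literal dot
pat=r'KEYWORD: Out:(0x[a-f0-9]+)[ \t]+In:0x[a-f0-9]+\.([a-f0-9]+)'

with open(ur_file) as f:  # ur_file is the path to your input file, e.g. 'input.txt'
    for line in f:
        m=re.search(pat, line)
        if m:
            # group(1) is the Out value, group(2) the part of In after the dot
            print(m.group(1)+':'+m.group(2))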
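The question also asks for the result to be written to an output file, while the code above only prints it. A minimal sketch of one way to do that follows; the helper names `extract_pairs` and `convert` are made up for illustration, and the file names are assumed to be the `input.txt` and `output.txt` from the question.

```python
import re

# Capture the Out value and the part of the In value after the dot.
# [ \t]+ allows spaces or tabs between the fields; \. is a literal dot.
PAT = re.compile(r'KEYWORD: Out:(0x[a-f0-9]+)[ \t]+In:0x[a-f0-9]+\.([a-f0-9]+)')


def extract_pairs(lines):
    """Yield 'out:suffix' for every line that matches PAT (hypothetical helper)."""
    for line in lines:
        m = PAT.search(line)
        if m:
            yield m.group(1) + ':' + m.group(2)


def convert(in_path, out_path):
    """Read in_path line by line and write one result per line to out_path."""
    with open(in_path) as src, open(out_path, 'w') as dst:
        for pair in extract_pairs(src):
            dst.write(pair + '\n')
```

Calling `convert('input.txt', 'output.txt')` should then produce the file the question asks for. Lines that do not match the pattern are skipped silently, which may or may not be what you want for malformed log lines.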


Source: stackoverflow