Structure Android LogCat Text File to Structured Pandas DF

I want to convert the lines of a LogCat text file into a structured pandas DataFrame, but I cannot work out how to approach it. Here's my basic pseudo-code:

dateTime = []
processID = []
threadID = []
priority = []
application = []
tag = []
text = []

logFile = "xxxxxx.log"

for line in logFile:
     split the string according to the basic structure
     dateTime = [0]
     processID = [1]
     threadID = [2]
     priority = [3]
     application = [4]
     tag = [5]
     text = [6]
     append each to the empty list above

write the lists to pandas dataframe & add column names

The problem is that I do not know how to define the delimiter for lines with this structure:

08-01 14:28:35.947 1320 1320 D wpa_xxxx: wlan1: skip–ssid

Answer

import re
import pandas as pd

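# One capturing group per column: timestamp, PID, TID, priority letter,
# then three whitespace-free tokens.
ROW_PATTERN = re.compile(r"""(\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (\d+) (\d+) ([A-Z]) (\S+) (\S+) (\S+)""")

# logFile is the path defined in the question.
with open(logFile) as f:
    s = pd.Series(f.readlines())

# Each capturing group becomes one column of the resulting DataFrame.
df = s.str.extract(ROW_PATTERN)
df.columns = ['dateTime', 'processID', 'threadID', 'priority', 'application', 'tag', 'text']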

This reads each line of logFile into a row of a Series, and the extract call then expands every capturing group of the regular expression into its own DataFrame column. It assumes that a timestamp such as 08-01 14:28:35.947 is the first value in each row and that the following values are separated by single spaces. A line that does not match the pattern produces a row of NaN values.
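As a rough, self-contained sketch of the same idea, the snippet below runs on the sample line from the question. It is a variation rather than the exact code above: it assumes fields may be padded with extra spaces (hence \s+) and that the message may contain spaces (hence the final (.*)). The second input line is made up purely to show how a non-matching line is dropped.

```python
import io

import pandas as pd

# In-memory stand-in for the log file: the sample line from the question
# plus one made-up line that does not follow the structure.
sample = io.StringIO(
    "08-01 14:28:35.947 1320 1320 D wpa_xxxx: wlan1: skip–ssid\n"
    "this line does not match\n"
)

COLUMNS = ["dateTime", "processID", "threadID", "priority",
           "application", "tag", "text"]

# \s+ tolerates padded columns; (.*) keeps the rest of the line as the text.
ROW_PATTERN = (
    r"(\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+(\d+)\s+(\d+)\s+([A-Z])"
    r"\s+(\S+)\s+(\S+)\s+(.*)"
)

lines = pd.Series(sample.read().splitlines())
df = lines.str.extract(ROW_PATTERN)
df.columns = COLUMNS

# Non-matching lines come back as all-NaN rows; drop them.
df = df.dropna(how="all").reset_index(drop=True)

# The extracted values are strings; convert the numeric IDs.
df[["processID", "threadID"]] = df[["processID", "threadID"]].astype(int)

print(df)
```

Whether the fifth and sixth tokens really correspond to "application" and "tag" depends on the logcat output format in use, so the column names here simply follow the question.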