How can I parse just a section of a big log based on datetime in Python?

I want to parse only a section of a given log, starting at a start_time and ending at an end_time. The timestamps have the format "[2021-09-14 21:56:01.768]". For example, I need to start at the "[2021-09-14 21:56:01.768]" entry, end at the "[2021-09-14 21:58:56.608]" entry, and parse only the content between the two. I have already written the code that parses the content, but I don't understand how to select this time window from the log.

Sample log:

[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x00 CNTID=00 NSID=0001 FID=00^M
[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8018] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x00 CNTID=00 NSID=0001 FID=00^M
[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8019] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x01 CNTID=00 NSID=0000 FID=00^M
[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:801A] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M

Answer

The function below takes a start time, an end time, and the log text as a single string. It returns the entries whose timestamps fall strictly between the two times, with the timestamp prefix stripped.

from datetime import datetime
import re

def get_interval_logs(start_time, end_time, logs):
    processed_logs = []
    # Group 1: the timestamp inside the brackets; group 2: the rest of the line.
    pattern = re.compile(r'\[([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3})\](.+)')
    for time, content in pattern.findall(logs):
        log_time = datetime.strptime(time, "%Y-%m-%d %H:%M:%S.%f")
        # Both bounds are exclusive; use <= to include them.
        if start_time < log_time < end_time:
            processed_logs.append(content)
    return "\n".join(processed_logs)

logs="""[2021-11-19 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x00 CNTID=00 NSID=0001 FID=00^M
[2021-11-20 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8018] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
[2021-11-21 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x00 CNTID=00 NSID=0001 FID=00^M
[2021-11-22 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8019] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
[2021-11-23 11:27:23.169] (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x01 CNTID=00 NSID=0000 FID=00^M
[2021-11-24 11:27:23.169] (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:801A] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M"""

start_time = datetime(2021,11,20)
end_time = datetime(2021,11,23)
print(get_interval_logs(start_time, end_time, logs))

It will print out:

 (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8018] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
 (Info)    (001473039)(2:0): AdminCmdIdentify CNS=0x00 CNTID=00 NSID=0001 FID=00^M
 (Info)    (001473039)(2:0): Host_TransferAdminData: [FWCMD_A:8019] [HWCMD_A:8800] pBuffer:0x7FF02000 xferCount:0x1000 autoFreeBuffer:1 handlerFptr:0x0 direction:0 ^M
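
The bounds in the example are built by hand with datetime(...). Since the question gives them as bracketed strings such as "[2021-09-14 21:56:01.768]", a small helper can convert those strings directly. This is only a minimal sketch: the name parse_bracketed_timestamp is hypothetical, and it assumes the timestamps always have exactly the form shown in the question.

```python
from datetime import datetime

def parse_bracketed_timestamp(text):
    """Convert '[YYYY-MM-DD HH:MM:SS.mmm]' into a datetime (hypothetical helper)."""
    # The brackets are literal characters in the format string.
    # %f accepts one to six fractional digits, so '.768' is read as 768 ms.
    return datetime.strptime(text.strip(), "[%Y-%m-%d %H:%M:%S.%f]")

start_time = parse_bracketed_timestamp("[2021-09-14 21:56:01.768]")
end_time = parse_bracketed_timestamp("[2021-09-14 21:58:56.608]")
print(start_time)  # 2021-09-14 21:56:01.768000
```

The resulting start_time and end_time can be passed to get_interval_logs unchanged.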
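
For a genuinely large log, loading the whole file into one string may be unnecessary. The following is a sketch of a streaming variant, not part of the original answer. It assumes every entry begins with a bracketed timestamp at the start of a line and that entries are in chronological order, so reading can stop as soon as a timestamp passes end_time. The name iter_interval_logs and the shortened sample lines are made up for illustration, and unlike the function above, both bounds are inclusive here.

```python
import io
import re
from datetime import datetime

# Timestamp at the start of a line, e.g. "[2021-11-19 11:27:23.169]".
TIMESTAMP = re.compile(r'\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]')

def iter_interval_logs(lines, start_time, end_time):
    """Yield lines whose timestamp lies in [start_time, end_time].

    Hypothetical helper: `lines` is any iterable of strings, such as an
    open file. Assumes timestamps never decrease from line to line.
    """
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip lines without a leading timestamp
        log_time = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if log_time > end_time:
            break  # sorted input: nothing later can be in range
        if log_time >= start_time:
            yield line.rstrip("\r\n")

# Small in-memory stand-in for a file, so the sketch runs as-is.
sample = io.StringIO(
    "[2021-11-19 11:27:23.169] (Info) first\n"
    "[2021-11-20 11:27:23.169] (Info) second\n"
    "[2021-11-21 11:27:23.169] (Info) third\n"
    "[2021-11-24 11:27:23.169] (Info) fourth\n"
)
selected = list(iter_interval_logs(sample, datetime(2021, 11, 20), datetime(2021, 11, 23)))
print("\n".join(selected))
```

With a real log, an open file object can be passed in place of the StringIO stand-in, since iterating over a file yields one line at a time and only the matching lines are kept in memory.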