I hope you are well. I’m trying to find every instance of “Fix” in release notes, and grab the date (which is usually a line above).
For example (section of release notes):
Date: : 2021-04-26
Comment: : Fix wrong ...
I want to count the “Fix” occurrences and report that count alongside the corresponding date.
Example output:
Date: 2021-02-13
Fixes: 3
Date: 2021-01-18
Fixes: 1
Etc...
Here is what I’ve been trying:
WANTEDdate = 14
count = 0
dates = []
with open(file_name) as searchfile:
    for line in searchfile:
        left,sep,right = line.partition('Date')
        if sep:
            count = count + 1
            temp = ((right[:WANTEDdate]))
            # temp = ((left[BEFORE:]))
            temp = temp.strip()
            temp = temp.strip(" :")
            if len(temp) > 0:
                print(temp)
                dates.append(temp)
                #print("Fix",count,temp)
                #print(temp)

# test
lookup = "Fix"
with open(file_name) as myFile:
    for num, line in enumerate(myFile, 1):
        if lookup in line:
            dateLine = num - 1
            print("Found at line:", num)
Any help would be greatly appreciated. Thanks in advance.
Answer
Assuming your file looks like this:
Date: : 2021-04-26
Comment: : Fix wrong 1
Date: : 2021-04-26
Comment: : ---
Date: : 2021-04-26
Comment: : Fix wrong 2
Date: : 2021-04-27
Comment: : Fix wrong 3
Then you can use the re module to parse it:
import re

with open("your_file.txt", "r") as f_in:
    data = f_in.read()

# Capture the text after the last colon on each "Date" / "Comment" line.
dates = re.findall(r"Date.*:\s*(.*)", data)
comments = re.findall(r"Comment.*:\s*(.*)", data)

# Pair each date with its comment and count the comments mentioning "fix".
out = {}
for d, c in zip(dates, comments):
    if "fix" in c.lower():
        out.setdefault(d, 0)
        out[d] += 1

for k, v in out.items():
    print("Date:", k)
    print("Fixes:", v)
Prints:
Date: 2021-04-26
Fixes: 2
Date: 2021-04-27
Fixes: 1
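Since the question says the date is only "usually" on the line above, pairing two separate lists with zip can drift out of step whenever a Date line has no Comment line (or the reverse). One possible alternative, as a minimal sketch that assumes the same line layout as the sample file above, is to read line by line and remember the most recent date. The name count_fixes and both regular expressions are illustrative choices here, not something taken from the release notes format itself:

```python
import re
from collections import Counter

# Assumed line shapes: "Date: : 2021-04-26" and "Comment: : Fix wrong 1".
DATE_RE = re.compile(r"^Date\b.*?(\d{4}-\d{2}-\d{2})")
COMMENT_RE = re.compile(r"^Comment\b[\s:]*(.*)")


def count_fixes(lines):
    """Count comments containing 'fix' (case-insensitive) per preceding date."""
    fixes = Counter()
    current_date = None
    for line in lines:
        line = line.strip()
        date_match = DATE_RE.match(line)
        if date_match:
            # Remember the date until the next Date line replaces it.
            current_date = date_match.group(1)
            continue
        comment_match = COMMENT_RE.match(line)
        if comment_match and current_date is not None:
            if "fix" in comment_match.group(1).lower():
                fixes[current_date] += 1
    return fixes


sample = """\
Date: : 2021-04-26
Comment: : Fix wrong 1
Date: : 2021-04-26
Comment: : ---
Date: : 2021-04-26
Comment: : Fix wrong 2
Date: : 2021-04-27
Comment: : Fix wrong 3
"""

for date, n in count_fixes(sample.splitlines()).items():
    print("Date:", date)
    print("Fixes:", n)
```

To run it on a real file, pass the open file object instead of the sample, for example count_fixes(f_in) inside a with open(...) block. Like the answer above, this is a plain substring test, so a comment containing "prefix" or "suffix" would also be counted.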