I hope you are well. I’m trying to find every instance of “Fix” in release notes, and grab the date (which is usually a line above).
For example (section of release notes):
Date: : 2021-04-26
Comment: : Fix wrong ...
I want to count the “fix” occurrences and report them alongside their corresponding date.
Example output:
Date: 2021-02-13
Fixes: 3
Date: 2021-01-18
Fixes: 1
Etc...
Here is what I’ve been trying:
WANTEDdate = 14
count = 0
dates = []
with open(file_name) as searchfile:
    for line in searchfile:
        left,sep,right = line.partition('Date')
        if sep:
            count = count + 1
            temp = ((right[:WANTEDdate]))
            # temp = ((left[BEFORE:]))
            temp = temp.strip()
            temp = temp.strip(" :")
            if len(temp) > 0:
                print(temp)
                dates.append(temp)
                #print("Fix",count,temp)
                #print(temp)

# test
lookup = "Fix"
with open(file_name) as myFile:
    for num, line in enumerate(myFile, 1):
        if lookup in line:
            dateLine = num - 1
            print("Found at line:", num)
Any help would be greatly appreciated. Thanks in advance.
Answer
Assuming your file looks like this:
Date: : 2021-04-26
Comment: : Fix wrong 1
Date: : 2021-04-26
Comment: : ---
Date: : 2021-04-26
Comment: : Fix wrong 2
Date: : 2021-04-27
Comment: : Fix wrong 3
Then you can use the re module to parse it:
import re

with open("your_file.txt", "r") as f_in:
    data = f_in.read()

# Capture the text after the last colon on each "Date" / "Comment" line.
dates = re.findall(r"Date.*:\s*(.*)", data)
comments = re.findall(r"Comment.*:\s*(.*)", data)

# Pair each date with its comment and count the comments that mention "fix".
out = {}
for d, c in zip(dates, comments):
    if "fix" in c.lower():
        out.setdefault(d, 0)
        out[d] += 1

for k, v in out.items():
    print("Date:", k)
    print("Fixes:", v)
Prints:
Date: 2021-04-26
Fixes: 2
Date: 2021-04-27
Fixes: 1
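Pairing dates and comments with zip relies on every Date line being followed by exactly one Comment line. If the release notes are less regular than that, a line-by-line pass that remembers the most recent date may be more forgiving. The following is only a sketch under that assumption: the helper name count_fixes, the two patterns, and the inline sample are illustrative choices, not part of the original answer.

```python
import re
from collections import Counter

# Assumed line shapes: "Date: : YYYY-MM-DD" and "Comment: : <text>".
DATE_RE = re.compile(r"^Date\s*:[\s:]*(\d{4}-\d{2}-\d{2})")
COMMENT_RE = re.compile(r"^Comment\s*:[\s:]*(.*)")


def count_fixes(lines):
    """Hypothetical helper: count comments mentioning "fix", keyed by the latest date seen."""
    counts = Counter()
    current_date = None
    for line in lines:
        line = line.strip()
        m = DATE_RE.match(line)
        if m:
            current_date = m.group(1)
            continue
        m = COMMENT_RE.match(line)
        # Ignore comments that appear before any date has been seen.
        if m and current_date is not None and "fix" in m.group(1).lower():
            counts[current_date] += 1
    return counts


sample = """\
Date: : 2021-04-26
Comment: : Fix wrong 1
Date: : 2021-04-26
Comment: : ---
Date: : 2021-04-26
Comment: : Fix wrong 2
Date: : 2021-04-27
Comment: : Fix wrong 3
"""

for date, n in count_fixes(sample.splitlines()).items():
    print("Date:", date)
    print("Fixes:", n)
```

To run it on a real file, pass the open file object instead of the sample, for example count_fixes(f_in) inside the with block. Like the answer's version, the substring test also matches words such as "prefix", so a stricter check may be needed if that matters for your notes.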