I hope you are well. I’m trying to find every instance of “Fix” in release notes, and grab the date (which is usually a line above).
For example (section of release notes):
    Date: : 2021-04-26
    Comment: : Fix wrong
I want to count the “Fix” occurrences and report that count for each corresponding date.
Example output:
    Date: 2021-02-13
    Fixes: 3
    Date: 2021-01-18
    Fixes: 1
    Etc
Here is what I’ve been trying:
    WANTEDdate = 14
    count = 0
    dates = []
    with open(file_name) as searchfile:
        for line in searchfile:
            left, sep, right = line.partition('Date')
            if sep:
                count = count + 1
                temp = right[:WANTEDdate]
                # temp = ((left[BEFORE:]))
                temp = temp.strip()
                temp = temp.strip(" :")
                if len(temp) > 0:
                    print(temp)
                    dates.append(temp)
                    #print("Fix",count,temp)
                    #print(temp)

    # test
    lookup = "Fix"
    with open(file_name) as myFile:
        for num, line in enumerate(myFile, 1):
            if lookup in line:
                dateLine = num - 1
                print("Found at line:", num)
Any help would be greatly appreciated. Thanks in advance.
Answer
Assuming your file looks like this:
    Date: : 2021-04-26
    Comment: : Fix wrong 1

    Date: : 2021-04-26
    Comment: : ---

    Date: : 2021-04-26
    Comment: : Fix wrong 2

    Date: : 2021-04-27
    Comment: : Fix wrong 3
Then you can use the re module to parse it:
    import re

    with open("your_file.txt", "r") as f_in:
        data = f_in.read()

    # Capture the text after the last colon on each "Date" / "Comment" line.
    dates = re.findall(r"Date.*:\s*(.*)", data)
    comments = re.findall(r"Comment.*:\s*(.*)", data)

    # Pair each date with its comment and count the comments that mention "fix".
    out = {}
    for d, c in zip(dates, comments):
        if "fix" in c.lower():
            out.setdefault(d, 0)
            out[d] += 1

    for k, v in out.items():
        print("Date:", k)
        print("Fixes:", v)
Prints:
    Date: 2021-04-26
    Fixes: 2
    Date: 2021-04-27
    Fixes: 1
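Pairing dates and comments with zip relies on every Date line having exactly one Comment line. Since the question says the date is only "usually" the line above, a line-by-line variant that remembers the most recent date may be more forgiving. The following is a minimal sketch, not a drop-in replacement: the helper name `count_fixes`, the `DATE_RE` pattern, and the inline `sample` text are assumptions for illustration, and the line format is assumed to match the sample file above.

```python
import re
from collections import Counter

# Assumed format: "Date: : 2021-04-26" (capture the token after the last colon).
DATE_RE = re.compile(r"^Date.*:\s*(\S+)")


def count_fixes(lines, keyword="fix"):
    """Count lines containing `keyword`, grouped by the most recent Date line."""
    counts = Counter()
    current_date = None
    for line in lines:
        match = DATE_RE.match(line)
        if match:
            # Remember this date for the lines that follow it.
            current_date = match.group(1)
        elif current_date is not None and keyword in line.lower():
            counts[current_date] += 1
    return counts


# Inline copy of the sample file so the sketch runs without any file on disk.
sample = """\
Date: : 2021-04-26
Comment: : Fix wrong 1

Date: : 2021-04-26
Comment: : ---

Date: : 2021-04-26
Comment: : Fix wrong 2

Date: : 2021-04-27
Comment: : Fix wrong 3
"""

result = count_fixes(sample.splitlines())
for date, fixes in result.items():
    print("Date:", date)
    print("Fixes:", fixes)
```

To run it against a real file, pass the open file object instead of the sample, for example `with open("your_file.txt") as f_in: result = count_fixes(f_in)`. Lines that mention "fix" before any date has been seen are skipped in this sketch.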