I have some text input and I want to extract a few pieces of information from it. I am using regular expressions for this, and it works for every field except two: rent and transfer.
The input text is as below:
my_str = """19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00
20 Aug transfer from John wick saving a/c 200.00 130.90"""
Now I want to extract rent as: rent 500.00
and transfer as: transfer 200.00
but only the keywords ‘rent’ and ‘transfer’ are extracted, without the amounts.
Below is my Python code for this:
import re
find_rent = re.search(r"(rent)+([0-9,.]*)", my_str)
found = find_rent.group()
print(found)
With the above code, only ‘rent’ is extracted, not ‘rent 500.00’. I am using similar code for transfer.
Please guide me on what I am doing wrong here.
Answer
You can use
\b(transfer|rent)\D+(\d+(?:[,.]\d+)*)
See the regex demo. Details:
\b – a word boundary
(transfer|rent) – Group 1: a transfer or rent word
\D+ – one or more non-digits
(\d+(?:[,.]\d+)*) – Group 2: one or more digits, and then zero or more occurrences of a comma/period and one or more digits
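To see why the pattern from the question returns only the keyword, it may help to run both patterns side by side. The sketch below reuses the question's input and pattern; the variable names original and fixed are just for illustration. In the question's pattern, the amount group has to start right after "rent", but the next character is a space, so the group matches an empty string.

```python
import re

my_str = (
    "19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n"
    "20 Aug transfer from John wick saving a/c 200.00 130.90"
)

# Pattern from the question: [0-9,.]* must begin immediately after "rent".
# The next character is a space, so the group matches zero characters.
original = re.search(r"(rent)+([0-9,.]*)", my_str)
print(repr(original.group()))   # whole match
print(repr(original.group(2)))  # amount group

# With \D+ between the keyword and the number, the non-digit text
# (" Apolo Housing Assoc. ") is consumed before the digits are captured.
fixed = re.search(r"\b(rent)\D+(\d+(?:[,.]\d+)*)", my_str)
print(fixed.group(1), fixed.group(2))
```

Because * allows zero repetitions, the original search still succeeds, which is why no error is raised and only the keyword comes back.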
See the Python demo:
import re
s = '19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n20 Aug transfer from John wick saving a/c 200.00 130.90'
rx = r'\b(transfer|rent)\D+(\d+(?:[,.]\d+)*)'
for m in re.finditer(rx, s):
    print(f'{m.group(1)} {m.group(2)}')
Output:
rent 500.00
transfer 200.00
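One thing to be aware of: \D also matches a newline, so if a keyword's line has no amount, the match can run on into the next line. The sketch below uses a made-up input (the rent line has no number) to show this, and a stricter variant that uses [^\d\n]+ to stay on one line. Whether you need the stricter form depends on your data.

```python
import re

# Hypothetical input: the "rent" line carries no amount
s = (
    "19 Aug standing order rent Apolo Housing Assoc.\n"
    "20 Aug transfer from John wick saving a/c 200.00 130.90"
)

# \D+ crosses the line break and captures the day number of the next line
loose = re.search(r"\b(rent)\D+(\d+(?:[,.]\d+)*)", s)
print(loose.group(2))

# [^\d\n]+ excludes digits and newlines, so the match cannot leave the line
strict = re.search(r"\b(rent)[^\d\n]+(\d+(?:[,.]\d+)*)", s)
print(strict)
```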
For a single term search, you can use
import re
s = '19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n20 Aug transfer from John wick saving a/c 200.00 130.90'
w = 'rent'
rx = fr'\b{w}\D+(\d+(?:[,.]\d+)*)'
m = re.search(rx, s)
if m:
    print(f'{w} {m.group(1)}')
See this Python demo.
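If the keywords come from a variable or user input, it is probably safer to pass them through re.escape before building the pattern. Here is a minimal sketch that wraps this in a helper; extract_amounts is a hypothetical name, not something from the question. It keeps the first amount per keyword and assumes commas are thousands separators and the period is the decimal point.

```python
import re
from decimal import Decimal


def extract_amounts(text, keywords):
    """Return {keyword: Decimal} with the first amount found after each keyword."""
    # re.escape keeps any regex metacharacters in a keyword literal
    alternation = "|".join(re.escape(k) for k in keywords)
    rx = re.compile(rf"\b({alternation})\D+(\d+(?:[,.]\d+)*)")
    amounts = {}
    for m in rx.finditer(text):
        # Assumption: "," is a thousands separator, "." is the decimal point
        value = Decimal(m.group(2).replace(",", ""))
        # setdefault keeps only the first amount seen for each keyword
        amounts.setdefault(m.group(1), value)
    return amounts


s = (
    "19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n"
    "20 Aug transfer from John wick saving a/c 200.00 130.90"
)
result = extract_amounts(s, ["rent", "transfer"])
print(result)
```

Using Decimal rather than float avoids rounding surprises if the amounts are later summed or compared.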