I would like to ask you how I can extract substrings related to some keywords.
For example I have the following text:
mystring = "Commission 0,0000 Packaging 0,0426 Discount 0,0120 Transport 0,0690 F YEB 0,0000 Commission 0,0000 Payment discount 0,0000 % Other discount 0,0000 YEB 4,0700 % Industrial 0,3856"
I would like to extract the numeric value that follows certain keywords, for example “Discount” and “Other discount”. I tried the following code:
test = re.compile(r"""( (Discount\s\d*) (Other\sdiscount\s\d*) )""", re.VERBOSE)
pr = test.findall(mystring)
In this case I would like to obtain the pairs Discount : 0,0120 and Other discount : 0,0000. A list like the following one would also be enough:
["Discount 0,0120", "Other discount 0,0000"]
Thanks in advance for any help.
Answer
I had better luck with re.search. Also, your pattern was missing \d+,\d+ to capture the digits before and after the comma.
import re

mystring = "Commission 0,0000 Packaging 0,0426 Discount 0,0120 Transport 0,0690 F YEB 0,0000 Commission 0,0000 Payment discount 0,0000 % Other discount 0,0000 YEB 4,0700 % Industrial 0,3856"

# Raw string so that \s and \d reach the regex engine unchanged
pattern = r"(Discount\s\d+,\d+)(.*)(Other\sdiscount\s\d+,\d+)"
p = re.search(pattern, mystring)
p.groups()
>> ('Discount 0,0120', ' Transport 0,0690 F YEB 0,0000 Commission 0,0000 Payment discount 0,0000 % ', 'Other discount 0,0000')
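If the middle group is not needed, a variant with re.findall may get closer to the output asked for in the question. The following is only a sketch under some assumptions: the helper name extract_values is made up for illustration, matching is case-sensitive (so "Payment discount" is skipped because of its lowercase "d"), and every value is assumed to look like digits, a comma, then digits.

```python
import re

mystring = (
    "Commission 0,0000 Packaging 0,0426 Discount 0,0120 Transport 0,0690 "
    "F YEB 0,0000 Commission 0,0000 Payment discount 0,0000 % "
    "Other discount 0,0000 YEB 4,0700 % Industrial 0,3856"
)


def extract_values(text, keywords):
    """Return (keyword, value) tuples for each keyword followed by a number.

    Hypothetical helper, not part of the original answer.
    """
    # Try longer keywords first, in case one keyword is a prefix of another.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    # \b keeps a keyword from matching in the middle of a longer word.
    pattern = re.compile(r"\b(" + alternation + r")\s+(\d+,\d+)")
    # With two capture groups, findall returns a list of 2-tuples.
    return pattern.findall(text)


pairs = extract_values(mystring, ["Discount", "Other discount"])
as_dict = dict(pairs)                       # keyword -> value
as_list = [f"{k} {v}" for k, v in pairs]    # "keyword value" strings

print(as_dict)
print(as_list)
```

Note that dict(pairs) keeps only the last value if a keyword occurs more than once in the text; in that case the list of tuples is the safer result to work with.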