I have multiple strings such as:
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
I want to target all these strings with regex.
I tried the following pattern
pattern = r"([A-Z]* /([A-Za-z0-9])\D+ [A-Z]*/\d.\d)"
Here is the full code
import re

string = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""
pattern = r"(?P<url>[A-Z]* /([A-Za-z0-9])\D+ [A-Z]*/\d.\d)"
result = [item.groupdict() for item in re.finditer(pattern, string)]
result
This outputs the following
[{'url': 'POST /incentivize HTTP/1.1'}, {'url': 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1'}, {'url': 'DELETE /virtual/solutions/target/web+services HTTP/2.0'}]
With this pattern, I am able to target the first three strings. But for the life of me, I am not able to figure out how to target the last string. This is just a sample of many more strings in the list. I need to make this dynamic so that the program is able to capture strings that are similar to this.
I am a rookie in Python and have just started learning regex.
Any help will be appreciated.
Answer
I would use re.findall here with the following regex pattern:
\b(?:POST|GET|PUT|PATCH|DELETE)\b /[^/\s]+(?:/[^/\s]+)* HTTP/\d+(?:\.\d+)?
Script:
import re

string = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""
matches = re.findall(r'\b(?:POST|GET|PUT|PATCH|DELETE)\b /[^/\s]+(?:/[^/\s]+)* HTTP/\d+(?:\.\d+)?', string)
print(matches)
This prints:
['POST /incentivize HTTP/1.1', 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1', 'DELETE /virtual/solutions/target/web+services HTTP/2.0', 'PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1']
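The regex pattern works by matching one of several HTTP methods in an alternation, to which you may add more methods if necessary. Then it matches a path made of one or more segments, each a slash followed by characters that are neither a slash nor whitespace, followed by HTTP/ and a version number with an optional minor part. Your original attempt misses the last string because \D+ matches only non-digit characters, and the path segment 24%2f7 contains digits.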
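If you want to keep the list-of-dicts output from the question, the same pattern can be given named groups and used with re.finditer. This is only a sketch: the group names method, path and version and the variable names REQUEST_LINE and log are my own choices for illustration, not something from the question.

```python
import re

# Same structure as the findall pattern, split across lines for readability.
# The group names (method, path, version) are illustrative assumptions.
REQUEST_LINE = re.compile(
    r"\b(?P<method>POST|GET|PUT|PATCH|DELETE)\b "
    r"(?P<path>/[^/\s]+(?:/[^/\s]+)*) "
    r"HTTP/(?P<version>\d+(?:\.\d+)?)"
)

log = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""

# One dict per matched request line, keyed by group name.
result = [m.groupdict() for m in REQUEST_LINE.finditer(log)]
for row in result:
    print(row)
```

Each dict then holds the method, path and version separately, which may be easier to work with than the whole matched string if you need to filter or group the requests later.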