
Failed to capture a certain portion of text out of a long text using regex

I’m trying to scrape a portion of text out of a long text using regex.

Original text: If you have any questions or concerns, you may contact us at kaieldentsome [!at] gmail.com. You can also follow us on fb

Portion I’m interested in: kaieldentsome [!at] gmail.com.

The phrase "contact us at" will not always be present before the address.

I’ve tried with:

import re

item_str = 'If you have any questions or concerns, you may contact us at kaieldentsome [!at] gmail.com. You can also follow us on fb'
output = re.findall(r"(?<=\s).*?\s\[!at\].*?\s.*?\s",item_str)[0]
print(output)

Output I wish to get:

kaieldentsome [!at] gmail.com.


Answer

You could use

(?<=\s)\S+\s\[!at\]\s\S+\.\S+
  • (?<=\s) Positive lookbehind, assert a whitespace char directly to the left
  • \S+ Match 1+ non-whitespace chars
  • \s\[!at\]\s Match the literal [!at] between whitespace chars
  • \S+\.\S+ Match 1+ non-whitespace chars, a literal dot, then 1+ non-whitespace chars

Note that the lookbehind requires a whitespace char immediately to the left of the match, so an address at the very start of the string will not be found. If that is not mandatory, you can omit (?<=\s)
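
As a minimal sketch, here is how the pattern might be used with the string from the question. The names pattern, loose_pattern and bare are only illustrative; re.search is used instead of re.findall(...)[0] so that a missing match gives None rather than an IndexError.

```python
import re

item_str = (
    "If you have any questions or concerns, you may contact us at "
    "kaieldentsome [!at] gmail.com. You can also follow us on fb"
)

# Pattern with the lookbehind: needs a whitespace char just before the match.
pattern = re.compile(r"(?<=\s)\S+\s\[!at\]\s\S+\.\S+")

match = pattern.search(item_str)
output = match.group(0) if match else None
print(output)  # kaieldentsome [!at] gmail.com.

# Variant without the lookbehind, for text that may begin with the address.
loose_pattern = re.compile(r"\S+\s\[!at\]\s\S+\.\S+")
bare = "kaieldentsome [!at] gmail.com"
print(pattern.search(bare))                 # None, nothing precedes the address
print(loose_pattern.search(bare).group(0))  # kaieldentsome [!at] gmail.com
```

The trailing dot in the first result is part of the match because \S+ also consumes the sentence-ending period, which is what the question asked for.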

