I’m trying to scrape a portion of text out of a long text using regex.
Original text: If you have any questions or concerns, you may contact us at kaieldentsome [!at] gmail.com. You can also follow us on fb
Portion I’m interested in: kaieldentsome [!at] gmail.com.
The phrase "contact us at" will not always be present in the text.
I’ve tried with:
import re

item_str = 'If you have any questions or concerns, you may contact us at kaieldentsome [!at] gmail.com. You can also follow us on fb'
output = re.findall(r"(?<=\s).*?\s\[!at\].*?\s.*?\s", item_str)[0]
print(output)
Output I wish to get:
kaieldentsome [!at] gmail.com.
Answer
You could use
(?<=\s)\S+\s\[!at\]\s\S+\.\S+
- (?<=\s) Positive lookbehind, assert a whitespace char to the left
- \S+ Match 1+ non-whitespace chars
- \s\[!at\]\s Match [!at] between whitespace chars
- \S+\.\S+ Match 1+ non-whitespace chars containing at least one dot
Note that this requires a whitespace char immediately to the left of the match. If that is not mandatory, you can omit (?<=\s)
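
As a minimal sketch of how this pattern could be applied to the string from the question: the variable names `pattern` and `loose_pattern` are my own, and using `re.search` instead of `re.findall(...)[0]` is a choice made here so that a text without an address yields `None` rather than raising an `IndexError`.

```python
import re

item_str = (
    "If you have any questions or concerns, you may contact us at "
    "kaieldentsome [!at] gmail.com. You can also follow us on fb"
)

# Whitespace to the left, a run of non-whitespace, a literal " [!at] ",
# then a run of non-whitespace that contains at least one dot.
pattern = re.compile(r"(?<=\s)\S+\s\[!at\]\s\S+\.\S+")

match = pattern.search(item_str)
output = match.group() if match else None
print(output)  # kaieldentsome [!at] gmail.com.

# Same pattern without the lookbehind (hypothetical name), for the case
# where the address may sit at the very start of the text.
loose_pattern = re.compile(r"\S+\s\[!at\]\s\S+\.\S+")
at_start = "kaieldentsome [!at] gmail.com"
print(pattern.search(at_start))                # None
print(loose_pattern.search(at_start).group())  # kaieldentsome [!at] gmail.com
```

The trailing dot is part of the result for the original sentence because `\S+\.\S+` needs at least one non-whitespace char after a dot, so it has to split at the dot inside `gmail.com.` and take `com.` as the tail.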