
Tag: regex

Extracting Dialogs from movie scripts using Regex

I would like to extract movie script dialogue like so: an uppercase character name, followed by the dialog up to the next line break, so that the narration is not captured as well. Current regex: ((\s[^\w].\s[A-Z]+)\n+.+) Problem is, it only extracts the character name and the first sentence of the dialog. Here’s the testi…
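
The goal described above (an uppercase name line, then dialog lines until a blank line) can be sketched as follows. This is one possible approach, not the asker's or an accepted answer's pattern. It assumes that a character name sits alone on its line and that a blank line ends the dialog. `SCRIPT`, `DIALOG_RE` and `extract_dialogs` are hypothetical names, and the sample text is made up because the original test text is not shown in the excerpt.

```python
import re

# Hypothetical sample; the original test script is not shown in the excerpt.
SCRIPT = """\
INT. KITCHEN - DAY

John walks in and looks around.

JOHN
Where is everyone?
I thought we said noon.

MARY
You're early. Sit down.

Mary pours coffee.
"""

# Group 1: a line made only of uppercase letters and spaces (the name).
# Group 2: every following non-blank line; the first blank line stops the match.
DIALOG_RE = re.compile(
    r"^[ \t]*([A-Z][A-Z ]*[A-Z])[ \t]*\n((?:[ \t]*\S.*\n?)+)",
    re.MULTILINE,
)


def extract_dialogs(text):
    """Return (name, dialog) pairs, joining multi-line dialog with spaces."""
    return [
        (name, " ".join(line.strip() for line in body.splitlines()))
        for name, body in DIALOG_RE.findall(text)
    ]


for name, dialog in extract_dialogs(SCRIPT):
    print(f"{name}: {dialog}")
```

Scene headings such as `INT. KITCHEN - DAY` are skipped because they contain punctuation. An all-caps line directly followed by narration would still be captured, so real scripts may need a stricter rule.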

Treat regular expression between dashes

Could you help me use “sub” to change the numbers in these expressions: &AFL-03-123456 &AFL-01-12345 &AFL-02-123 context: samsung-j7-duos-dual-chip-desbloqueado-oi-android-5.1-tela-5.5-16gb-wi-fi-4g-camera-13mp-branco&AFL-03-171644black I need to replace the numbers after the s…
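
Since the excerpt is cut off, exactly which digits should change is not known. The sketch below assumes it is the digit run after `&AFL-NN-`, and that the two-digit prefix stays. `AFL_RE`, `replace_afl_number` and the replacement value `"000"` are hypothetical.

```python
import re

# Group 1 keeps "&AFL-", the two-digit code and the dash; the trailing digits are replaced.
AFL_RE = re.compile(r"(&AFL-\d{2}-)\d+")


def replace_afl_number(text, new_number):
    """Replace the digit run after '&AFL-NN-' with new_number (assumed intent)."""
    # A function replacement avoids "\1" followed by digits being read as another group number.
    return AFL_RE.sub(lambda m: m.group(1) + new_number, text)


print(replace_afl_number("branco&AFL-03-171644black", "000"))
```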

String/regex search over Excel in Python issue

I’m new to SO and relatively new to Python, so I’m sorry if this is a simple fix or an inappropriate question. Firstly, my program generally works, but I’m trying to implement some redundancy/catch-alls to make it robust. The program looks over a directory (and sub-dirs) of Excel file…

Extract digits from string by condition

I want to extract digits from a short string, based on the condition that the digits are in front of a character (the S flag). Example and result: I can split the string into a list to get the individual elements, but how could I just get the 18 and 10? Answer: Use re.findall with the regex r'(\d+)S'. This matches a…
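
A minimal sketch of the `re.findall` call from the answer. The input string is hypothetical because the original example is not shown in the excerpt, and `digits_before_s` is an illustrative name.

```python
import re


def digits_before_s(text):
    """Return each run of digits that is immediately followed by 'S'."""
    # The capture group makes findall return only the digits, not the trailing 'S'.
    return re.findall(r"(\d+)S", text)


sample = "3M18S2D10S"  # hypothetical input
print(digits_before_s(sample))
```

The results are strings; wrap each one in `int()` if numbers are needed.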

Python regex to extract html paragraph

I’m trying to extract paragraphs from HTML using the following line of code: but it returns None even though I know there is one. Why? Answer: Why not use an HTML parser to, well, parse HTML? Example using BeautifulSoup: Note that text=True helps to filter out empty paragraphs.
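
The BeautifulSoup code from the answer is not shown in the excerpt. As a stand-in, the sketch below applies the same "use a parser, not a regex" idea with the standard library's `html.parser`, so it is not the answer's code. `ParagraphExtractor` and `extract_paragraphs` are hypothetical names, and nested `<p>` handling is deliberately simplified.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of every non-empty <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._finish()  # an unclosed <p> ends where the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._finish()

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)

    def _finish(self):
        text = "".join(self._chunks).strip()
        if self._in_p and text:  # skip empty or whitespace-only paragraphs
            self.paragraphs.append(text)
        self._in_p = False
        self._chunks = []


def extract_paragraphs(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._finish()  # flush a final paragraph that was never closed
    return parser.paragraphs


print(extract_paragraphs("<div><p>First <b>bold</b> one.</p><p> </p><p>Second.</p></div>"))
```

Skipping whitespace-only paragraphs here plays a similar role to the filtering that the answer attributes to `text=True`.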