Tag: regex

Extracting Dialogs from movie scripts using Regex

I would like to extract movie script dialogue like so: an uppercase character name, followed by the dialogue up to the line break, so that the narration is not captured as well. Current regex: ((\s[^\w].\s[A-Z]+)\n+.+) The problem is that it only extracts the character name and the first sentence of the dialogue. Here’s the testing data: EDIT New regex: (\w[A-Z]+\n\s).+?(?=\n) Answer You can use the following regex:
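Neither the test data nor the answer's regex survives in this excerpt, so the following is only a minimal sketch of the general idea: match an all-caps name line, then capture every non-empty line after it until the first blank line. The sample script, the pattern, and the name extract_dialogue are all assumptions made for illustration.

```python
import re

# Hypothetical sample; the question's actual test data is not shown here.
SCRIPT = """INT. KITCHEN - DAY

John walks in and drops his keys.

JOHN
Hello there.
How are you?

MARY
Fine, thanks.

She leaves the room.
"""

# Group 1: a line made only of capitals and spaces (the character name).
# Group 2: the following non-empty lines, stopping at the first blank line.
DIALOGUE_RE = re.compile(r"^([A-Z][A-Z ]+)\n(.+(?:\n.+)*)", re.MULTILINE)


def extract_dialogue(script):
    """Return a list of (character, dialogue) tuples."""
    return DIALOGUE_RE.findall(script)


for name, lines in extract_dialogue(SCRIPT):
    print(name, "->", lines.replace("\n", " "))
```

Because "." does not match a newline by default, the dialogue group has to consume line breaks explicitly, which is why a pattern ending in a single .+ stops after the first line. This sketch would still be fooled by an all-caps narration line that is directly followed by text.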

Treat regular expression between dashes

Could you help me use “sub” to change the numbers in these expressions: &AFL-03-123456 &AFL-01-12345 &AFL-02-123 Context: samsung-j7-duos-dual-chip-desbloqueado-oi-android-5.1-tela-5.5-16gb-wi-fi-4g-camera-13mp-branco&AFL-03-171644black I need to replace the number after the second dash with another number (let’s say 987654). As the examples show, the number after the second dash may vary in length, but it always consists of digits only. The…
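One way this could be done with re.sub is sketched below. It assumes every code starts with the literal prefix &AFL- followed by a numeric group, as in the examples; the function name replace_code is made up for illustration.

```python
import re

# Capture "&AFL-<digits>-" and match the digits that follow it.
CODE_RE = re.compile(r"(&AFL-\d+-)\d+")


def replace_code(text, new_number="987654"):
    """Replace the digits after the second dash of each &AFL code."""
    # \g<1> is used instead of \1 so the backreference is not
    # misread when the replacement itself starts with a digit.
    return CODE_RE.sub(r"\g<1>" + new_number, text)


context = (
    "samsung-j7-duos-dual-chip-desbloqueado-oi-android-5.1-tela-5.5-16gb"
    "-wi-fi-4g-camera-13mp-branco&AFL-03-171644black"
)
print(replace_code("&AFL-01-12345"))
print(replace_code(context))
```

Anchoring on the &AFL- prefix keeps the other dashes and digits in the product slug (such as 5.1 or 16gb) untouched.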

String/regex search over Excel in Python issue

I’m new to SO, and relatively new to Python, so I’m sorry if this is a simple fix or an inappropriate question. Firstly, my program generally works, but I’m trying to implement some redundancy/catch-alls to make it robust. The program looks over a directory (and sub-dirs) of Excel files, opens them individually, scours for data (on a specific…

Extract digits from string by condition

I want to extract digits from a short string, based on the condition that the digits come directly before a character (the S flag). Example and result: I can split the string into a list to get the individual elements, but how could I get just the 18 and the 10? Answer Use re.findall with the regex r'(\d+)S'. This matches all…
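The question's example string is not reproduced in this excerpt, so the sketch below uses a made-up input chosen to yield the 18 and 10 mentioned above. It reads the answer's pattern as r'(\d+)S', that is, one or more digits immediately followed by the letter S.

```python
import re


def digits_before_s(text):
    """Return every run of digits that is immediately followed by 'S'."""
    # The capture group makes findall return only the digits, not the 'S'.
    return re.findall(r"(\d+)S", text)


# Hypothetical input, not the asker's data.
print(digits_before_s("5M18S3M10S"))  # ['18', '10']
```

The results are strings; converting them with int() would be a separate step if numbers are needed.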

Python regex to extract html paragraph

I’m trying to extract paragraphs from HTML by using the following line of code: but it returns None even though I know there are paragraphs. Why? Answer Why not use an HTML parser to, well, parse HTML? Example using BeautifulSoup: Note that text=True helps to filter out empty paragraphs.
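The answer's BeautifulSoup code is not included in this excerpt. As a dependency-free sketch of the same advice (parse the HTML rather than run a regex over it), the standard library's html.parser module is swapped in below; this is not the answer's code, and the names ParagraphCollector and extract_paragraphs are invented for the example. Empty paragraphs are skipped by hand, which is roughly the role text=True is described as playing above.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text content of every non-empty <p> element."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buffer = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._buffer).strip()
            if text:  # skip paragraphs with no visible text
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) is collected as well.
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html):
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    return collector.paragraphs


print(extract_paragraphs("<div><p>First</p><p> </p><p>Second <b>bold</b></p></div>"))
```

This sketch assumes well-formed, non-nested paragraphs with explicit closing tags; for messy real-world markup, a more forgiving parser such as the BeautifulSoup approach the answer recommends is likely the better fit.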
