
Tag: regex

Remove symbols in dataset

I applied all the preprocessing steps, but I want to delete the rows that contain English words or specific symbols. I only want words in Arabic, without the symbols or English words that I list in the code below. I applied the code, but when I print the dataset after cleaning, it is still without cleani…
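One possible way to approach the row filtering described above is sketched here in plain Python. It assumes that "Arabic" means the basic Arabic Unicode block (U+0600 to U+06FF) and that a row should be dropped as soon as it contains anything else. The function name `keep_arabic_rows` and the sample rows are hypothetical and do not come from the original question.

```python
import re

# Assumption: a row counts as "Arabic only" when every character is in the
# basic Arabic block (U+0600-U+06FF, which also covers Arabic punctuation
# and Arabic-Indic digits) or is whitespace.
ARABIC_ONLY = re.compile(r"[\u0600-\u06FF\s]+")


def keep_arabic_rows(rows):
    """Return only the rows made up solely of Arabic characters and whitespace."""
    return [row for row in rows if ARABIC_ONLY.fullmatch(row)]


# Hypothetical sample rows, not taken from the original dataset.
sample = ["مرحبا بالعالم", "hello مرحبا", "مرحبا @#", "شكرا"]
cleaned = keep_arabic_rows(sample)
print(cleaned)
```

Because `fullmatch` must cover the whole string, a single Latin letter or symbol is enough to reject a row. The function returns a new list, so the result has to be assigned to a name, as done with `cleaned` here.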

Get only numbers at the end (regex)

I’d like to get only the numbers (integers) at the end of the phrases below: I mean: 600, 1400, 100000. I’ll add each one of them to a database later. I tried to use the regex (?<=\s)(\d*\s*)|(\d*.\d*)$ but it didn’t work properly. Any ideas? PS: We use dots, not commas, to mark thousands: 1…
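A minimal sketch of one way to pull a trailing integer out of a phrase, assuming the number is either a plain run of digits or is grouped in threes with dots as thousands separators. The original phrases are not shown in the excerpt, so the sample phrases and the helper name `trailing_int` are hypothetical.

```python
import re

# Either dot-grouped thousands (1.400, 100.000) or a plain run of digits,
# anchored to the end of the string. The lookbehind stops the match from
# starting in the middle of a longer number.
TRAILING_INT = re.compile(r"(?<![\d.])(\d{1,3}(?:\.\d{3})+|\d+)\s*$")


def trailing_int(phrase):
    """Return the integer at the end of phrase, or None if there is none."""
    match = TRAILING_INT.search(phrase)
    if match is None:
        return None
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(".", ""))


# Hypothetical phrases, chosen to yield the values named in the question.
phrases = ["Total paid 600", "Balance: 1.400", "Limit 100.000 ", "no number here"]
values = [trailing_int(p) for p in phrases]
print(values)
```

Converting with `int` after removing the dots yields values that can be stored as integers; phrases without a trailing number come back as `None` and can be skipped.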

Python splitting text with line breaks into a list

I’m trying to convert some text into a list. The text contains special characters, numbers, and line breaks. Ultimately I want to have a list with each word as an item in the list without any special characters, numbers, or spaces. Excerpt from text: Currently I’m using this line to split each word…
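A short sketch of one way to get such a list: rather than splitting on separators, collect every run of letters with `re.findall`. The sample text and the function name `words_only` are made up for illustration, since the original excerpt is not shown.

```python
import re


def words_only(text):
    """Return every run of letters in text, skipping digits, symbols and whitespace."""
    # [^\W\d_] is "a word character that is not a digit or underscore",
    # which leaves Unicode letters only.
    return re.findall(r"[^\W\d_]+", text)


# Hypothetical sample with punctuation, numbers and line breaks.
sample = "Hello, world!\n42 apples & 7 pears.\nline-three"
words = words_only(sample)
print(words)
```

Line breaks need no special handling here because anything that is not a letter acts as a separator. One side effect of this choice is that hyphenated words and contractions are split into their parts.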

Regex for AlphaNumeric words with special characters [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 1 year ago. I am trying to write a regex that captures alphanumeric words with special cha…

Python regex to match many tokens in sequence

I have a test string that looks like: "These are my food preferences mango and I also like bananas and I like grapes too." I am trying to write a regex in Python that returns the text according to these rules: search for the keyword "preferences"; make a group (words 1:7) until the word ‘like’ >> Repeat this…

Splitting a column based on a delimiter

I would like to extract some information from a column in my dataframe: Example I was using str.contains to extract the first part (i.e., all the information before the first dash, where there is one). I am still getting the same original column (so no extraction). My output would consist of two columns, one withou…
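A note on the approach described above: `str.contains` only reports whether a pattern is present and does not extract anything. The sketch below shows the splitting idea in plain Python with `str.partition`, which cuts at the first dash only. The sample values and the helper name `split_on_first_dash` are hypothetical, as the original column is not shown.

```python
def split_on_first_dash(value):
    """Split value at its first dash into (before, after).

    When there is no dash, the whole value is returned as the first part
    and the second part is None.
    """
    before, dash, after = value.partition("-")
    if not dash:
        return value.strip(), None
    return before.strip(), after.strip()


# Hypothetical column values.
column = ["ABC - some description - extra", "no dash here", "X-1"]
pairs = [split_on_first_dash(v) for v in column]
print(pairs)
```

In pandas, the corresponding call on a string column would be `Series.str.split("-", n=1, expand=True)`, which returns the two parts as separate columns.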

How to split with Dot without splitting links [duplicate]

This question already has answers here: How to split by comma and strip white spaces in Python? (10 answers) Closed 1 year ago. I want to split on a dot (.) but I don’t want to split the links. Let’s say the string is – Expected Output – Current Output – Note that I don’t wa…
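One way this could be handled, as a rough sketch: split only on dots that are followed by whitespace or the end of the string, on the assumption that dots inside a link are always followed directly by another character. The sample string and the name `split_sentences` are hypothetical, since the original input and outputs are not shown.

```python
import re

# A dot counts as a separator only when whitespace or the end of the
# string follows it; dots inside "www.example.com" do not qualify.
SENTENCE_DOT = re.compile(r"\.(?=\s|$)")


def split_sentences(text):
    """Split text on sentence-ending dots, leaving dots inside links alone."""
    parts = SENTENCE_DOT.split(text)
    # Trim surrounding whitespace and drop empty pieces.
    return [part.strip() for part in parts if part.strip()]


# Hypothetical input containing two links.
sample = (
    "Visit www.example.com for details. "
    "Docs are at https://docs.example.org/index.html. Thanks."
)
pieces = split_sentences(sample)
print(pieces)
```

This heuristic also splits after abbreviations that are followed by a space, such as "e.g. ", so it is only a starting point for text that contains them.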