I applied all the preprocessing steps, but I also want to delete the rows that contain English words or specific symbols. I just want the Arabic words, without the symbols or English words that I list in the code below. I applied the code, but when I print the dataset after cleaning, it is still not cleaned! I want to remove
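A minimal sketch of the row filtering described above, using plain Python strings rather than a dataframe. The symbol set (`@`, `#`, `_`), the name `keep_clean_rows`, and the sample rows are assumptions for illustration, since the excerpt does not show the original code or data. One possible reason a dataset looks unchanged after cleaning is that the filtered result is never stored; here the function returns a new list that the caller has to keep.

```python
import re

# Assumed set of unwanted characters: ASCII letters plus a few symbols.
# The original question's symbol list is not shown, so this is hypothetical.
UNWANTED = re.compile(r"[A-Za-z@#_]")

def keep_clean_rows(rows):
    """Return only the rows that contain none of the unwanted characters."""
    return [row for row in rows if not UNWANTED.search(row)]

rows = ["مرحبا بالعالم", "hello مرحبا", "نص عربي #وسم"]
rows = keep_clean_rows(rows)  # reassign, otherwise the original list is unchanged
print(rows)
```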
Tag: regex
Get only numbers at the end (regex)
I’d like to get only the numbers (integers) at the end of the phrases below: I mean: 600, 1400, 100000. I’ll add each one of them to a database later. I tried to use regex: (?<=\s)(\d*\s*)|(\d*.\d*)$ But it didn’t work properly. Any ideas? PS: We use dots, not commas, as the thousands separator: 1.000 instead of 1,000. Answer In the
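A hedged sketch of one way to pull the trailing integer out of each phrase. The sample phrases and the name `trailing_number` are hypothetical, since the original phrases are not shown in the excerpt. It assumes that dots appear only as thousands separators, as the question states.

```python
import re

# A trailing integer: either dot-grouped thousands (1.400) or plain digits (600),
# optionally followed by whitespace before the end of the string.
TRAILING_INT = re.compile(r"(\d{1,3}(?:\.\d{3})+|\d+)\s*$")

def trailing_number(phrase):
    """Return the integer at the end of the phrase, or None if there is none."""
    match = TRAILING_INT.search(phrase)
    if match is None:
        return None
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(".", ""))

# Hypothetical phrases, not taken from the question.
samples = ["Item A 600", "Item B 1.400", "Item C 100.000"]
print([trailing_number(s) for s in samples])
```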
Python splitting text with line breaks into a list
I’m trying to convert some text into a list. The text contains special characters, numbers, and line breaks. Ultimately I want to have a list with each word as an item in the list without any special characters, numbers, or spaces. Excerpt from text: Currently I’m using this line to split each word into an item in the list: This
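A small sketch of one possible approach: instead of splitting and then cleaning, collect runs of letters directly with `re.findall`. The sample text and the name `words_only` are made up for illustration, and the pattern assumes that only ASCII letters count as part of a word.

```python
import re

def words_only(text):
    """Return every run of ASCII letters; digits, symbols and whitespace are skipped."""
    return re.findall(r"[A-Za-z]+", text)

# Hypothetical sample, not the text from the question.
sample = "Hello, world!\n42 apples & 7 pears\nend-of-line"
print(words_only(sample))
```

Under this assumption, a contraction such as "don't" comes out as two items, so the character class would need an apostrophe if such words should stay whole.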
Regex – match until a group of multiple possibilities
I have the following text: You may have that thing NO you dont BUT maybe yes I’m trying to write a regex which can match everything until it finds some specific words, “NO” and “BUT” in this example, and if the string has both of the words, then stop at the first one: You may have that thing NO you
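One way to sketch this is a lazy match followed by a lookahead for either stop word, so the match ends at whichever word comes first. The name `text_before_stop_word` is hypothetical, and the sketch assumes the stop words must appear as whole, upper-case words.

```python
import re

# Lazily match from the start until just before the first whole-word NO or BUT,
# or until the end of the string when neither word is present.
BEFORE_STOP = re.compile(r".*?(?=\s*\b(?:NO|BUT)\b|$)", re.DOTALL)

def text_before_stop_word(text):
    """Return the part of text that precedes the first stop word."""
    return BEFORE_STOP.match(text).group(0)

print(text_before_stop_word("You may have that thing NO you dont BUT maybe yes"))
```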
Regex for AlphaNumeric words with special characters [closed]
I am trying to make regex for capturing alphanumeric words with special characters. The search will be done on small
regex substitute every appearance of a capture group with another capture group
I am reformatting a large set of sales data. Each sale shows the name of the item, number of items being sold, and the price rounded to the nearest whole number. 1 bag of 20 Apples sold for $3: Apple/,20,3, If more than one sale occurs, the sales data replaces the item name for every result after the first one.
Segregate a column data based on regex using pandas
I have a dataframe like the one shown below. I would like to create 3 new columns: val_num, which will store ONLY NUMBER values that come with symbols, e.g. 1234 (from >1234) and 1000 (from <1000), but WILL NOT STORE 31 (from 31sadj) because it doesn’t have any symbol; val_str, which will store only values that are a mix of NUMBER, symbols, ALPHABETS or
Python regex to match many tokens in sequence
I have a test string that looks like: These are my food preferences mango and I also like bananas and I like grapes too. I am trying to write a regex in Python to return the text following these rules: Search for the keyword: preferences make a group (words 1:7) until the word ‘like’ >> Repeat this step as much
Splitting a column based on a delimiter
I would like to extract some information from a column in my dataframe: Example I was using str.contains to extract the first part (i.e., all the information before the first dash, where there is one). I am still getting the same original column (so no extraction). My output would consist of two columns, one without points information (Col1) and another one
How to split with Dot without splitting links [duplicate]
I want to split on dot (.) but I don’t want to split the links. Let’s say the string is – Expected Output – Current Output – Note that I don’t want the link to split. Also,
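A hedged sketch of one way to do this: split only on dots that are followed by whitespace or by the end of the string, which leaves the dots inside a link alone. The sample string and the name `split_sentences` are hypothetical, since the original strings are not shown, and the approach assumes that every sentence-ending dot is followed by whitespace or ends the text.

```python
import re

def split_sentences(text):
    """Split on dots that end a sentence, keeping dots inside links intact."""
    # The lookahead requires whitespace or end-of-string right after the dot,
    # so "example.com" and "docs.html" are not split.
    parts = re.split(r"\.(?=\s|$)", text)
    return [part.strip() for part in parts if part.strip()]

# Hypothetical input, not the string from the question.
sample = "Visit https://example.com/docs.html for details. Then email us. Thanks."
print(split_sentences(sample))
```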