Splitting a column based on a delimiter

Question

I would like to extract some information from a column in my dataframe. I was using str.contains to extract the first part (i.e., all the information before the first dash, where there is one), but I am still getting the same original column (so no extraction). My output would consist of two columns, one without points information (Col1) and another one

Accepted Answer

Try using str.extract with a regex.

Ex:

    import pandas as pd

    df[['Col1', 'Col2']] = df['Col'].str.extract(r"(\d+ points?)?\s*—?\s*(.*)", expand=True)
    print(df)

Output:

                                    Col       Col1                  Col2
    0  7 points  — it is an example ...   7 points  it is an example ...
    1         13 points  — as above ...  13 points          as above ...
    2               some other text ...        NaN   some other text ...
    3    1 point  — "what to say more?"    1 point   "what to say more?"
    4                  13 points  — ...  13 points                   ...
    5             11 points  — 1234 ...  11 points              1234 ...
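The snippet in this answer assumes df already exists. For anyone who wants to run it end to end, here is a minimal self-contained sketch. The sample rows are an assumption, reconstructed from the printed output (the asker's original data is not shown), and the name pattern is only introduced here for readability.

```python
import pandas as pd

# Assumed sample data, rebuilt from the rows visible in the printed output.
df = pd.DataFrame({
    "Col": [
        "7 points  — it is an example ...",
        "13 points  — as above ...",
        "some other text ...",
        '1 point  — "what to say more?"',
        "13 points  — ...",
        "11 points  — 1234 ...",
    ]
})

# Group 1: an optional "<number> point" or "<number> points" prefix.
# Then optional whitespace, an optional dash, optional whitespace.
# Group 2: whatever text remains.
pattern = r"(\d+ points?)?\s*—?\s*(.*)"

# One output column per capture group; rows without a points prefix
# get a missing value in Col1.
df[["Col1", "Col2"]] = df["Col"].str.extract(pattern, expand=True)

print(df[["Col1", "Col2"]])
```

As for why the original attempt left the column unchanged: str.contains only reports whether each row matches (a boolean per row), whereas str.extract returns the text captured by each group.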

Answer