pandas split values in column

I’m new to pandas (version 1.1.5) and have tried str.split() and str.extract() to split the numerical values in column POS, with no success. My dataframe has about 3000 rows and is structured like this (note the _ and - delimiters in a subset of the values):

JavaScript

I would like for the dataframe to look like this (i.e. retain values preceding all delimiters):

JavaScript

My attempts have either split only the rows containing a delimiter and dropped all other rows, dropped all rows containing a delimiter, or dropped all values.

For example, df['POS'] = df['POS'].str.replace(r'[-|_]\d+', '') outputs:

JavaScript

I'm accepting the solution from @PaulS below. I needed to convert the column datatype from object to string first in order for str.replace() to work!

JavaScript
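
The conversion from object to string mentioned above matters because, on an object column, the .str methods return NaN for any element that is not a string. The sketch below illustrates this under the assumption that POS mixes plain integers with delimited strings. The sample values are hypothetical and are not the asker's data.

```python
import pandas as pd

# Hypothetical sample data (an assumption): integers mixed with delimited strings.
df = pd.DataFrame({"POS": [1234, "5678_2", "910-11", 1213]})

# On an object column, .str methods yield NaN for non-string elements,
# so the plain integers are lost here.
raw = df["POS"].str.replace(r"[-_]\d+", "", regex=True)

# Casting to str first makes every element eligible for the replacement.
fixed = df["POS"].astype(str).str.replace(r"[-_]\d+", "", regex=True)

print(raw.tolist())    # [nan, '5678', '910', nan]
print(fixed.tolist())  # ['1234', '5678', '910', '1213']
```

Passing regex=True explicitly keeps the behaviour the same across pandas versions, since the default for this parameter changed in later releases.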

Answer

A possible solution, based on the idea of replacing all characters after _ or - (inclusive) with the empty string (''):

JavaScript

Output:

JavaScript
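
The idea in the answer, replacing everything from the first _ or - onward with the empty string, might look roughly like the sketch below. The sample values are hypothetical. The final .astype(int) is an optional extra, not part of the answer, and assumes every remaining value is a whole number.

```python
import pandas as pd

# Hypothetical sample data (an assumption), not the asker's dataframe.
df = pd.DataFrame({"POS": [1234, "5678_2", "910-11", 1213]})

df["POS"] = (
    df["POS"]
    .astype(str)                             # make every element a string
    .str.replace(r"[_-].*", "", regex=True)  # drop the delimiter and all that follows
    .astype(int)                             # optional: back to integers
)

print(df["POS"].tolist())  # [1234, 5678, 910, 1213]
```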
User contributions licensed under: CC BY-SA