How to remove text and keep only the integer values in a Python dataframe column

I have a dataframe with a column as follows:

ID, Quantity
1   1,000 total
2   802 destroyed
3   >689 total
4   1,234-1,900 lost

I want the output as follows:

ID, Quantity
1    1,000
2    802
3    689
4    1234-1,900

I have tried:

df['Quantity'] = df['Quantity'].str.replace(r' s', '')

No success so far.

Answer

This depends on what values can appear in the Quantity column. If the numerical part never contains a space (as in your example), you can use Series.str.partition, which by default splits each string at the first space:

parts = df['Quantity'].str.partition()  # DataFrame with columns 0, 1 and 2
number_column = parts[0]  # text before the first space
df['Quantity'] = number_column

This can also be written in one line:

df['Quantity'] = df['Quantity'].str.partition()[0]
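
For reference, here is a minimal runnable sketch of the same idea, using the sample data from the question. The extra str.lstrip('>') step is an assumption on my part: it is only there because the desired output shows 689 rather than >689, and it is not part of the partition approach itself.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Quantity": ["1,000 total", "802 destroyed", ">689 total", "1,234-1,900 lost"],
})

# partition() splits each string at the first space into three columns (0, 1, 2);
# column 0 holds the text before the space.
cleaned = df["Quantity"].str.partition()[0]

# Assumption: a leading ">" should be dropped, as in the desired output for ID 3.
cleaned = cleaned.str.lstrip(">")

df["Quantity"] = cleaned
print(df)
```

Note that the column still holds strings. Converting to actual integers would also need the thousands separators removed, and a range such as 1,234-1,900 has no single integer value, so that step depends on how you want to treat such rows.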
User contributions licensed under: CC BY-SA