I have a DataFrame column with 1000+ values in different formats. How can I format them so that they all follow one unified style, like 1.000.000?
Sales
1.000.000
10000000
150,250
0,200655
For example:
- row 1 is already in the desired format
- row 2, 10000000, should be 10.000.000
- row 3, 150,250, should be 150.250
- row 4, 0,200655, should be 200.655
Answer
For your input this should work:
df['Sales'] = ["{:,}".format(int(num.replace('.', '').replace(',', ''))).replace(',', '.') for num in df['Sales']]
Here we:
- take each element in the Sales column and remove ',' and '.', i.e. '150,250' -> '150250'
- then convert the string to int, i.e. '150250' -> 150250
- then use "{:,}".format() to format the int as a string with commas, i.e. 150250 -> '150,250'
- in this converted string, replace the commas ',' with periods '.', i.e. '150,250' -> '150.250'
- build a list with the results and assign it to the Sales column of the DataFrame
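The steps above can also be written out one at a time, which may be easier to read and debug than the one-liner. This is only a sketch: the helper name format_sales is made up for illustration, and the str() call is an added assumption so that cells that are not already strings (plain ints, for instance) are accepted too.

```python
def format_sales(value):
    """Return value with '.' as the thousands separator, e.g. '150,250' -> '150.250'."""
    # str() is an assumption added here so non-string cells (e.g. ints) also work.
    digits = str(value).replace('.', '').replace(',', '')  # '150,250' -> '150250'
    number = int(digits)                                   # '0200655' -> 200655
    with_commas = "{:,}".format(number)                    # 200655 -> '200,655'
    return with_commas.replace(',', '.')                   # '200,655' -> '200.655'


# Sample values taken from the question.
sales = ['1.000.000', '10000000', '150,250', '0,200655']
formatted = [format_sales(s) for s in sales]
print(formatted)
```

On a DataFrame the same helper could be applied with df['Sales'] = df['Sales'].map(format_sales). Cells that contain no digits at all (empty strings, NaN) would raise a ValueError in int(), so those would need to be handled separately.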
Input:
       Sales
0  1.000.000
1   10000000
2    150,250
3   0,200655
Output:
        Sales
0   1.000.000
1  10.000.000
2     150.250
3     200.655
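For reference, here is a self-contained sketch that rebuilds the input shown above and does the same conversion with pandas string methods instead of a list comprehension. It assumes every cell in Sales is a string made up only of digits, '.' and ','. Anything else would fail at the astype(int) step.

```python
import pandas as pd

# Rebuild the sample input from the question.
df = pd.DataFrame({'Sales': ['1.000.000', '10000000', '150,250', '0,200655']})

df['Sales'] = (
    df['Sales']
    .str.replace('.', '', regex=False)   # drop existing '.' separators
    .str.replace(',', '', regex=False)   # drop existing ',' separators
    .astype(int)                         # '0200655' -> 200655
    .map('{:,}'.format)                  # 200655 -> '200,655'
    .str.replace(',', '.', regex=False)  # '200,655' -> '200.655'
)
print(df)
```

regex=False is passed explicitly so that '.' is treated as a literal character rather than a regex wildcard, whatever the pandas version's default is.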