I'm trying to create an Excel file with value counts and percentages. I'm almost finished, but when I run my for loop, the percentage is added as a new df.to_frame with two more columns, and I only want one. This is how it looks in Excel:
I don't want the part in the blue square to appear in the Excel file or the df, and I want the music percentage to sit next to the counts of the music column. I would also like the music percentage in percentage format, i.e. 81% instead of 0.81. Below is my code.
li = []
for i in range(0, len(df.columns)):
    value_counts = df.iloc[:, i].value_counts().to_frame().reset_index()
    value_percentage = df.iloc[:, i].value_counts(normalize=True).to_frame().reset_index()#.style.format('{:.2%}')
    li.append(value_counts)
    li.append(value_percentage)
data = pd.concat(li, axis=1)
Answer
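The .reset_index() function turns the index of each value-counts result into a regular column in your dataframe. In older pandas versions that column is called index; since pandas 2.0 it is named after the original column instead. So you are appending two-column dataframes each time, one of which is the former index. You could add .drop(columns='index') after .reset_index() (adjusting the column name for your pandas version) to drop that column at each step and therefore also in your final dataframe.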
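A minimal sketch of that idea, assuming a small made-up df with music and sport columns. It uses .reset_index(drop=True) rather than .drop(columns='index'), which discards the old index directly and so does not depend on what the column would have been called. The `_count` and `_percentage` suffixes are illustrative names, not part of the question.

```python
import pandas as pd

# Hypothetical sample data; the column names and values are made up.
df = pd.DataFrame({
    "music": ["rock", "pop", "rock", "jazz"],
    "sport": ["golf", "golf", "golf", "tennis"],
})

pieces = []
for col in df.columns:
    # drop=True throws the old index (the category labels) away
    # instead of turning it into an extra column.
    counts = df[col].value_counts().rename(f"{col}_count").reset_index(drop=True)
    share = (
        df[col]
        .value_counts(normalize=True)
        .rename(f"{col}_percentage")
        .reset_index(drop=True)
    )
    pieces.extend([counts, share])

# One count column and one percentage column per original column.
data = pd.concat(pieces, axis=1)
print(data)
```

Note that this also discards the labels (rock, pop, ...), so each row only tells you the rank of a value within its column, not which value it is.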
However, depending on your application, you may want to be careful with resetting the index, because it looks like you are appending in a way where your rows do not align (i.e. your index columns are not all the same).
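One way to avoid the misalignment is to leave the index alone and let pandas align rows by label. The sketch below is an assumption about what a suitable layout might look like, not the only option: it builds one small count/percentage table per column and concatenates them with the column name as an outer header. The sample df is made up.

```python
import pandas as pd

# Hypothetical sample data; the column names and values are made up.
df = pd.DataFrame({
    "music": ["rock", "pop", "rock", "jazz"],
    "sport": ["golf", "golf", "golf", "tennis"],
})

tables = {}
for col in df.columns:
    counts = df[col].value_counts()
    share = df[col].value_counts(normalize=True)
    # Both Series are indexed by the category labels, so they line up by label.
    tables[col] = pd.DataFrame({"count": counts, "percentage": share})

# Dict keys become the outer column level; rows are the union of all labels.
data = pd.concat(tables, axis=1)
print(data)
```

Each row now belongs to a single label, and cells are NaN where a label does not occur in a given column.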
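To change the percentage values to strings with a percent sign, you can use the following (once the index column has been dropped, so that only numeric values remain):
value_percentage = (value_percentage*100).astype(str)+'%'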
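Multiplying and converting to str yields strings such as 81.0%. If 81% is wanted, Python's percent format specifier can be applied to each value instead. A small sketch, assuming the proportions are held in a Series (the values and the name `share` are made up):

```python
import pandas as pd

# Hypothetical proportions, e.g. the result of value_counts(normalize=True).
share = pd.Series([0.81, 0.19], index=["rock", "pop"], name="music_percentage")

# "{:.0%}" multiplies by 100, rounds to 0 decimals and appends "%".
formatted = share.map("{:.0%}".format)
print(formatted.tolist())  # ['81%', '19%']
```

The result holds strings, so Excel will treat the cells as text rather than numbers; use "{:.2%}" for two decimals.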