
How to plot the percentage of NaN values in a pandas data frame?

I’d like help plotting the percentage of NaN values in each column of a pandas data frame. I calculated the percentages using this code.

per_1 = df_1.isna().mean().round(4) * 100

It gave me this result.

HR              7.94
O2Sat          10.36
Temp           66.06
SBP            15.20
MAP             9.17
Age             0.00
Gender          0.00
ICULOS          0.00
SepsisLabel     0.00
Patient_iD      0.00

Now I want to plot these percentages against the column names of the data frame. Can anyone help me?

Regards.


Update: The graph looks like this. How can I tidy it up so that the column names are clearly readable?

[image: Graph]

Also, is it possible to show the percentage on each bar, as in the graph below?

[image: percentage]


Update: The only remaining issue is with the HR percentage:

[image: pic]


Answer

You can draw a bar plot with the following snippet:

import matplotlib.pyplot as plt

plt.bar(per_1.keys(), per_1.values)
plt.show()

[image: sample output]
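For readers without the original data, here is a minimal self-contained sketch of the same idea. The data frame df_demo and its values are made up for illustration and stand in for df_1; the rounding is applied after the multiplication so that the stored percentages are rounded to two decimals. The x tick labels are rotated, which is one possible way to keep the column names readable.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data standing in for df_1; the values are made up for illustration.
df_demo = pd.DataFrame({
    "HR": [80.0, np.nan, 75.0, 90.0],
    "Temp": [np.nan, np.nan, 36.6, np.nan],
    "Age": [50, 61, 47, 39],
})

# Fraction of NaN per column, converted to a percentage, then rounded.
per_demo = (df_demo.isna().mean() * 100).round(2)

fig, ax = plt.subplots()
ax.bar(per_demo.index, per_demo.values)
ax.set_ylabel("Percentage of NaN")
# Rotate the column names so that long or numerous labels do not overlap.
ax.tick_params(axis="x", labelrotation=45)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure.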

UPDATE:

Following your update to the question, here is a solution that keeps only the columns whose percentage is greater than zero. As requested, the plot has also been tidied up: each value is displayed above its bar, and the column names appear in a legend instead of on the x-axis.

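import matplotlib.pyplot as plt

# Keep only the columns with missing values, so that i matches each bar's position
nonzero = per_1[per_1 > 0]

f, ax = plt.subplots()

for i, item in enumerate(zip(nonzero.keys(), nonzero.values)):
    ax.bar(item[0], item[1], label=item[0])           # one labelled bar per column
    ax.text(i - 0.25, item[1] + 1.5, str(item[1]))    # value printed above the bar

ax.set_xticks([])   # hide the x ticks and labels; the legend identifies each column
plt.ylim(0, 80)     # headroom above the tallest bar (66.06 here) for its label
plt.ylabel('Percentage')
plt.xlabel('Columns')
plt.legend()
plt.show()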

[image: sample output]

UPDATE 2:

To round the displayed values to two decimal places, replace the ax.text line in the earlier code with this one:

ax.text(i - 0.25, item[1] + 1.5 , str(np.round(item[1],2)))

You will need to import numpy if you have not already done so: import numpy as np
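As an alternative to positioning each label by hand with ax.text, matplotlib 3.4 and later provide Axes.bar_label, which centres a formatted label on each bar. The sketch below is not part of the original answer; per_demo is a hypothetical series that reuses some of the percentages from the question, and the format string "%.2f" takes care of the rounding.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical series reusing percentages from the question's output.
per_demo = pd.Series(
    {"HR": 7.94, "O2Sat": 10.36, "Temp": 66.06, "SBP": 15.20, "MAP": 9.17, "Age": 0.0}
)

# Keep only the columns that actually contain missing values.
nonzero = per_demo[per_demo > 0]

fig, ax = plt.subplots()
bars = ax.bar(nonzero.index, nonzero.values)

# bar_label requires matplotlib >= 3.4; it returns the created Text objects.
labels = ax.bar_label(bars, fmt="%.2f", padding=3)

ax.set_ylabel("Percentage")
ax.set_xlabel("Columns")
ax.margins(y=0.1)  # extra headroom so the top label is not clipped
```

Here the column names stay on the x-axis, so no legend is needed. Call plt.show() to display the figure.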
