I have time series data and want to count the total number of septic (1) and non-septic (0) patients based on the SepsisLabel column. Non-septic patients never have an entry of 1. Septic patients start with 0s, and the label then changes to 1 once the patient becomes septic. The data looks like this:
Here, there are 3 septic patients (P_ID = 1, 3, 4) and 2 non-septic patients (P_ID = 0, 2). I want to show these counts as a bar plot, so I did it manually with the following code:
    import matplotlib.pyplot as plt

    fig = plt.figure(figsize=(7, 6))
    ax = fig.add_axes([0, 0, 1, 1])
    sepsis = ['Non-Septic patients', 'Septic patients']
    count = [2, 3]
    ax.bar(sepsis, count)
    ax.set_title("Septic and Non-septic patient count in the dataset", y=1, fontsize=15)
    ax.set_xlabel('Patients', fontsize=12)
    ax.set_ylabel('Count', fontsize=12)
    for bars in ax.containers:
        ax.bar_label(bars)
    ax.margins(y=0.1)
    plt.show()
- However, I don't want to calculate the septic and non-septic patient counts manually, because my real data is very large and this is only dummy data. I know I must use the P_ID column, but I am not sure how.
- The second thing I want to plot is how many of these septic and non-septic patients are male (1) and female (0), based on the Gender column. I want something like this graph:
drop_duplicates keeps only the first row by default. That is a problem for septic patients, whose SepsisLabel starts at 0 and later changes to 1: the code takes only the first row even though the patient is septic. As a result, the total number of septic patients drops and the number of non-septic patients rises, which is wrong. Is it possible to keep, for septic patients, only the row where 0 changes to 1? Then every septic patient would have 1 in SepsisLabel in the kept row instead of 0, which would give the correct number of septic patients.
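One possible way to get the behaviour asked for here, as a minimal sketch: sort so that each patient's SepsisLabel == 1 rows come first, then let drop_duplicates keep the first row per P_ID. The DataFrame below is made-up dummy data (the original table is not shown in this thread); it only assumes the column names P_ID, Gender and SepsisLabel and the patient split described in the question.

```python
import pandas as pd

# Hypothetical dummy data: patients 1, 3, 4 become septic; 0 and 2 never do.
df = pd.DataFrame({
    "P_ID":        [0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4],
    "Gender":      [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    "SepsisLabel": [0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1],
})

# Within each patient, put rows with SepsisLabel == 1 first,
# so that keep="first" picks a septic row whenever one exists.
one_row_per_patient = (
    df.sort_values(["P_ID", "SepsisLabel"], ascending=[True, False])
      .drop_duplicates(subset="P_ID", keep="first")
      .reset_index(drop=True)
)

print(one_row_per_patient)
```

Note that this keeps a row where the label is 1, not necessarily the exact time step of the 0 to 1 transition; for counting patients that distinction should not matter.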
For 1), take the maximum of SepsisLabel per patient and label the result with np.where. For 2), you can use seaborn:
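    import numpy as np
    import seaborn as sns

    # One row per patient: max is 1 if the patient was ever septic
    dedup = df.groupby('P_ID')[['SepsisLabel', 'Gender']].max().reset_index()
    dedup['SepticType'] = np.where(dedup.SepsisLabel, 'Septic', 'NonSeptic')
    # Bars per SepticType, split by Gender
    sns.countplot(data=dedup, x='SepticType', hue='Gender')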
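To check the numbers behind the plot, the same per-patient reduction can be turned into plain counts. This is only a sketch on made-up dummy data (the column names and the 3 septic / 2 non-septic split follow the question; the individual rows and the Gender values are assumptions for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical dummy data: patients 1, 3, 4 become septic; 0 and 2 never do.
df = pd.DataFrame({
    "P_ID":        [0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4],
    "Gender":      [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    "SepsisLabel": [0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1],
})

# Collapse the time series to one row per patient.
per_patient = df.groupby("P_ID")[["SepsisLabel", "Gender"]].max().reset_index()
per_patient["SepticType"] = np.where(
    per_patient["SepsisLabel"] == 1, "Septic", "Non-septic"
)

# Patient counts per class (replaces the hard-coded count = [2, 3]).
counts = per_patient["SepticType"].value_counts()

# Counts per class split by gender (rows: SepticType, columns: Gender).
by_gender = pd.crosstab(per_patient["SepticType"], per_patient["Gender"])

print(counts)
print(by_gender)
```

The values in counts could then be passed to ax.bar in place of the manual lists, for example ax.bar(counts.index, counts.values), and by_gender.plot(kind="bar") should give a grouped bar chart similar to the seaborn one.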