Here’s an example of my working code that accomplishes what I want but is much too long. The resulting data frame is in the format I want here:
cc_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'Low'))] cc_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'Medium'))] cc_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'High'))] ms_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'Low'))] ms_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'Medium'))] ms_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'High'))] sc_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'Low'))] sc_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'Medium'))] sc_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'High'))] ns_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'Low'))] ns_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'Medium'))] ns_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'High'))] cc_low_val = cc_idx_low['Encounter CSN'].nunique() cc_med_val = cc_idx_med['Encounter CSN'].nunique() cc_high_val = cc_idx_high['Encounter CSN'].nunique() ms_low_val = ms_idx_low['Encounter CSN'].nunique() ms_med_val = ms_idx_med['Encounter CSN'].nunique() ms_high_val = ms_idx_high['Encounter CSN'].nunique() sc_low_val = sc_idx_low['Encounter CSN'].nunique() sc_med_val = sc_idx_med['Encounter CSN'].nunique() sc_high_val = sc_idx_high['Encounter CSN'].nunique() ns_low_val = ns_idx_low['Encounter CSN'].nunique() ns_med_val = ns_idx_med['Encounter CSN'].nunique() ns_high_val = ns_idx_high['Encounter CSN'].nunique() # building the specialty range list using the values from above df_encounter_counts=[['Critical Care',cc_low_val,cc_med_val,cc_high_val], ['Medical Surgical',ms_low_val,ms_med_val,ms_high_val], ['Specialty Care',sc_low_val,sc_med_val,sc_high_val], ['No Specialty',ns_low_val,ns_med_val,ns_high_val]] # creating the ranges data frame for the list of values df_encounter_counts = pd.DataFrame(df_encounter_counts,columns=['Specialty','Low','Medium','High']) df_encounter_counts
I know that I can accomplish close to what I want with a groupby() on the ‘Specialty Type’ and the ‘Foundation Threshold’ using an .nunique() on the ‘Encounter CSN’ (count of unique encounters per specialty type and threshold:
Example Data: Three columns of values
df_encounters = df[['Encounter CSN','Specialty Type','Foundation Threshold']].copy() df_enc_totals = df_encounters.groupby(['Specialty Type','Foundation Threshold'])['Encounter CSN'].nunique() Specialty Type Foundation Threshold Critical Care High 271 Low 165 Medium 439 Medical Surgical High 245 Low 2307 Medium 2724 No Specialty High 10 Low 1277 Medium 472 Specialty Care High 214 Low 906 Medium 1205 Name: Encounter CSN, dtype: int64
I’d like to sort the “Low”, “Medium”, and “High” under their own columns with the count of unique encounters underneath like so:
Specialty Type Low Medium High Critical Care 165 439 271 Medical Surgical 2307 2724 245 Specialty Care 906 1205 214 No Specialty 1277 472 10
Edit: Unstacked “Foundation Threshold” Error trying to format
Final Answer (combination of marked answer and mine):
df_encounters = df[['Encounter CSN','Specialty Type','Enloe Threshold']].copy() df_enc_totals = df_encounters.groupby(['Specialty Type','Enloe Threshold'])['Encounter CSN'].nunique() df_enc_count = df_enc_totals.unstack('Enloe Threshold') df_enc_count['Specialty'] = df_enc_count.index df_enc_count.columns.name = '' df_enc_count.index = range(0, 4, 1) df_enc_count[['Specialty','Low','Medium','High']]```
Advertisement
Answer
Here is your dataframe:
df_enc_totals = pd.DataFrame({'': {('Critical Care', 'High'): 271, ('Critical Care', 'Low'): 165, ('Critical Care', 'Medium'): 439, ('Medical Surgical', 'High'): 245, ('Medical Surgical', 'Low'): 2307, ('Medical Surgical', 'Medium'): 2724, ('No Specialty', 'High'): 10, ('No Specialty', 'Low'): 1277, ('No Specialty', 'Medium'): 472, ('Specialty Care', 'High'): 214, ('Specialty Care', 'Low'): 906, ('Specialty Care', 'Medium'): 1205}} ) df_enc_totals.index.names = ['Specialty Type', 'Foundation Threshold'] df_enc_totals
Please try this:
df_enc_totals = df_enc_totals.unstack('Foundation Threshold') df_enc_totals.columns = df_enc_totals.columns.droplevel() df_enc_totals = df_enc_totals[['Low', 'Medium','High']] df_enc_totals.columns.name = '' df_enc_totals.reset_index(inplace = True) df_enc_totals