Pandas DataFrame and grouping Pandas Series data into individual columns by value

Question

I am hoping someone can help me optimize the following Python/Pandas code. My code works, but I know there must be a cleaner and faster way to perform the operation under consideration. I am looking for an optimized strategy because my use case will involve 16 unique ADC Types, as opposed to 4 in the example …

Accepted Answer

I liked the idea of using get_dummies, so I modified it a bit:df = (pd.get_dummies(df, 'ADC_TYPE', '_', columns=['ADC_TYPE'])        .replace(1, np.nan)        .apply(lambda x: x.fillna(df['ADC_TYPE']))        .replace(0, np.nan))Output:          RAW  ADC_TYPE_3  ADC_TYPE_7  ADC_TYPE_8  ADC_TYPE_90  4000076160         3.0         NaN         NaN         NaN1     5354368         NaN         7.0         NaN         NaN2     4641792         NaN         NaN         8.0         NaN3  4289860736         NaN         NaN         NaN         9.04  4136386944         3.0         NaN         NaN         NaN5     5440384         NaN         7.0         NaN         NaN6     4772864         NaN         NaN         8.0         NaN7  4289881216         NaN         NaN         NaN         9.0

Advertisement

Answer