How to clean survey data in pandas

Question

Here's the data:

    d = {
        'Morning': ["Didn't answer", "Didn't answer", "Didn't answer", 'Morning', "Didn't answer"],
        'Afternoon': ["Didn't answer", 'Afternoon', "Didn't answer", 'Afternoon', "Didn't answer"],
        'Night': ["Didn't answer", 'Night', "Didn't answer", 'Night', 'Night'],
        'Sporadic': ["Didn't answer", "Didn't answer", 'Sporadic', "Didn't answer", "Didn't answer"],
        'Constant': ["Didn't answer", "Didn't answer", "Didn't answer", 'Constant', "Didn't answer"],
    }

I want the output to be:

Accepted Answer

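Assuming the dictionary from the question is loaded with df = pd.DataFrame(d), you can use:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(d)

    # Turn the placeholder into NaN, stack the remaining answers per row,
    # then collect them: a list if there are several, the bare value if one.
    # Rows with no answers at all are restored by the final reindex.
    df["ToD"] = (df.replace("Didn't answer", np.nan).stack().groupby(level=0)
                   .apply(lambda x: [i for i in x] if len(x) > 1 else x.iloc[0])
                   .reindex(df.index, fill_value="Didn't answer"))

Output:

    >>> df["ToD"]
    0                            Didn't answer
    1                       [Afternoon, Night]
    2                                 Sporadic
    3    [Morning, Afternoon, Night, Constant]
    4                                    Night
    Name: ToD, dtype: object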
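For comparison, here is a minimal sketch of the same collapse written as a plain row-by-row loop instead of the replace/stack/groupby chain. It is not the accepted answer's method, only an alternative that may be easier to read; the names MISSING, collapse_row and tod are hypothetical and chosen for illustration.

```python
import pandas as pd

MISSING = "Didn't answer"  # placeholder string used in the survey data

d = {
    'Morning': [MISSING, MISSING, MISSING, 'Morning', MISSING],
    'Afternoon': [MISSING, 'Afternoon', MISSING, 'Afternoon', MISSING],
    'Night': [MISSING, 'Night', MISSING, 'Night', 'Night'],
    'Sporadic': [MISSING, MISSING, 'Sporadic', MISSING, MISSING],
    'Constant': [MISSING, MISSING, MISSING, 'Constant', MISSING],
}
df = pd.DataFrame(d)


def collapse_row(values):
    """Hypothetical helper: reduce one row of answers to a single cell."""
    answers = [v for v in values if v != MISSING]
    if not answers:
        return MISSING      # nobody answered in this row
    if len(answers) == 1:
        return answers[0]   # single answer stays a plain string
    return answers          # several answers are kept as a list


# Build the result with an explicit object dtype, since cells mix str and list.
tod = pd.Series(
    [collapse_row(row) for row in df.itertuples(index=False, name=None)],
    index=df.index,
    dtype=object,
    name="ToD",
)
df["ToD"] = tod
```

This version iterates over rows in Python, so it will likely be slower than the vectorized chain on large frames, but it keeps the three cases (no answer, one answer, several answers) explicit.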

Answer