
Processing multiple modes in pandas

I’m obviously dealing with slightly more complex and realistic data, but to showcase my trouble, let’s assume we have these data:

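A hypothetical frame along these lines is enough to reproduce the situation. The column names `user_id`, `date` and `purchase` follow the description in the question, but every value below is an illustrative assumption, not the real data:

```python
import pandas as pd

# Illustrative data only: all values here are assumptions.
# user_id 100 buys cookies and jam equally often, so it has two modes.
df = pd.DataFrame(
    {
        "user_id": [100, 100, 100, 100, 200, 200, 200],
        "date": pd.to_datetime(
            [
                "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04",
                "2021-01-01", "2021-01-02", "2021-01-03",
            ]
        ),
        "purchase": [
            "cookies", "jam", "cookies", "jam",
            "bread", "bread", "jam",
        ],
    }
)
print(df)
```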

I want to find modal values of purchases by date:

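A minimal sketch of such an aggregation, assuming the grouping key is `user_id` (the next paragraph talks about modes per user) and using made-up sample values:

```python
import pandas as pd

# Illustrative data only: all values here are assumptions.
df = pd.DataFrame(
    {
        "user_id": [100, 100, 100, 100, 200, 200, 200],
        "purchase": ["cookies", "jam", "cookies", "jam", "bread", "bread", "jam"],
    }
)

# pd.Series.mode returns every tied value, so a group may yield several modes.
agg_mode = df.groupby("user_id")["purchase"].agg(pd.Series.mode)
print(agg_mode)
```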

agg_mode will show that for user_id 100 we have two modal values: [cookies, jam]. This is totally fine with me: for the real data we’ve come up with a set of rules for which mode to pick if there’s a tie. The problem is that, to use that heuristic, I need to be able to check whether the returned set of modal values contains certain values (let’s say, if cookies and jam are returned, we’d always stick to jam only). I can’t find a simple way to process the returned multimodal values:


agg_mode_df is a DataFrame, and the purchase column (which now holds the modal values) has object dtype, with a numpy ndarray wherever a user_id has more than one mode. I couldn’t find a working way to convert the modal value(s) of every single user to a list.
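The mixed cell contents can be made visible with a quick check. This is only a sketch on made-up values, and building `agg_mode_df` with `reset_index()` is an assumption about how the frame was created. The exact container type for a tie may differ between pandas versions, so the snippet just prints the type names:

```python
import pandas as pd

# Illustrative data only: all values here are assumptions.
df = pd.DataFrame(
    {
        "user_id": [100, 100, 100, 100, 200, 200, 200],
        "purchase": ["cookies", "jam", "cookies", "jam", "bread", "bread", "jam"],
    }
)

agg_mode_df = df.groupby("user_id")["purchase"].agg(pd.Series.mode).reset_index()

# A single mode comes back as a scalar, a tie as an array-like of values.
cell_types = [type(value).__name__ for value in agg_mode_df["purchase"]]
print(agg_mode_df["purchase"].dtype)
print(cell_types)
```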

Am I overthinking this?


Answer

IIUC, try:

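One approach that fits the question is sketched below; it is not necessarily the snippet originally posted here. It makes every group return a plain list via `Series.mode().tolist()`, then applies a tie-break rule. The sample values, the `PRIORITY` order and the `pick_mode` helper are hypothetical; only "jam wins over cookies" comes from the question.

```python
import pandas as pd

# Illustrative data only: all values here are assumptions.
df = pd.DataFrame(
    {
        "user_id": [100, 100, 100, 100, 200, 200, 200],
        "purchase": ["cookies", "jam", "cookies", "jam", "bread", "bread", "jam"],
    }
)

# Always return a list, whether the group has one mode or several.
modes = df.groupby("user_id")["purchase"].agg(lambda s: s.mode().tolist())

# Hypothetical tie-break order: earlier entries win over later ones.
PRIORITY = ["jam", "cookies", "bread"]


def pick_mode(candidates, priority=PRIORITY):
    """Return the highest-priority candidate, or the first one if none is ranked."""
    for value in priority:
        if value in candidates:
            return value
    return candidates[0]


result = modes.apply(pick_mode).reset_index()
print(modes)
print(result)
```

Because each cell is now an ordinary Python list, membership tests such as `"jam" in candidates` work the same way for single and multiple modes.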
User contributions licensed under: CC BY-SA