Tag: aggregate

Aggregate data with two conditions

I have a data frame that looks something like this: What I would like to do is aggregate the rows that share the same date, but only when their names differ. So the above data frame should actually become: My current attempt comes close: However, this will also aggregate the ones where the name …

How to sort aggregated numpy array?

This is my first post on Stack Overflow, and I am very new to programming. Apologies in advance for any poor formatting and missing information. :) I aggregated two columns in a CSV file (one column of seller names, the other of transaction amounts) to find how much each seller has made in total: I want to sort it in des…
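
The excerpt does not include the asker's code or data, so the following is only a minimal sketch of one way to total amounts per seller in NumPy and then sort the totals in descending order. The arrays `sellers` and `amounts` and their values are hypothetical stand-ins for the two CSV columns, not the original data.

```python
import numpy as np

# Hypothetical sample data standing in for the two CSV columns.
sellers = np.array(["ann", "bob", "ann", "cy", "bob", "ann"])
amounts = np.array([10.0, 5.0, 2.5, 40.0, 7.0, 1.5])

# Aggregate: map each row to the index of its unique seller name,
# then sum the amounts that share an index.
names, inverse = np.unique(sellers, return_inverse=True)
totals = np.bincount(inverse, weights=amounts)

# Sort: argsort returns ascending order, so reverse it for descending.
order = np.argsort(totals)[::-1]
sorted_names = names[order]
sorted_totals = totals[order]

for name, total in zip(sorted_names, sorted_totals):
    print(f"{name}: {total:.2f}")
```

Keeping the names and totals in two parallel arrays and reordering both with the same `order` index avoids mixing strings and floats in a single array, which would force NumPy to store everything as strings.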

PySpark Dataframe melt columns into rows

As the subject describes, I have a PySpark DataFrame in which I need to melt three columns into rows. Each column essentially represents a single fact in a category. The ultimate goal is to aggregate the data into a single total per category. There are tens of millions of rows in this DataFrame, so I need a way t…