How can I group by two columns interchangeably? For example, if I have this table and I want to get However, I get this instead when I use The entries (rows) that have the same names but exchanged are considered to be new entries, but I want to treat them the same way. Can you please tell me a way
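One possible way to treat swapped pairs as the same group is to sort each pair of values row-wise before grouping. This is only a sketch: the column names `a`, `b`, and `value` and the sample data are hypothetical, since the original table is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical data: (x, y) and (y, x) should count as the same pair.
df = pd.DataFrame({
    "a": ["x", "y", "x", "z"],
    "b": ["y", "x", "z", "x"],
    "value": [1, 2, 3, 4],
})

# Sort the two key columns within each row so the order no longer matters.
pair = np.sort(df[["a", "b"]].to_numpy(), axis=1)

# Group on the normalized pair instead of the raw columns.
result = (
    df.assign(first=pair[:, 0], second=pair[:, 1])
      .groupby(["first", "second"], as_index=False)["value"]
      .sum()
)
print(result)
```

The row-wise sort assumes both columns hold values that can be compared with each other, such as strings.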
Tag: group-by
How to vectorize groupby and apply in pandas?
I’m trying to calculate (x - x.mean()) / (x.std() + 0.01) on several columns of a dataframe based on groups. My original dataframe is very large. Although I’ve split the original file into several chunks and I’m using multiprocessing to run the script on each chunk of the file, every chunk of the dataframe is still very large and this process never
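A common way to avoid a slow Python-level `apply` here is to compute the group means and standard deviations with `groupby(...).transform` and then do the arithmetic on whole columns. The sketch below assumes a hypothetical grouping column `group` and value columns `x` and `y`; it is not taken from the original question.

```python
import pandas as pd

# Hypothetical data: one grouping column and two columns to normalize.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "y": [2.0, 4.0, 6.0, 1.0, 1.0, 4.0],
})
cols = ["x", "y"]

grouped = df.groupby("group")[cols]

# transform returns frames aligned with df, so plain arithmetic works.
means = grouped.transform("mean")
stds = grouped.transform("std")  # sample standard deviation (ddof=1)

normalized = (df[cols] - means) / (stds + 0.01)
print(normalized)
```

Passing the aggregation names as strings lets pandas use its built-in implementations instead of calling a Python function once per group.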
Pivot table unable to get same output as excel
I have a dataframe below: I’m trying to find the customer that keeps coming back. I did this, but it returned this. Edit: I am trying to get the sum of the total amount and the count of the customer code. Any help is appreciated. Thanks! Answer You can use DataFrame.groupby
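As an illustration of the `DataFrame.groupby` approach the answer mentions, the sketch below sums the amount and counts the rows per customer with named aggregation. The column names `customer_code` and `total_amount` and the sample values are assumptions, because the original dataframe is not shown.

```python
import pandas as pd

# Hypothetical data: one row per purchase.
df = pd.DataFrame({
    "customer_code": ["C1", "C2", "C1", "C3", "C1", "C2"],
    "total_amount": [10.0, 5.0, 20.0, 7.5, 30.0, 2.5],
})

# One row per customer with the summed amount and the number of purchases.
summary = (
    df.groupby("customer_code", as_index=False)
      .agg(amount_sum=("total_amount", "sum"),
           visits=("total_amount", "count"))
)

# Customers that came back at least once.
returning = summary[summary["visits"] > 1]
print(returning)
```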
Group dates into list based on value
I have a json object that I’m trying to group items together in. This code returns values grouped by date as the key and then a list of teams and dates like this However, I need it to return a key-value pair like this where all the dates for a specific team are in a list as the value
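A minimal sketch of grouping by team rather than by date, using `collections.defaultdict`. The record layout (a list of dicts with `team` and `date` keys) and the team names are assumed for illustration, since the original JSON is not shown.

```python
from collections import defaultdict

# Hypothetical records, as they might look after json.loads().
games = [
    {"team": "Lions", "date": "2021-01-01"},
    {"team": "Tigers", "date": "2021-01-01"},
    {"team": "Lions", "date": "2021-01-08"},
    {"team": "Tigers", "date": "2021-01-15"},
]

# Use the team as the key and collect every date for that team in a list.
dates_by_team = defaultdict(list)
for game in games:
    dates_by_team[game["team"]].append(game["date"])

dates_by_team = dict(dates_by_team)
print(dates_by_team)
```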
Pandas rolling sum with groupby and conditions
I have a dataframe with a timeseries of sales of different items with customer analytics. For each item and a given day I want to compute: the share of my best customer in the last 2 days’ total sales, and the share of my top customers (from a list) in the last 2 days’ total sales. I’ve tried solutions provided here: for rolling
Split data frame into multiple data frames based on a group of parameters in a column
I’ve got a data frame like this: DF And I need to split it into multiple data frames by PARAMETER_4 in column C, to get: DF_1 DF_2 DF_3 I cannot find a simple function like df.split(axis=0, value='PARAMETER_4'). Any idea about an approach? Thank you in advance! Answer We can use groupby twice here. First we groupby on column C and
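The answer above is cut off, so the following is not a reconstruction of it. It is a separate sketch of one way to split on a marker value: build a block id with a cumulative sum and group on that id. It assumes each `PARAMETER_4` row starts a new block, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical data: each PARAMETER_4 row is assumed to start a new block.
df = pd.DataFrame({
    "C": ["PARAMETER_4", "PARAMETER_1",
          "PARAMETER_4", "PARAMETER_2", "PARAMETER_3",
          "PARAMETER_4"],
    "value": [1, 2, 3, 4, 5, 6],
})

# The running count of marker rows gives every row the id of its block.
block_id = (df["C"] == "PARAMETER_4").cumsum()

# One data frame per block, each with a fresh index.
frames = [block.reset_index(drop=True) for _, block in df.groupby(block_id)]

for frame in frames:
    print(frame, end="\n\n")
```

If the marker row should instead close a block, shifting the comparison by one row before the cumulative sum would move the boundary.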
GroupBy columns on column header prefix
I have a dataframe with column names that start with a set list of prefixes. I want to get the sum of the values in the dataframe grouped by columns that start with the same prefix. The only way I could figure out how to do it was to loop through the prefix list, get the columns from the dataframe
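One way to avoid the explicit loop is to group the transposed frame by a function that maps each column name to its prefix. This is a sketch: the prefix list `["aa", "bb"]`, the column names, and the helper `prefix_of` are all hypothetical.

```python
import pandas as pd

# Hypothetical prefixes and data.
prefixes = ["aa", "bb"]
df = pd.DataFrame({
    "aa_1": [1, 3],
    "aa_2": [2, 4],
    "bb_1": [5, 6],
})


def prefix_of(column):
    """Return the first known prefix the column name starts with."""
    for prefix in prefixes:
        if column.startswith(prefix):
            return prefix
    return column  # columns without a known prefix keep their own name


# After transposing, the column names become the index, so groupby can
# apply prefix_of to each label; transpose back once the sums are done.
sums = df.T.groupby(prefix_of).sum().T
print(sums)
```

Transposing works best when all columns share one numeric dtype, as mixed dtypes would be converted to `object`.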
How can I pivot a dataframe?
What is pivot? How do I pivot? Long format to wide format? I’ve seen a lot of questions that ask about pivot tables, even if they don’t know it. It is virtually impossible to write a canonical question and answer that encompasses all aspects of pivoting… But I’m going to give it a go. The problem with existing questions and
Iterating through pandas groupby groups
I have a pandas dataframe school_df that looks like this: Each row represents one project by that school. I’d like to add two columns: for each unique school_id, a count of how many projects were posted before that date and a count of how many projects were completed before that date. The code below works, but I have ~300,000 unique
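For the first of the two counts, a possible vectorized sketch is to sort by date within each school and use `cumcount`, which numbers the rows of each group from zero. The column name `date_posted` and the sample data are assumptions, and the sketch treats posting dates within a school as distinct.

```python
import pandas as pd

# Hypothetical data: one row per project.
school_df = pd.DataFrame({
    "school_id": [1, 1, 1, 2, 2],
    "date_posted": pd.to_datetime(
        ["2020-01-01", "2020-02-01", "2020-03-01", "2020-01-15", "2020-04-01"]
    ),
})

# Order the projects chronologically within each school.
school_df = school_df.sort_values(["school_id", "date_posted"])

# cumcount numbers rows 0, 1, 2, ... per school, which equals the number
# of earlier projects when the dates within a school are distinct.
school_df["posted_before"] = school_df.groupby("school_id").cumcount()
print(school_df)
```

The count of completed projects depends on a second date column and needs a comparison between the two dates, so it is not covered by this sketch.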
Pandas: Group by calendar-week, then plot grouped barplots for the real datetime
EDIT I found a quite nice solution and posted it below as an answer. The result will look like this: Some example data you can generate for this problem: resulting in: I’d like to group by calendar-week and by value of col1. Like this: resulting in: Then I want a plot to be generated like this: That means: calendar-week and
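The author's own solution is not shown in this excerpt. As an independent sketch, the grouping step could look like this, using the ISO calendar week from `Series.dt.isocalendar()`. The column name `date` and the sample values are assumed.

```python
import pandas as pd

# Hypothetical data: a datetime column and a categorical column.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-11", "2021-01-12"]
    ),
    "col1": ["a", "b", "a", "a", "a"],
})

# ISO calendar week of each row.
week = df["date"].dt.isocalendar().week.rename("week")

# Count rows per (week, col1) and spread col1 into columns,
# which is the layout a grouped bar plot expects.
counts = df.groupby([week, "col1"]).size().unstack(fill_value=0)
print(counts)
```

With matplotlib installed, `counts.plot(kind="bar")` would draw one group of bars per calendar week.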