Tag: group-by

Divide ‘count’ and ‘sum’ inside agg function in pandas

Using groupby and agg, is it possible to get the ‘count’ divided by the ‘sum’ in a single expression? Answer: Assuming this is a dummy example (otherwise, just compute the mean), yes, it is possible to combine aggregators using apply. Better alternative in my opinion: …
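
The answer's code is not included in this excerpt. As a rough sketch of the idea only, assuming a hypothetical frame with a `key` column to group on and a numeric `val` column, the ratio can be computed either inside a single aggregation function or from two built-in aggregations:

```python
import pandas as pd

# Hypothetical data: "key" and "val" are illustrative names, not from the question.
df = pd.DataFrame({"key": ["a", "a", "b", "b", "b"], "val": [1, 3, 2, 2, 4]})

# Option 1: one custom aggregation that returns count / sum for each group.
ratio_agg = df.groupby("key")["val"].agg(lambda s: s.count() / s.sum())

# Option 2: compute both built-in aggregations once, then divide the columns.
stats = df.groupby("key")["val"].agg(["count", "sum"])
ratio_vec = stats["count"] / stats["sum"]

print(ratio_agg)
print(ratio_vec)
```

The second form tends to be faster on large frames because `count` and `sum` use pandas' built-in group aggregations instead of calling a Python function per group.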

Pandas .transform() results in NaN values after update to newer version

group-by pandas python

I have some code that used to function ~3-4 years ago. I’ve upgraded to newer versions of pandas, numpy, python since then and it has broken. I’ve isolated what I believe is the issue, but don’t quite understand why it occurs. Problem: the last line “dc” is a pandas.Series with only NaN values. It should have no NaN values. Relevant

Pandas: Remove rows in a group if a particular value is also in a group

group-by pandas python

I’m trying to use groupby and agg() function for this data processing step: Input: I plan to aggregate the data by ID. The requirement is if apples and oranges show up for the same ID, keep ‘Apples’; for other combinations, keep the first observation for each ID. So wanted output: I could pivot the table and use np.where; however, in
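
The input and output tables are not shown in this excerpt. A minimal sketch of the stated rule, assuming hypothetical columns `ID` and `Fruit` and the literal values `Apples` and `Oranges`, might use a custom aggregation function:

```python
import pandas as pd

# Hypothetical data: column names and values are assumptions for illustration.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3],
    "Fruit": ["Oranges", "Apples", "Pears", "Apples", "Oranges"],
})

def pick(s: pd.Series) -> str:
    """Return 'Apples' if the group holds both Apples and Oranges, else its first value."""
    if {"Apples", "Oranges"} <= set(s):
        return "Apples"
    return s.iloc[0]

result = df.groupby("ID")["Fruit"].agg(pick)
print(result)
```

Here ID 1 contains both fruits and yields `Apples`, while IDs 2 and 3 fall back to their first observation.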

Python Pandas combinations to build the best team

group-by pandas python python-itertools

I can’t simplify my data, so I am posting it in full. I would like to build the best possible team of 11 players according to the “niveau” column. Each “id” has a “niveau” score for the “statut” column. I think it would be necessary to test all the possible combinations of “niveau” without there being any “id” duplicates in order to

Panel data: take first observation of each group, repeat row and adjust certain values

group-by pandas python

I have a large Pandas dataframe that looks as follows (85k rows): My goal is the following: For the first observation of each ID for which the BEGDT > Inception, copy the row and change the BEGDT to Inception and the ENDDT to BEGDT – 1 day of the initially copied row. Accordingly, the final output should look as follows:
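
The data itself is not shown in this excerpt. The following is a sketch under one reading of the requirement: for each ID, look at its earliest row, and if that row's BEGDT is later than Inception, add a copy covering the gap. The sample values are invented; only the column names `ID`, `BEGDT`, `ENDDT` and `Inception` come from the question.

```python
import pandas as pd

# Invented sample rows using the column names mentioned in the question.
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "BEGDT": pd.to_datetime(["2020-03-01", "2020-06-01", "2020-01-01"]),
    "ENDDT": pd.to_datetime(["2020-05-31", "2020-12-31", "2020-12-31"]),
    "Inception": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-01-01"]),
})

df = df.sort_values(["ID", "BEGDT"])

# Earliest row per ID, kept only where it starts after Inception.
first = df.groupby("ID").head(1)
gap = first[first["BEGDT"] > first["Inception"]].copy()

# The copied row ends the day before the original row begins ...
gap["ENDDT"] = gap["BEGDT"] - pd.Timedelta(days=1)
# ... and starts at Inception.
gap["BEGDT"] = gap["Inception"]

out = pd.concat([df, gap]).sort_values(["ID", "BEGDT"]).reset_index(drop=True)
print(out)
```

Note that `ENDDT` is assigned before `BEGDT` is overwritten, since it depends on the original start date.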

How to find the most frequent appearance in one column for different values in a different column of a grouped dataframe?

dataframe group-by pandas python

The question is not so clear I guess, so here is an example: given a dataframe:

company_name    company_size    company_acitivity
7 eleven        5               restaurant
7 eleven        5               supermarket
7 eleven        10              supermarket
goldman sachs   100             bank
goldman sachs   200             restaurant
goldman sachs   200             bank

I want to group the dataframe by company name and then replace the values in the organization_size
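
The rest of the question is cut off, so the exact replacement rule is unknown. As a sketch of the part named in the title, the most frequent value per company can be obtained by taking the mode of each column within each group. The data below is the example frame from the question; breaking ties by taking the first mode is an assumption.

```python
import pandas as pd

# Example frame from the question (column spelling kept as posted).
df = pd.DataFrame({
    "company_name": ["7 eleven"] * 3 + ["goldman sachs"] * 3,
    "company_size": [5, 5, 10, 100, 200, 200],
    "company_acitivity": ["restaurant", "supermarket", "supermarket",
                          "bank", "restaurant", "bank"],
})

# Series.mode() returns all most-frequent values in sorted order;
# .iloc[0] keeps the first one (an assumed tie-breaking rule).
most_common = (
    df.groupby("company_name")[["company_size", "company_acitivity"]]
      .agg(lambda s: s.mode().iloc[0])
)
print(most_common)
```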

Applying custom function to groupby object keeps groupby column

dataframe group-by pandas python

I have a dataframe which has a column for grouping by and several other columns. Play dataframe: When using a groupby on this dataframe followed by a default function, the groupby column is set as an index and not included in the results: But when I define a custom function and use apply, I get an unwanted additional column: How
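
The play dataframe and the custom function are not shown in this excerpt. One way to keep the grouping column out of the result, sketched here with hypothetical columns `g`, `x`, `y` and a hypothetical function `demean`, is to select the value columns before calling apply so the function never receives the grouping column:

```python
import pandas as pd

# Hypothetical play data: "g" is the grouping column, "x" and "y" hold values.
df = pd.DataFrame({
    "g": ["a", "a", "b", "b"],
    "x": [1.0, 3.0, 2.0, 6.0],
    "y": [10.0, 20.0, 30.0, 50.0],
})

def demean(group: pd.DataFrame) -> pd.DataFrame:
    """Subtract each column's group mean from its values."""
    return group - group.mean()

# Selecting [["x", "y"]] means only those columns are passed to demean;
# group_keys=False keeps the original row index instead of adding "g" to it.
result = df.groupby("g", group_keys=False)[["x", "y"]].apply(demean)
print(result)
```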

GroupBy results to list of dictionaries, Using the grouped by object in it

dataframe group-by pandas python

My DataFrame looks like so: And I’m looking to group it by Date and extract that data to a list of dictionaries so it appears like this: This is my code so far: Using this method, I can’t use my grouped-by objects in the apply method itself: Using to_dict() gives me the option to reach the grouped-by object, but
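
Neither the frame nor the target structure is shown in this excerpt. A minimal sketch, assuming hypothetical `Name` and `Value` columns alongside `Date` and a hypothetical `items` key in the output, iterates over the groups directly so that both the group key and the group's rows are available:

```python
import pandas as pd

# Hypothetical data: only the "Date" column name comes from the question.
df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-01", "2021-01-02"],
    "Name": ["a", "b", "c"],
    "Value": [1, 2, 3],
})

# Iterating a groupby yields (key, sub-frame) pairs, so the key can be used
# while building each dictionary.
records = [
    {"Date": date, "items": group.drop(columns="Date").to_dict("records")}
    for date, group in df.groupby("Date")
]
print(records)
```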

How to change values in a list with respect to other list?

group-by list python

I have 2 lists: In list a, values repeat in groups of AA, BB and CC. For the indices where the same value repeats, I want to change the values in list c. In list c, I want to change the values according to the indices of the AA, BB and CC groups, in such a way that whichever value is repeating maximum number

Grouping of a dataframe monthly after calculating the highest daily values

dataframe group-by pandas python r

I’ve got a dataframe with two columns: one is a datetime column consisting of dates, and the other consists of quantities. It looks something like this: I want to make another dataframe. It should consist of two columns: one is Month/Year and the other is Till Highest. I basically want to calculate the highest quantity value until that month and
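
The sample data and the end of the question are missing from this excerpt. The sketch below assumes that "Till Highest" means the running maximum of quantity up to and including each month. The input column names `date` and `quantity` and the `MM/YYYY` label format are assumptions.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-02-03",
                            "2021-03-10", "2021-03-15"]),
    "quantity": [10, 30, 20, 25, 40],
})

# Highest quantity within each calendar month.
monthly_max = df.groupby(df["date"].dt.to_period("M"))["quantity"].max()

# Running maximum across months: the highest value seen up to that month.
till_highest = monthly_max.cummax()

out = till_highest.rename_axis("Month/Year").reset_index(name="Till Highest")
out["Month/Year"] = out["Month/Year"].dt.strftime("%m/%Y")
print(out)
```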
