I have a data frame. I need to calculate the Z-score for columns “c1”, “c2”, “c3”, grouped by “id”, and put the result back in the shape of the original data frame without using a loop. Expected output: How to do it? Answer: Use GroupBy.transform with DataFrame.join:
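The question's data and the answer's code are not shown in this excerpt, so the following is only a minimal sketch of the GroupBy.transform plus DataFrame.join approach the answer names. The sample values and the `_zscore` suffix are assumptions made for illustration, and the sample standard deviation (pandas' default, ddof=1) is assumed.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "c1": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "c2": [2.0, 4.0, 6.0, 5.0, 5.0, 8.0],
    "c3": [0.0, 1.0, 2.0, 3.0, 6.0, 9.0],
})

cols = ["c1", "c2", "c3"]
grouped = df.groupby("id")[cols]

# transform() returns frames aligned to df's index, so the arithmetic is row-wise.
z = (df[cols] - grouped.transform("mean")) / grouped.transform("std")

# Join the Z-scores back onto the original frame under suffixed names (assumed naming).
result = df.join(z.add_suffix("_zscore"))
print(result)
```

Because transform keeps the original index, no loop or merge key is needed to line the scores up with their rows.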
Tag: pandas
Find all ids that have 2 specific values in one column
I have a dataframe that contains data on employees, their managers and the projects they worked on. The dataframe (a bit simplified) looks like this: I would like to get all employees who have worked with both manager 17 and manager 18; in this case that would be employee 2 and employee 6. I know I can write a query to…
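One possible way to find such employees is sketched here, since the dataframe itself is not shown in this excerpt. The column names `employee`, `manager` and `project` and all row values are assumptions; only the manager ids 17 and 18 and the expected employees 2 and 6 come from the question.

```python
import pandas as pd

# Hypothetical data shaped like the description in the question.
df = pd.DataFrame({
    "employee": [1, 2, 2, 3, 6, 6, 6],
    "manager": [17, 17, 18, 18, 17, 18, 19],
    "project": list("abcdefg"),
})

wanted = {17, 18}

# Keep only rows for the managers of interest, then count distinct managers per employee.
managers_per_employee = (
    df[df["manager"].isin(wanted)]
    .groupby("employee")["manager"]
    .nunique()
)

# An employee qualifies only if every wanted manager was seen.
both = managers_per_employee[managers_per_employee == len(wanted)].index.tolist()
print(both)
```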
Copy the last seen non empty value of a column based on a condition in most efficient way in Pandas/Python
I need to copy the previous non-empty value of a column based on a condition. I need to do it in the most efficient way because there are a couple of million rows, and a for loop would be computationally costly. I would highly appreciate any help with this. Based on
How do you switch the colors of a bar chart in python matplotlib?
I’m trying to switch the colors of my bar charts so that they’re consistent throughout. In the plots below, I want to make it so JP_Sales is orange in both charts and NA_Sales is blue in both charts. The code for the first chart is: The code for the second chart is: Answer plot() has a color argum…
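The plotting code from the question is not shown here, so this is only a sketch of one way to keep colors consistent: map each column name to a fixed color and pass the colors in column order through the color argument. The sample values, index labels and the `color_by_column` name are assumptions; the non-interactive Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen for this example only
import matplotlib.pyplot as plt
import pandas as pd

# One fixed color per series, regardless of column order in each frame.
color_by_column = {"NA_Sales": "tab:blue", "JP_Sales": "tab:orange"}

# Hypothetical data; the two frames list the columns in opposite order.
first = pd.DataFrame({"NA_Sales": [4.0, 3.0], "JP_Sales": [1.0, 2.0]},
                     index=["Action", "Sports"])
second = pd.DataFrame({"JP_Sales": [2.5, 0.5], "NA_Sales": [1.5, 3.5]},
                      index=["Wii", "DS"])

fig, axes = plt.subplots(1, 2)
for ax, frame in zip(axes, [first, second]):
    # Look up each column's color so the mapping follows the name, not the position.
    frame.plot(kind="bar", ax=ax, color=[color_by_column[c] for c in frame.columns])
fig.savefig("sales.png")
```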
ffill col[c] based on col[a]==Value
I have a dataframe [pixel, total_time], and I want to: Make a new column “total_time_one”, which takes the total_time of pixel 1 and projects it. I have achieved the above dataframe with: However, the code is quite long and repeats itself. Is there a function better suited to this, or a better solution? Also I do no…
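A shorter alternative might be to mask total_time wherever pixel is not 1 and forward-fill the result. This is a sketch under assumed sample data; it reads "projects it" as carrying the pixel-1 value forward to the following rows, which is an interpretation of the question rather than something the excerpt confirms.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "pixel": [1, 2, 3, 1, 2, 3],
    "total_time": [10.0, 11.0, 12.0, 20.0, 21.0, 22.0],
})

# where() keeps total_time only on pixel == 1 rows (NaN elsewhere),
# and ffill() carries each kept value forward until the next pixel == 1 row.
df["total_time_one"] = df["total_time"].where(df["pixel"].eq(1)).ffill()
print(df)
```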
filter dates using pandas from dataframe
I have a column of dates. I need to pick out the dates that fall between today’s date and the end of the current month. If a date falls in that range, the next column shows “Y”. Date Column 01/02/2021 03/02/2021 31/03/2021 Y 01/03/2021 07/03/2021 Y 08/03/2021 Y Since today’…
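The date check described above could be sketched as follows. The helper name `flag_rest_of_month`, the column names `Date` and `Flag`, and the reference date of 5 March 2021 are assumptions chosen for illustration; the excerpt cuts off before stating the actual date. Dates are assumed to be day-first strings, as in the sample.

```python
import pandas as pd

def flag_rest_of_month(dates, today):
    """Return 'Y' for dates from `today` through the end of today's month, else ''."""
    today = pd.Timestamp(today).normalize()
    # MonthEnd(0) rolls forward to the month's last day (or stays put if already there).
    month_end = today + pd.offsets.MonthEnd(0)
    parsed = pd.to_datetime(dates, format="%d/%m/%Y")
    in_range = parsed.between(today, month_end)  # inclusive on both ends
    return in_range.map({True: "Y", False: ""})

df = pd.DataFrame({"Date": ["01/02/2021", "03/02/2021", "31/03/2021",
                            "01/03/2021", "07/03/2021", "08/03/2021"]})

# Assumed reference date; in real use pass pd.Timestamp.today().
df["Flag"] = flag_rest_of_month(df["Date"], today="2021-03-05")
print(df)
```

Passing the reference date as a parameter keeps the function testable; the live version would simply pass the current date.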
Is there any way to show mean in box plot using Python?
I’m just starting to use Matplotlib, and I’m trying to learn how to draw a box plot in Python using Colab. My problem is: I’m not able to put the mean on the graph. The graph just shows the quartiles, median, and outliers. Can someone help me? My code is the following. Answer I tried running…
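The question's code and the answer are truncated here, so this is a minimal sketch of one way to mark the mean, using the showmeans option of Matplotlib's boxplot. The sample data is made up, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen for this example only
import matplotlib.pyplot as plt

# Hypothetical data with one large value, so the mean and median differ visibly.
data = [1, 2, 2, 3, 3, 3, 4, 4, 10]

fig, ax = plt.subplots()
# showmeans=True adds a mean marker; meanline=True draws it as a line across the box.
artists = ax.boxplot(data, showmeans=True, meanline=True)
ax.set_ylabel("value")
fig.savefig("boxplot.png")
```

Without meanline=True, the mean is drawn as a point marker instead of a line.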
Can pandas perform an aggregating operation involving two columns?
Given the following dataframe, is it possible to calculate the sum of col2 and the sum of col2 + col3 in a single aggregating function?
   col1  col2  col3
0  a     1     10
1  a     2     20
2  b     3     30
3  b     4     40
In R’s dplyr I would do it with a single line of summarize, and I
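One way to get both sums in a single pass is sketched below, using the sample data from the question. It assumes the grouping key is col1 (the excerpt does not say so explicitly), and the names `col2_plus_col3`, `sum_col2` and `sum_col2_col3` are made up for illustration. The combined column is built first with assign, then both sums are taken with named aggregation.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "a", "b", "b"],
    "col2": [1, 2, 3, 4],
    "col3": [10, 20, 30, 40],
})

result = (
    df.assign(col2_plus_col3=df["col2"] + df["col3"])  # row-wise helper column
      .groupby("col1")
      .agg(
          sum_col2=("col2", "sum"),
          sum_col2_col3=("col2_plus_col3", "sum"),
      )
)
print(result)
```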
Pandas dataframe diff between rows with column offset
I have a Dataframe with the following structure:
time_start  time_end  label
time        time + 1  action
time + 1    time + 2  some_other_action
I would like to see the diff between time_start and the previous row’s time_end, in this case (time + 1) – (time + 1) = 0. I have tried df.diff, but that only yields the diff withi…
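A cross-column, cross-row difference like this can be sketched with Series.shift, which moves time_end down one row so it lines up with the next row's time_start. The numeric values, the third row and the `gap` column name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical numeric stand-ins for the symbolic times in the question.
df = pd.DataFrame({
    "time_start": [0, 1, 5],
    "time_end": [1, 2, 7],
    "label": ["action", "some_other_action", "third_action"],
})

# shift(1) aligns each row with the previous row's time_end; the first row has no predecessor.
df["gap"] = df["time_start"] - df["time_end"].shift(1)
print(df)
```

The first row's gap is NaN because there is no earlier row to compare against.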
How to convert object to float in Pandas?
I read a csv file into a pandas dataframe and all columns came out with dtype object. I need to convert the second and third columns to float. I tried using but got NaN. Here’s my dataframe. Do I need to use a regex on the third column to get rid of the “R$ ”? Answer Try this: Output:
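The answer's code is not included in this excerpt, so the following is not the original answer, only a sketch of one common approach: strip the literal “R$ ” prefix as plain text (no regex needed) and convert with pd.to_numeric. The column names and values are assumptions, and the numbers are assumed to use a dot as the decimal separator.

```python
import pandas as pd

# Hypothetical data: every column starts out as strings (dtype object).
df = pd.DataFrame({
    "item": ["a", "b", "c"],
    "quantity": ["1.5", "2", "n/a"],
    "price": ["R$ 10.50", "R$ 3.00", "R$ 7.25"],
})

# errors="coerce" turns unparseable entries such as "n/a" into NaN instead of raising.
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")

# regex=False treats "R$ " literally, so "$" is not read as a regex anchor.
df["price"] = pd.to_numeric(
    df["price"].str.replace("R$ ", "", regex=False), errors="coerce"
)
print(df.dtypes)
```

Converting a column that still contains the currency prefix with errors="coerce" would give NaN for every row, which may be what the question ran into.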