Let’s say this is the dataframe: Then the goal is to produce this: The total Val1 is Y as long as one of the instances is Y. My code looks like this: This works, except that cumulative has dtype object and I can only access Val1; that is, I cannot access First Name or Last Name (although when I run print(cu…
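The question's dataframe and code are not shown in the excerpt, so the following is only a sketch of the "Y if any instance is Y" rule. The sample rows and the "N" value are assumptions; the column names First Name, Last Name and Val1 and the name cumulative come from the question. Grouping with as_index=False keeps the name columns as regular columns, so they stay accessible.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "First Name": ["Ann", "Ann", "Bob", "Bob"],
    "Last Name": ["Lee", "Lee", "Ray", "Ray"],
    "Val1": ["Y", "N", "N", "N"],  # "N" is an assumed counterpart to "Y"
})

# Turn Val1 into booleans, then ask whether any row per person is True.
cumulative = (
    df.assign(Val1=df["Val1"].eq("Y"))
      .groupby(["First Name", "Last Name"], as_index=False)["Val1"]
      .any()
)

# Map the booleans back to the original labels.
cumulative["Val1"] = cumulative["Val1"].map({True: "Y", False: "N"})
print(cumulative)
```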
Tag: dataframe
Find all values that one column’s value has and collect as JSON
class_id  class  code  id
8         XYZ    A     1
8         XYZ    B     2
9         ABC    C     3
I have a dataframe like the one above. I want to transform it so that the ‘codes’ column below collects all the unique (code, id) pairs that a class contains, in JSON format.
class_id  class  codes
8         XYZ    [{‘code’: ‘A’, ‘id’: 1},…
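One possible way to build the ‘codes’ column, sketched with the three rows from the question. It iterates over the groups and turns each group's unique (code, id) pairs into a list of dicts; the variable names are illustrative, and to_json is used at the end only to show a JSON serialization.

```python
import pandas as pd

df = pd.DataFrame({
    "class_id": [8, 8, 9],
    "class": ["XYZ", "XYZ", "ABC"],
    "code": ["A", "B", "C"],
    "id": [1, 2, 3],
})

# One output row per (class_id, class); 'codes' holds that group's unique pairs.
rows = [
    {
        "class_id": class_id,
        "class": class_name,
        "codes": group[["code", "id"]].drop_duplicates().to_dict("records"),
    }
    for (class_id, class_name), group in df.groupby(["class_id", "class"], sort=False)
]
result = pd.DataFrame(rows)

# Serialize the whole frame as JSON records.
as_json = result.to_json(orient="records")
print(as_json)
```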
Concat multiple dataframes and manage those that don’t exist
I am trying to concat some dataframes (30 dataframes of 24h data) that are created automatically from CSV files, but sometimes a CSV doesn’t exist, so its dataframe isn’t created (df1, fd2, df4, df8, df9,…). I want to create weekly dataframes from 7 concatenated dfs, but the function…
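A minimal sketch of one way to tolerate missing days, assuming the daily frames are kept in a dict keyed by day number instead of in separate variables. The dict, the helper name concat_week and the sample values are all hypothetical.

```python
import pandas as pd

# Hypothetical daily frames; day 3 is missing because its CSV did not exist.
daily = {
    1: pd.DataFrame({"value": [1, 2]}),
    2: pd.DataFrame({"value": [3]}),
    4: pd.DataFrame({"value": [4, 5]}),
}

def concat_week(frames_by_day, days):
    """Concatenate the frames that exist for the given days, skipping the rest."""
    present = [frames_by_day[day] for day in days if day in frames_by_day]
    if not present:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame()
    return pd.concat(present, ignore_index=True)

week1 = concat_week(daily, range(1, 8))
week2 = concat_week(daily, range(8, 15))
print(len(week1), week2.empty)
```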
Lag of values from 1 month ago
My initial dataset has only 2 columns, date and value. What I’m trying to do is, for each date, get the value from the previous month (columns m-1 and m-12). The problem I’m having is when the day doesn’t exist in the previous month, like 29 February, in which case I want to leave it empty, and most …
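A sketch of one approach, assuming the dates are unique. Subtracting pd.DateOffset(months=n) clips to the last valid day of the target month, so the helper blanks out any target whose day of month changed. The helper name lag_months and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical sample; the question only states the columns 'date' and 'value'.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-28", "2021-01-31", "2021-02-28", "2021-03-28", "2021-03-31"]
    ),
    "value": [10, 20, 30, 40, 50],
})

def lag_months(frame, months):
    """Value from the same day `months` earlier, or NaN if that day is absent."""
    target = frame["date"] - pd.DateOffset(months=months)
    # The offset clips to month end (31 March -> 28 February); drop those cases.
    target = target.where(target.dt.day == frame["date"].dt.day)
    lookup = frame.set_index("date")["value"]  # assumes unique dates
    return target.map(lookup)

df["m-1"] = lag_months(df, 1)
df["m-12"] = lag_months(df, 12)
print(df)
```

In this sample, 31 March gets an empty m-1 because 31 February does not exist, while 28 March picks up the value from 28 February.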
Distance Matrix Haversine
I am working on a data frame that looks like this: I’m trying to make a Haversine distance matrix. Basically, for each zone, I would like to calculate the distance between it and all the others in the dataframe. So there should be only 0s on the diagonal. Here is the Haversine function that I use but I …
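The question's dataframe and function are not shown, so this is only a sketch of a pairwise Haversine matrix built with NumPy broadcasting. The column names zone, lat and lon, the sample coordinates, and the Earth radius of 6371 km are assumptions.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def haversine_matrix(lat_deg, lon_deg):
    """Pairwise great-circle distances in km between all points."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    # Broadcasting a column against a row gives every pairwise difference.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (
        np.sin(dlat / 2) ** 2
        + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2
    )
    # Clip guards against rounding pushing `a` slightly outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

# Hypothetical zones (roughly Paris, London, Madrid).
zones = pd.DataFrame({
    "zone": ["A", "B", "C"],
    "lat": [48.8566, 51.5074, 40.4168],
    "lon": [2.3522, -0.1278, -3.7038],
})

dist = pd.DataFrame(
    haversine_matrix(zones["lat"], zones["lon"]),
    index=zones["zone"],
    columns=zones["zone"],
)
print(dist.round(1))
```

The diagonal is zero because each point's differences with itself vanish, and the matrix is symmetric.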
Selecting first row from each subgroup (pandas)
How to select the subset of rows where distance is lowest, grouping by date and p columns? Ideally, the returned dataframe should contain: Answer One way is to use groupby + idxmin to get the index of the smallest distance per group, then use loc to get the desired output: Output:
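The groupby + idxmin + loc approach described in the answer can be sketched as follows. The column names date, p and distance come from the question; the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"],
    "p": ["a", "a", "b", "a", "a"],
    "distance": [5, 3, 7, 2, 9],
})

# idxmin returns the index label of the smallest distance in each (date, p) group;
# loc then pulls those full rows out of the original frame.
closest = df.loc[df.groupby(["date", "p"])["distance"].idxmin()]
print(closest)
```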
Looping through Pandas Dataframe Columns to count values
I have a column with 14000+ rows and there are only two numbers in this column. When I try this I get an equal 12/12 return for each counter. That is definitely not correct. How do I loop through and count the Yes and Nos? Answer You can do it like this:
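The answer's code is not included in the excerpt. One common option, shown here only as a sketch, is to skip the loop and let value_counts tally each distinct value. The column name answer and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical column; the real one has 14000+ rows.
df = pd.DataFrame({"answer": ["Yes", "No", "Yes", "Yes", "No"]})

# value_counts tallies every distinct value in one pass.
counts = df["answer"].value_counts()
yes_count = int(counts.get("Yes", 0))
no_count = int(counts.get("No", 0))
print(yes_count, no_count)
```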
Create a dataframe containing all weekends in a given year
Good afternoon, I would like to create a function that, given a year, would return a dataframe with all the dates in Timestamp format related to the Saturdays and Sundays of that year. That is to say: The function would return: If you can tell me an optimal way to get that dataframe I would be grateful. Answe…
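The answer is cut off in the excerpt, so this is only a sketch of one way to do it: generate every day of the year with pd.date_range and keep the days whose dayofweek is 5 (Saturday) or 6 (Sunday). The function name weekends_in_year and the column name date are hypothetical.

```python
import pandas as pd

def weekends_in_year(year):
    """Return a dataframe of all Saturdays and Sundays in `year` as Timestamps."""
    days = pd.date_range(f"{year}-01-01", f"{year}-12-31", freq="D")
    # dayofweek counts Monday as 0, so 5 and 6 are Saturday and Sunday.
    return pd.DataFrame({"date": days[days.dayofweek >= 5]})

weekends = weekends_in_year(2021)
print(weekends.head())
```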
How do I loop column names in a pandas dataframe?
I am new to Python and have never really used Pandas, so forgive me if this doesn’t make sense. I am trying to create a df based on frontend data I am sending to a flask route. The data is looped through and appended for each row. My only problem is that I don’t know how to get the df
Pandas: add column name to a list, if the column contains a specific set of values
I wish to create a new list which contains the names of those columns that have at least one of the following values: Most of the time, Quite Often, Less than often, Never. Sample: df ={‘A’:[‘name1’, ‘name2’, ‘name3’, ‘name4’], h_ls = [‘B’…
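A minimal sketch of one way to collect those column names, using isin on each column. The four target values, column A and the list name h_ls come from the question; columns B and C and the name targets are hypothetical because the sample is truncated.

```python
import pandas as pd

targets = ["Most of the time", "Quite Often", "Less than often", "Never"]

# Column A is from the question; B and C are made up for illustration.
df = pd.DataFrame({
    "A": ["name1", "name2", "name3", "name4"],
    "B": ["Never", "Sometimes", "Quite Often", "Sometimes"],
    "C": ["x", "y", "z", "w"],
})

# Keep a column name if at least one of its cells is one of the target values.
h_ls = [col for col in df.columns if df[col].isin(targets).any()]
print(h_ls)
```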