I am trying to calculate the number of days that have elapsed since the launch of a marketing campaign. I have one row per date for each marketing campaign in my DataFrame (df) and all dates start from the same day (though there is not a data point for each day for each campaign). In column ‘b’ I …
Tag: dataframe
Change dates to quarters in JSON file Python
I’m trying to convert the dates inside a JSON file to their respective quarter and year. My JSON file is formatted as shown below: My current code attempts to use pandas.Series.dt.quarter, as seen below: The issue I face is that my code doesn’t recognize the object name “…
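The question's JSON and code are not shown in this excerpt, so the following is only a minimal sketch of the date-to-quarter step using the standard library rather than pandas. The record layout, the "date" key, and the helper name to_quarter_label are assumptions made for illustration.

```python
import json
from datetime import date


def to_quarter_label(iso_date: str) -> str:
    """Turn an ISO date string such as '2021-02-14' into a label like 'Q1 2021'."""
    d = date.fromisoformat(iso_date)
    quarter = (d.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, and so on
    return f"Q{quarter} {d.year}"


# Hypothetical input: the real file's structure is not shown in the excerpt.
raw = '[{"name": "a", "date": "2021-02-14"}, {"name": "b", "date": "2021-11-30"}]'
records = json.loads(raw)

# Add a quarter label next to each date without touching the original value.
for record in records:
    record["quarter"] = to_quarter_label(record["date"])

print(json.dumps(records, indent=2))
```

With pandas, the same quarter number is available through pd.to_datetime(series).dt.quarter once the dates have been loaded into a Series.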
How to use pd.apply() to instantiate new columns?
Instead of doing this: I want to do this in one line or function. Below is what I tried: But I just get Exception has occurred: ValueError. What can I do here? Answer It looks like you can replace your whole code with a reindex: NB: by default the fill value is NaN; if you really want None, use fill_value=None. I…
What is the best way to combine dataframes that have been created through a for loop?
I am trying to combine dataframes with 2 columns into a single dataframe. The initial dataframes are generated through a for loop and stored in a list. I am having trouble getting the data from the list of dataframes into a single dataframe. Right now when I run my code, it treats each full dataframe as a row…
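The asker's loop is not shown in this excerpt. As a rough sketch, a list of DataFrames with matching columns can usually be stacked row-wise with pd.concat. The column names and loop body below are made up for illustration.

```python
import pandas as pd

# Hypothetical loop: each pass builds a small two-column DataFrame.
frames = []
for i in range(3):
    frames.append(pd.DataFrame({"key": [f"k{i}a", f"k{i}b"], "value": [i, i * 10]}))

# pd.concat takes the whole list at once; ignore_index=True renumbers the rows
# 0..n-1 instead of repeating each frame's own 0, 1 index.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Passing the list itself to pd.DataFrame(...) is one way to end up with each frame treated as a single value, whereas pd.concat appends the rows of every frame.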
Using break within a function to stop further execution of the program
I want to stop the rest of the program from executing if the if condition is met. Is this the correct way to do so? break only ends the loop, but I want to end the function. Answer You need to use
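The answer excerpt is cut off, so what it goes on to recommend is not known here. As an assumption, the sketch below contrasts break with return, which is the usual way to leave a Python function early. Both function names are hypothetical.

```python
def count_until_negative(values):
    """break leaves only the loop; the code after the loop still runs."""
    count = 0
    for value in values:
        if value < 0:
            break
        count += 1
    return count  # reached even when break fired


def first_negative_index(values):
    """return leaves the whole function as soon as the condition is met."""
    for index, value in enumerate(values):
        if value < 0:
            return index  # nothing below this line runs for this call
    return None  # only reached when no negative value was found


print(count_until_negative([3, 1, -2, 5]))
print(first_negative_index([3, 1, -2, 5]))
```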
Find all possible paths in a Python graph data structure without using a recursive function
I have a serious issue with finding all possible paths in my CSV file, which looks like this:
Source   Target   Source_repo  Target_repo
SOURCE1  Target2  repo-1       repo-2
SOURCE5  Target3  repo-5       repo-3
SOURCE8  Target5  repo-8       repo-5
The dataset has a large number of lines, more than 5000. I want to generate a…
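The question is truncated, so the exact output it wants is unknown. One non-recursive approach is a depth-first search driven by an explicit stack of partial paths. The sketch below assumes that "all possible paths" means every path from a node with no incoming edge to a node with nowhere new to go, and it uses made-up edges rather than the CSV rows.

```python
from collections import defaultdict


def all_paths(edges):
    """Enumerate paths iteratively, using a stack instead of recursion."""
    graph = defaultdict(list)
    targets = set()
    for source, target in edges:
        graph[source].append(target)
        targets.add(target)

    # Assumption: paths start at nodes that never appear as a target.
    starts = [node for node in list(graph) if node not in targets]

    paths = []
    stack = [[start] for start in starts]
    while stack:
        path = stack.pop()
        # Skip nodes already on this path so cycles cannot loop forever.
        next_nodes = [n for n in graph.get(path[-1], []) if n not in path]
        if not next_nodes:
            paths.append(path)  # dead end: the path is complete
            continue
        for node in next_nodes:
            stack.append(path + [node])
    return paths


# Hypothetical edges; in practice these would be the (Source, Target) pairs
# read from the CSV file, for example with csv.DictReader.
edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
for path in sorted(all_paths(edges)):
    print(" -> ".join(path))
```

The number of paths can grow exponentially with graph size, so on a dataset of several thousand edges it may be worth capping path length or yielding paths lazily.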
Join two rows iteratively to create a new table in Spark with one row for every two rows
I have a table that I want to step through two rows at a time. How do I create the table below, which steps through the rows in pairs and shows the first id with the second row's col b and message, in Spark? The final table will look like this. Answer In PySpark you can use Window, for example: Output:
Pandas lagged rolling average on aggregate data with multiple groups and missing dates
I’d like to calculate a lagged rolling average on a complicated time-series dataset. Consider the toy example as follows: This results in the following DataFrame: Now I’d like to add a column representing the average weight per fruit for the previous 7 days: wgt_per_frt_prev_7d. It should be defin…
How would I find the longest string per row in a data frame and print the row number if it exceeds a certain amount
I want to write a program which searches through a data frame and if any of the items in it are above 50 characters long, print the row number and ask if you want to continue through the data frame. I tried using this, but I don’t want to drop the rows, just print the row numbers where the strings
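The asker's attempt is not shown in this excerpt. As a hedged sketch, the row labels of rows containing an over-long string can be collected without dropping anything by measuring every cell's length and taking the per-row maximum. The function name, sample data, and the choice to convert every cell to a string first are assumptions.

```python
import pandas as pd


def rows_with_long_strings(df, limit=50):
    """Return the index labels of rows where any cell is longer than `limit` characters."""
    # Convert every cell to text, then measure each column's string lengths.
    lengths = df.astype(str).apply(lambda column: column.str.len())
    too_long = lengths.max(axis=1) > limit
    return list(df.index[too_long])


# Hypothetical data: rows 0 and 1 each contain one string longer than 50 characters.
sample = pd.DataFrame(
    {
        "a": ["short", "x" * 60, "ok"],
        "b": ["y" * 51, "fine", "z"],
    }
)

for row_label in rows_with_long_strings(sample):
    print(f"Row {row_label} has a string longer than 50 characters")
```

The prompt asking whether to continue could be added inside the final loop with input(); it is left out here so the sketch runs without user interaction.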
How to convert a 5-level dictionary into a DataFrame?
I have a dictionary with this structure: Level 1: id (int) username (str) meta (contains a string of Kpi_info) This is the dictionary: My desired result is a DataFrame like this:
id   username  Year  Month  revenue  kpi   result
206  hantran   2021  1      2000     2100  0
206  hantran   2021  2      2500     2000  1
206  hantran   2022  1      3000     2500  1
206 ha…