I am trying to calculate the number of days that have elapsed since the launch of a marketing campaign. I have one row per date for each marketing campaign in my DataFrame (df) and all dates start from the same day (though there is not a data point for each day for each campaign). In column ‘b’ I have the
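The excerpt is cut off before it says how the launch is identified, so the following is only a minimal sketch. It assumes the launch is the earliest date recorded for each campaign; the column names `campaign` and `date` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: one row per date per campaign, with gaps between dates.
df = pd.DataFrame({
    "campaign": ["a", "a", "a", "b", "b"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-03", "2021-01-04", "2021-01-01", "2021-01-05"]
    ),
})

# Assumption: the launch date is the earliest date seen for each campaign.
launch = df.groupby("campaign")["date"].transform("min")

# Subtracting two datetime columns gives timedeltas; .dt.days extracts whole days.
df["days_since_launch"] = (df["date"] - launch).dt.days
print(df)
```

If the launch is instead marked by a value in another column, `launch` would need to be computed from the first row where that condition holds.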
Tag: dataframe
Change dates to quarters in JSON file Python
I’m trying to convert the dates inside a JSON file to their respective quarter and year. My JSON file is formatted below: The current code I’m using is an attempt to use pandas.Series.dt.quarter, as seen below: The issue I face is that my code doesn’t recognize the object name “lastDate”. My ideal output should have the dates ultimately replaced
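The JSON file and the original code are not shown in this excerpt, so here is a rough sketch of one way this could work. It assumes the file is a list of records that each carry a "lastDate" string; the `name` field, the sample dates, and the "2021Q1" output format are all assumptions.

```python
import json
import pandas as pd

# Hypothetical JSON content; only the "lastDate" key comes from the question.
raw = '[{"name": "A", "lastDate": "2021-02-15"}, {"name": "B", "lastDate": "2022-11-30"}]'
records = json.loads(raw)

df = pd.DataFrame(records)

# Parse the strings first: the .dt accessor only works on datetime columns.
dates = pd.to_datetime(df["lastDate"])

# Replace each date with a year-and-quarter label (the format is an assumption).
df["lastDate"] = dates.dt.year.astype(str) + "Q" + dates.dt.quarter.astype(str)

# Convert back to a list of dicts so it can be written out with json.dump.
result = df.to_dict(orient="records")
print(result)
```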
How to use pd.apply() to instantiate new columns?
Instead of doing this: I want to do this in one line or function. Below is what I tried: But I just get Exception has occurred: ValueError. What can I do here? Answer: It looks like you can replace your whole code with a reindex: NB. By default the fill value is NaN; if you really want None, use fill_value=None. If
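The original code is not included in this excerpt. As an illustration of the reindex idea only, the sketch below assumes the goal is to add several empty columns at once; the column names are hypothetical.

```python
import pandas as pd

# Hypothetical starting frame with two existing columns.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# The full list of columns wanted in the result; "c" and "d" do not exist yet.
wanted = ["a", "b", "c", "d"]

# reindex keeps the existing columns and creates the missing ones filled with NaN.
out = df.reindex(columns=wanted)
print(out)
```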
What is the best way to combine dataframes that have been created through a for loop?
I am trying to combine dataframes with 2 columns into a single dataframe. The initial dataframes are generated through a for loop and stored in a list. I am having trouble getting the data from the list of dataframes into a single dataframe. Right now, when I run my code, it treats each full dataframe as a row. When I
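A common approach for this situation is `pd.concat`, which stacks a list of frames row-wise. The sketch below uses hypothetical two-column frames built in a loop, since the original data is not shown.

```python
import pandas as pd

# Build a list of small two-column frames in a loop (hypothetical data).
frames = []
for i in range(3):
    frames.append(pd.DataFrame({"x": [i, i + 1], "y": [i * 10, i * 10 + 1]}))

# Passing the list to pd.DataFrame() would treat each frame as one item;
# pd.concat stacks their rows instead. ignore_index renumbers the rows 0..n-1.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```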
Using break within a function to stop further execution of the program
I want to stop further execution of the program if the if condition is met. Is this the correct way to do so? break only ends the loop, but I want to end the function. Answer: You need to use
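The answer is cut off in this excerpt. As a general Python fact, `return` leaves the enclosing function immediately, whereas `break` only leaves the innermost loop. The function below is a hypothetical illustration, not the asker's code.

```python
def first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for v in values:
        if v < 0:
            # return exits the whole function, not just the loop.
            return v
    # Reached only if the loop finished without returning.
    return None


print(first_negative([3, -1, -5]))
print(first_negative([1, 2]))
```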
Find all possible paths in a python graph data structure without using recursive function
I have a serious issue with finding all possible paths in my csv file that looks like this: Source Target Source_repo Target_repo SOURCE1 Target2 repo-1 repo-2 SOURCE5 Target3 repo-5 repo-3 SOURCE8 Target5 repo-8 repo-5 There are a large number of lines in the dataset, more than 5000. I want to generate all possible paths like this in and return
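The desired output is not visible in this excerpt, so the following sketch makes an assumption: a "path" is any cycle-free chain of at least one edge, starting from any node that has outgoing edges. It replaces recursion with an explicit stack. The edge list is hypothetical; in practice it could be built from the Source and Target columns of the CSV.

```python
from collections import defaultdict


def all_paths(edges):
    """Enumerate all simple paths (no repeated node) without recursion."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    paths = []
    # Each stack entry is a partial path; start one at every source node.
    stack = [[node] for node in list(graph)]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            paths.append(path)
        # .get avoids creating empty entries for nodes with no outgoing edges.
        for nxt in graph.get(path[-1], []):
            if nxt not in path:  # skip nodes already visited to avoid cycles
                stack.append(path + [nxt])
    return paths


edges = [("A", "B"), ("B", "C")]  # hypothetical (source, target) pairs
paths = all_paths(edges)
print(sorted(paths))
```

The number of simple paths can grow exponentially with the number of edges, so on several thousand rows it may be worth restricting the start nodes or the maximum path length.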
Join two rows iteratively to create a new table in Spark with one row for every two rows
I have a table that I want to step through two rows at a time. How do I create the table below, which goes in ranges of two and shows the first id with the second col b and message, in Spark? The final table will look like this. Answer: In PySpark you can use Window, for example: Output:
Pandas lagged rolling average on aggregate data with multiple groups and missing dates
I’d like to calculate a lagged rolling average on a complicated time-series dataset. Consider the toy example as follows: This results in the following DataFrame: Now I’d like to add a column representing the average weight per fruit for the previous 7 days: wgt_per_frt_prev_7d. It should be defined as the sum of all the fruit weights divided by the sum
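The toy example is not shown in this excerpt, so the sketch below uses hypothetical columns (`group`, `date`, `weight`, `fruits`) and assumes the divisor is the summed fruit count. It relies on a time-based rolling window with `closed="left"`, which covers the 7 days before each row and excludes the row itself, so missing dates need no special handling.

```python
import pandas as pd

# Hypothetical data: two groups, irregular dates.
df = pd.DataFrame({
    "group": ["g", "g", "g", "h", "h"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-03", "2021-01-12", "2021-01-02", "2021-01-04"]
    ),
    "weight": [10.0, 20.0, 5.0, 8.0, 6.0],
    "fruits": [2, 3, 1, 4, 2],
})

# Time-based windows need a sorted datetime index within each group.
ordered = df.sort_values(["group", "date"]).set_index("date")

# Sum weights and counts over the window [date - 7 days, date) per group.
sums = (
    ordered.groupby("group")[["weight", "fruits"]]
    .rolling("7D", closed="left")
    .sum()
    .reset_index()
)

# Ratio of sums; rows with no prior data in the window stay NaN.
sums["wgt_per_frt_prev_7d"] = sums["weight"] / sums["fruits"]
print(sums[["group", "date", "wgt_per_frt_prev_7d"]])
```

This sketch assumes at most one row per group and date; duplicated dates would need to be aggregated first.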
How would I find the longest string per row in a data frame and print the row number if it exceeds a certain amount
I want to write a program which searches through a data frame and if any of the items in it are above 50 characters long, print the row number and ask if you want to continue through the data frame. I tried using this, but I don’t want to drop the rows, just print the row numbers where the strings
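One possible approach, sketched here on hypothetical data, is to compute the string length of every cell and take the per-row maximum, which leaves the frame itself untouched. The interactive "continue?" prompt is left out so the example runs unattended; an `input()` call could be added inside the loop.

```python
import pandas as pd

# Hypothetical frame: rows 1 and 2 each contain a string longer than 50 characters.
df = pd.DataFrame({
    "title": ["short", "x" * 60, "ok"],
    "note": ["fine", "also fine", "y" * 51],
})

LIMIT = 50

# Length of every cell as a string, then the longest length in each row.
lengths = df.astype(str).apply(lambda col: col.str.len())
longest = lengths.max(axis=1)

# Row labels whose longest string exceeds the limit; nothing is dropped.
long_rows = longest[longest > LIMIT].index.tolist()
for row in long_rows:
    print(f"Row {row}: longest string has {longest[row]} characters")
```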
How to convert a 5-level dictionary into a DataFrame?
I have a dictionary with this structure: Level 1: id (int), username (str), meta (contains a string of Kpi_info). This is the dictionary: My desired result is a DataFrame like this: id username Year Month revenue kpi result 206 hantran 2021 1 2000 2100 0 206 hantran 2021 2 2500 2000 1 206 hantran 2022 1 3000 2500 1 206 hantran
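The dictionary itself is not shown in this excerpt, so the nesting below (meta, then `kpi_info`, then year, then month, then metrics) is an assumption made for illustration; only the column names and the first three result rows come from the question. The idea is to walk the nested levels and emit one flat record per year and month.

```python
import pandas as pd

# Hypothetical nesting; the real structure of "meta" is not visible in the excerpt.
data = {
    "id": 206,
    "username": "hantran",
    "meta": {
        "kpi_info": {
            2021: {
                1: {"revenue": 2000, "kpi": 2100, "result": 0},
                2: {"revenue": 2500, "kpi": 2000, "result": 1},
            },
            2022: {
                1: {"revenue": 3000, "kpi": 2500, "result": 1},
            },
        }
    },
}

# Flatten: one row per (year, month), repeating the top-level fields.
rows = []
for year, months in data["meta"]["kpi_info"].items():
    for month, metrics in months.items():
        rows.append({
            "id": data["id"],
            "username": data["username"],
            "Year": year,
            "Month": month,
            **metrics,
        })

df = pd.DataFrame(rows)
print(df)
```

If `meta` actually holds a JSON string rather than a dict, it would need to be parsed with `json.loads` before the loop.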