Skip to content
Advertisement

How to get a date_range and insert them as a ‘list’ to a new column in dataframe?

I have a dataframe with 50k+ rows. df.head(5) is below:

JavaScript

I need to create one more column with a list of months spent between start and finish dates to use df.explode relying on this column and get for every ID with months_used > 1 new row with date of every month the work was in progress.

My primitive way is:

JavaScript

But it does not make any sense, because it gives me strings like this "DatetimeIndex(['2019-03-31', '2019-04-30', '2019-05-31', '2019-06-30'], dtype='datetime64[ns]', freq='M')" and explode doesnt work. If i remove str() i get a ValueError: Must have equal len keys and value when setting with an iterable. And it is very slow…

I even cant fill any column with list-type because i get the same error (i suppose it tries to fill a column with values from the list instead of inserting the list itself to the dataframe…)

expected:

JavaScript

Inside ‘expected’-‘rep_list’ there could be any dates from current month… e.g. [2019-03-01, 2019-04-01, 2019-05-01, 2019-06-01, 2019-07-01] or others. Just need explode to work after that.

Advertisement

Answer

Try:

JavaScript

OR

JavaScript

Now if you print df you will get your desired output

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement