Skip to content
Advertisement

Create multiple new rows per row in data frame

I have the following df:

    ID              data
0   123123          1
1   123412          1
2   123123          1
3   234234          2
4   432424          2

What I want to do know is add x new rows per row based on the id. So more specific spoken, I want to add a new column containing the date from a range of 7 days and then add a new row with the date for every ID in the df.

so the output should look sth like this if we say I want only dates in the range of 2 days:

    ID              data   date
0   123123          1      2020-02-01
0   123123          1      2020-02-02
1   123412          1      2020-02-01
1   123412          1      2020-02-02
2   123123          1      2020-02-01
2   123123          1      2020-02-02
3   234234          2      2020-02-01
3   234234          2      2020-02-02
4   432424          2      2020-02-01
4   432424          2      2020-02-02

I thought of doing is with loops, however this idea seems really slow and not efficient.

df = pd.DataFrame(columns=['ID','data','date'])
for date in pd.date_range(start=dateRange[0],end=dateRange[1]):
    for row in df_old:
        df = df.append({'ID': row['ID'], 'data': row['data'], 'date': date}, ignore_index=True)

I am really thankful for any better idea

Advertisement

Answer

Let us try assign with melt

out = df.assign(**{'date1':'2020-02-01','date2':'2020-02-02'}).melt(['ID','data']).sort_values('ID')
       ID  data variable       value
0  123123     1    date1  2020-02-01
2  123123     1    date1  2020-02-01
5  123123     1    date2  2020-02-02
7  123123     1    date2  2020-02-02
1  123412     1    date1  2020-02-01
6  123412     1    date2  2020-02-02
3  234234     2    date1  2020-02-01
8  234234     2    date2  2020-02-02
4  432424     2    date1  2020-02-01
9  432424     2    date2  2020-02-02
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement