I want to turn this:
age id val 0 99 1 0.3 1 99 2 0.5 2 99 3 0.1
Into this:
age id val 0 25 1 0.3 1 50 1 0.3 2 75 1 0.3 3 25 2 0.5 4 50 2 0.5 5 75 2 0.5 6 25 3 0.1 7 50 3 0.1 8 75 3 0.1
Context: I have data stored with one value coded for all ages (age = 99). However, the application I am developing for needs the value explicitly stated for every id-age pair (id =1, age = 25,50, and 75). There are simple solutions to this: iterate over id’s and append a bunch of dataframes, but I’m looking for something elegant. I’d like to do a many:one merge from my original dataframe to a template containing all the ages, but I would still have to loop over id’s to create the template.
Advertisement
Answer
Don’t know, may be there’s more elegant approach, but you can do something like cross join (or cartesian product):
>>> df = pd.DataFrame({'age':[99,99,99],'id':[1,2,3],'val':[0.3,0.5,0.1]}) >>> df age id val 0 99 1 0.3 1 99 2 0.5 2 99 3 0.1 >>> df2 = pd.DataFrame({'age':[99,99,99],'new_age':[25,50,75]}) >>> df2 = pd.merge(df, df2, on='age') >>> del df2['age'] >>> df2 = df2.rename(columns={'new_age':'age'}) >>> df2 id val age 0 1 0.3 25 1 1 0.3 50 2 1 0.3 75 3 2 0.5 25 4 2 0.5 50 5 2 0.5 75 6 3 0.1 25 7 3 0.1 50 8 3 0.1 75