I have a dataframe with a list of items in the first column, and the items that were bought with each of them in the subsequent columns:
df = pd.DataFrame({'1': ['Item 1', 'Item 1', 'Item 1', 'Item 2', 'Item 2', 'Item 2'],
                   '2': ['Item 4', 'Item 5', 'Item 6', 'Item 7', 'Item 8', 'Item 9'],
                   '3': ['Item 10', 'Item 11', 'Item 12', 'Item 13', 'Item 14', 'Item 15']})
I want to merge all the items bought with each item into a single row as below:
new_df = pd.DataFrame({'1': ['Item 1', 'Item 2'],
                       '2': ['Item 4', 'Item 7'],
                       '3': ['Item 10', 'Item 13'],
                       '4': ['Item 5', 'Item 8'],
                       '5': ['Item 11', 'Item 14'],
                       '6': ['Item 6', 'Item 9'],
                       '7': ['Item 12', 'Item 15']})
So, all the items bought with Item 1 form the columns next to it. As you can see in my example I want to keep all items that were bought with each item, even if they are duplicated.
I have been trying to get it to work with a pandas dataframe, but a list generated for each item would also be fine. I have tried some kind of groupby with a lambda function, but I can’t get it to work.
EDIT: Changed numbers to make it more clear how the final df should be organized.
Thanks!
Answer
TRY:
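# Flatten each group's rows into one long row (row-major order).
# [1:] drops only the first copy of the key; the key repeated in later rows stays.
new_df = df.groupby('1', as_index=False).apply(
    lambda x: pd.Series(x.values.ravel()[1:]))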
OUTPUT:
        1       0        1       2       3        4       5       6        7
0  Item 1  Item 4  Item 10  Item 1  Item 5  Item 11  Item 1  Item 6  Item 12
1  Item 2  Item 7  Item 13  Item 2  Item 8  Item 14  Item 2  Item 9  Item 15
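Note that the output above still repeats the key (Item 1, Item 2) in columns 2 and 5, which the requested layout does not. If that is unwanted, one possible variation is to flatten only the non-key columns per group. This is a minimal sketch, not the answer's original method; it assumes the key column is named '1' as in the question, and the names `value_cols`, `lists` and `wide` are made up for illustration. It also produces the per-item lists the question mentions as an acceptable alternative.

```python
import pandas as pd

df = pd.DataFrame({
    '1': ['Item 1', 'Item 1', 'Item 1', 'Item 2', 'Item 2', 'Item 2'],
    '2': ['Item 4', 'Item 5', 'Item 6', 'Item 7', 'Item 8', 'Item 9'],
    '3': ['Item 10', 'Item 11', 'Item 12', 'Item 13', 'Item 14', 'Item 15'],
})

# Every column except the key holds co-purchased items (assumed layout).
value_cols = [c for c in df.columns if c != '1']

# One flat list per key: read the value columns row by row, keeping duplicates.
lists = {
    key: group[value_cols].to_numpy().ravel().tolist()
    for key, group in df.groupby('1', sort=False)
}

# Turn the lists into a wide frame: one row per key, one column per item.
wide = pd.DataFrame.from_dict(lists, orient='index')
wide.columns = [str(i + 2) for i in range(wide.shape[1])]  # '2', '3', ...
wide = wide.rename_axis('1').reset_index()

print(lists)
print(wide)
```

If the groups have different sizes, `from_dict(..., orient='index')` pads the shorter rows with missing values, so the `lists` dictionary may be the more convenient result in that case.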