I have the two PySpark DataFrames df1 and df2 shown below:
df1

product  04-01  04-02  04-03  04-05  04-06
cycle       12     24     25     17     39
bike        42     15      4     94     03
bycyle     111     23     12     04     95
df2

04-01  04-02  04-03  04-05  04-06
    1      2      3      4      5
I want to multiply each row of df1 by the matching columns of df2's single row. The final output should look like:
result

product  04-01  04-02  04-03  04-05  04-06
cycle       12     48     75     68    195
bike        42     30     12    376     15
bycyle     111     46     36     16    475
Answer
You can do a cross join and multiply the columns using a list comprehension:
result = df1.crossJoin(df2).select(
    'product',
    *[(df1[c] * df2[c]).alias(c) for c in df1.columns[1:]]
)
result.show()
+-------+-----+-----+-----+-----+-----+
|product|04-01|04-02|04-03|04-05|04-06|
+-------+-----+-----+-----+-----+-----+
| cycle| 12| 48| 75| 68| 195|
| bike| 42| 30| 12| 376| 15|
| bycyle| 111| 46| 36| 16| 475|
+-------+-----+-----+-----+-----+-----+
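
Because df2 has only one row, the cross join simply attaches that row's values to every row of df1, and the list comprehension then multiplies each date column of df1 by the column of the same name from df2. For completeness, here is a minimal, self-contained sketch of the setup, assuming the date columns hold integers (the leading zeros such as 03 and 04 in the sample are treated as plain 3 and 4):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Sample data from the question; integer column types are an assumption.
df1 = spark.createDataFrame(
    [("cycle", 12, 24, 25, 17, 39),
     ("bike", 42, 15, 4, 94, 3),
     ("bycyle", 111, 23, 12, 4, 95)],
    ["product", "04-01", "04-02", "04-03", "04-05", "04-06"],
)

df2 = spark.createDataFrame(
    [(1, 2, 3, 4, 5)],
    ["04-01", "04-02", "04-03", "04-05", "04-06"],
)

# Cross join pairs each df1 row with df2's single row,
# then each shared column is multiplied and re-aliased.
result = df1.crossJoin(df2).select(
    'product',
    *[(df1[c] * df2[c]).alias(c) for c in df1.columns[1:]]
)
result.show()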