I have the two PySpark dataframes df1 and df2 shown below:
df1
+-------+-----+-----+-----+-----+-----+
|product|04-01|04-02|04-03|04-05|04-06|
+-------+-----+-----+-----+-----+-----+
|  cycle|   12|   24|   25|   17|   39|
|   bike|   42|   15|    4|   94|    3|
| bycyle|  111|   23|   12|    4|   95|
+-------+-----+-----+-----+-----+-----+

df2
+-----+-----+-----+-----+-----+
|04-01|04-02|04-03|04-05|04-06|
+-----+-----+-----+-----+-----+
|    1|    2|    3|    4|    5|
+-----+-----+-----+-----+-----+
I want to multiply each row of df1 by the matching columns of the single row in df2. The final output should look like this:
result
+-------+-----+-----+-----+-----+-----+
|product|04-01|04-02|04-03|04-05|04-06|
+-------+-----+-----+-----+-----+-----+
|  cycle|   12|   48|   75|   68|  195|
|   bike|   42|   30|   12|  376|   15|
| bycyle|  111|   46|   36|   16|  475|
+-------+-----+-----+-----+-----+-----+
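For reference, a minimal sketch of how the sample dataframes above could be built (the SparkSession setup and the integer column types are assumptions made for reproducibility):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

cols = ["product", "04-01", "04-02", "04-03", "04-05", "04-06"]

# Sample data copied from the tables above
df1 = spark.createDataFrame(
    [("cycle", 12, 24, 25, 17, 39),
     ("bike", 42, 15, 4, 94, 3),
     ("bycyle", 111, 23, 12, 4, 95)],
    cols,
)

# df2 holds a single row of multipliers, one per date column
df2 = spark.createDataFrame([(1, 2, 3, 4, 5)], cols[1:])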
Answer
You can do a cross join and multiply the columns using a list comprehension:
result = df1.crossJoin(df2).select(
    'product',
    *[(df1[c] * df2[c]).alias(c) for c in df1.columns[1:]]
)

result.show()
+-------+-----+-----+-----+-----+-----+
|product|04-01|04-02|04-03|04-05|04-06|
+-------+-----+-----+-----+-----+-----+
|  cycle|   12|   48|   75|   68|  195|
|   bike|   42|   30|   12|  376|   15|
| bycyle|  111|   46|   36|   16|  475|
+-------+-----+-----+-----+-----+-----+
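Since df2 contains only one row, an alternative sketch (under that assumption) is to collect the multipliers to the driver once and apply them as literals, avoiding the cross join entirely:

from pyspark.sql import functions as F

# Collect the single row of multipliers into a plain dict
factors = df2.first().asDict()

# Multiply each df1 column by its corresponding literal factor
result = df1.select(
    'product',
    *[(df1[c] * F.lit(factors[c])).alias(c) for c in df1.columns[1:]]
)
result.show()

With a one-row df2 both approaches give the same result; the cross join keeps everything inside Spark, while the literal version trades a small collect() for a simpler plan.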