How to multiply two PySpark DataFrames row-wise

I have the two PySpark DataFrames below, df1 and df2:

df1
  product   04-01  04-02  04-03  04-05  04-06
   cycle      12     24     25     17    39
   bike       42     15     4      94    03
   bycyle     111    23     12     04    95 


df2 
   04-01  04-02  04-03  04-05  04-06
     1      2      3     4       5

I want to multiply each row of df1 by the single row of df2, column by column. The final output should look like this:

result
  product   04-01  04-02  04-03  04-05  04-06
   cycle      12     48     75     68    195
   bike       42     30     12     376   15
   bycyle     111    46     36     16    475     

Answer

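Because df2 has only one row, a cross join attaches that row to every row of df1 without changing the row count. You can then multiply the matching columns using a list comprehension: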

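result = df1.crossJoin(df2).select(
    'product',
    # df1.columns[1:] skips 'product'; each remaining column is
    # multiplied by the column of the same name in df2.
    *[(df1[c] * df2[c]).alias(c) for c in df1.columns[1:]]
)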

result.show()
+-------+-----+-----+-----+-----+-----+
|product|04-01|04-02|04-03|04-05|04-06|
+-------+-----+-----+-----+-----+-----+
|  cycle|   12|   48|   75|   68|  195|
|   bike|   42|   30|   12|  376|   15|
| bycyle|  111|   46|   36|   16|  475|
+-------+-----+-----+-----+-----+-----+
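
To sanity-check the numbers above without a Spark session, here is a minimal plain-Python sketch of the same idea. It is only a model of the cross join, not PySpark code: the names `df1_rows`, `df2_rows` and `date_cols` are made up for illustration, and the values 03 and 04 from the question are assumed to be the integers 3 and 4.

```python
import itertools

# Date columns shared by both tables (everything except 'product').
date_cols = ["04-01", "04-02", "04-03", "04-05", "04-06"]

# Rows of df1 from the question, as plain dicts (hypothetical stand-in for the DataFrame).
df1_rows = [
    {"product": "cycle", "04-01": 12, "04-02": 24, "04-03": 25, "04-05": 17, "04-06": 39},
    {"product": "bike", "04-01": 42, "04-02": 15, "04-03": 4, "04-05": 94, "04-06": 3},
    {"product": "bycyle", "04-01": 111, "04-02": 23, "04-03": 12, "04-05": 4, "04-06": 95},
]

# df2 holds exactly one row of multipliers.
df2_rows = [{"04-01": 1, "04-02": 2, "04-03": 3, "04-05": 4, "04-06": 5}]

# itertools.product pairs every df1 row with every df2 row, like a cross join.
# With a single df2 row, the number of output rows equals len(df1_rows).
result = [
    {"product": left["product"], **{c: left[c] * right[c] for c in date_cols}}
    for left, right in itertools.product(df1_rows, df2_rows)
]

for row in result:
    print(row)
```

The sketch also shows the limitation of the approach: if df2 had more than one row, the pairing step would produce one output row per (df1 row, df2 row) combination, so the result would no longer have one row per product.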


Source: Stack Overflow