I have a DataFrame that looks like this:
    ID  col
    1   [item1 -> 0.2, Item2 -> 0.3, item3 -> 0.4]
    2   [item2 -> 0.1, Item2 -> 0.7, item3 -> 0.2]
I want to sum all the decimal values in each row and store the result in a new column:
    ID  col                                          total
    1   [item1 -> 0.2, Item2 -> 0.3, item3 -> 0.4]   0.9
    2   [item2 -> 0.1, Item2 -> 0.7, item3 -> 0.2]   1.0
My approach
    df = df.withColumn('total', F.expr('aggregate(map_values(col),0,(acc,x) -> acc + x)'))
This does not work; the error says it can only be applied to int.
Answer
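    from pyspark.sql import functions as func

    # Extract the map's values into an array column, then fold that array
    # into a sum, starting from a double 0 instead of an int 0.
    (
        data_sdf
        .withColumn('map_vals', func.map_values('col'))
        .withColumn('sum_of_vals', func.expr('aggregate(map_vals, cast(0 as double), (x, y) -> x + y)'))
    )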
Since your values are floating-point, the initial value passed to aggregate should match the type of the values in the array. Casting the initial 0 to double, instead of using a bare 0, should work.
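The same idea can also be written as a single expression, without the intermediate array column. The following is only a sketch: `map_sum_expr` and `demo` are hypothetical helper names (not part of PySpark), the sample rows are copied from the question, and it assumes a Spark version whose SQL supports `aggregate` and `map_values`.

```python
def map_sum_expr(map_col: str, value_type: str = "double") -> str:
    """Build a Spark SQL expression that sums the values of a map column.

    The initial accumulator is cast to `value_type` so that it matches the
    type of the map values; a bare 0 would be an int.
    """
    return (
        f"aggregate(map_values({map_col}), "
        f"cast(0 as {value_type}), "
        "(acc, x) -> acc + x)"
    )


def demo() -> None:
    """Run the expression on rows shaped like the question's data."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame(
        [
            (1, {"item1": 0.2, "Item2": 0.3, "item3": 0.4}),
            (2, {"item2": 0.1, "Item2": 0.7, "item3": 0.2}),
        ],
        ["ID", "col"],
    )
    # Add the per-row total as a new column and print the result.
    df.withColumn("total", F.expr(map_sum_expr("col"))).show(truncate=False)
    spark.stop()
```

Calling `demo()` on a machine with PySpark installed should print both rows with an extra `total` column. Because the values are doubles, the totals may show small floating-point rounding rather than exactly 0.9 and 1.0.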