I have a Spark DataFrame with a column that holds array values:
| data | arraydata |
| ---- | --------- |
| text | [0,1,2,3] |
| page | [0,1,4,3] |
I want to replace each of the values 0, 1, 2, 3, 4 in the arrays with the corresponding label: negative, positive, name, sequel, odd.
Answer
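from pyspark.sql.functions import array, expr, lit, map_from_entries, struct

# Lookup table from integer code to label
mapping = {0: "negative", 1: "positive", 2: "name", 3: "sequel", 4: "odd"}
# Build a map column from the Python dict
mapping_column = map_from_entries(array(*[struct(lit(k), lit(v)) for k, v in mapping.items()]))

# Parentheses let the method chain span several lines
df = (
    df.withColumn("mapping", mapping_column)
    # Look up every array element in the map column
    .withColumn("arraydatav2", expr("transform(arraydata, x -> element_at(mapping, x))"))
    .drop("mapping")
)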
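To make the expected result concrete, here is a minimal plain-Python sketch of the same per-element lookup applied to the two sample rows from the question. It is only an illustration and not Spark code: `rows`, `relabel`, and `result` are hypothetical names, and returning `None` for a code that is missing from the mapping is an assumption about how unknown values should be handled.

```python
# Plain-Python illustration of the per-element lookup; no Spark required.
mapping = {0: "negative", 1: "positive", 2: "name", 3: "sequel", 4: "odd"}

# The two sample rows from the question, as (data, arraydata) pairs.
rows = [("text", [0, 1, 2, 3]), ("page", [0, 1, 4, 3])]


def relabel(codes):
    """Replace each integer code with its label; unknown codes become None (assumption)."""
    return [mapping.get(code) for code in codes]


# Apply the lookup to every row, keeping the data column unchanged.
result = [(data, relabel(codes)) for data, codes in rows]

for data, labels in result:
    print(data, labels)
```

The new column produced by the Spark code should contain the same labels, in the same order, as the lists built here.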