
Extract first fields from struct columns into a dictionary

I need to create a dictionary from a Spark DataFrame's schema, which is of type pyspark.sql.types.StructType.

The code needs to go through the entire StructType and pick out only those StructField elements whose data type is itself a StructType. In the resulting dictionary, the key is the name of the parent StructField and the value is the name of its first nested (child) StructField only.

Example schema (StructType):


Desired result:



Answer

You can use a dictionary comprehension that iterates over the fields of the schema.


Test #1


Test #2

User contributions licensed under: CC BY-SA