
Pandas – expanding several values into new columns with some column name manipulation

I’m new to pandas.
Suppose you have a pandas DataFrame with columns like the following:

user_id | timestamp | foo_name1 | foo_name2 | foo_name3

As we can see, the DataFrame has several metadata columns holding raw string values (user_id, timestamp)
and several dynamically named columns, each holding a JSON string (foo_name1..foo_name3).

Example of the JSON structure within the foo_name1 column (the hierarchy is fixed; the dict keys and values may vary):

{"foo_att1": "foo_value1","foo_att2": "foo_value2"}

So I want to end up with this kind of expanded DataFrame structure instead:

user_id | timestamp | foo_name1-foo_att1 | foo_name1-foo_att2 | foo_name2-foo_att1 | foo_name2-foo_att2

where foo_name1-foo_att1 will have the value "foo_value1",
foo_name1-foo_att2 will have the value "foo_value2", etc.

How can I achieve this using pandas actions?


Answer

  1. I synthesised the structure you defined.
  2. Using pd.concat(axis=1) together with pd.json_normalize() gets you to your answer.
  3. A dict comprehension is additionally used to name the columns as per your requirement.
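
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the original listing: the sample values, the helper name expand_json_column and the explicit meta_cols / json_cols lists are assumptions made for the example, and it assumes every cell in the foo_name columns holds a valid JSON object string.

```python
import json

import pandas as pd

# Step 1: synthetic data in the shape described in the question.
# All values here are made up for illustration.
df = pd.DataFrame(
    {
        "user_id": ["u1", "u2"],
        "timestamp": ["2021-01-01 00:00:00", "2021-01-02 00:00:00"],
        "foo_name1": [
            '{"foo_att1": "a1", "foo_att2": "a2"}',
            '{"foo_att1": "b1", "foo_att2": "b2"}',
        ],
        "foo_name2": [
            '{"foo_att1": "c1", "foo_att2": "c2"}',
            '{"foo_att1": "d1", "foo_att2": "d2"}',
        ],
        "foo_name3": [
            '{"foo_att1": "e1", "foo_att2": "e2"}',
            '{"foo_att1": "f1", "foo_att2": "f2"}',
        ],
    }
)

meta_cols = ["user_id", "timestamp"]               # kept unchanged
json_cols = ["foo_name1", "foo_name2", "foo_name3"]  # expanded


def expand_json_column(frame, col):
    """Hypothetical helper: expand one JSON-string column into prefixed columns."""
    parsed = frame[col].apply(json.loads)       # Series of dicts
    flat = pd.json_normalize(parsed.tolist())   # one column per JSON key
    flat.index = frame.index                    # realign with the source rows
    # Step 3: a dict comprehension builds the "<column>-<key>" names.
    return flat.rename(columns={key: f"{col}-{key}" for key in flat.columns})


# Step 2: place the metadata and every expanded column side by side.
expanded = pd.concat(
    [df[meta_cols]] + [expand_json_column(df, col) for col in json_cols],
    axis=1,
)

print(expanded.columns.tolist())
```

With this sample data, expanded keeps user_id and timestamp and gains columns named foo_name1-foo_att1, foo_name1-foo_att2, and so on for each JSON column.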

output


supplementary – pick columns that are candidates

A better approach to picking the columns to be normalised.
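
One way to pick the candidate columns automatically, rather than listing them by hand, is to test whether every value in a column parses as a JSON object. The sketch below is an assumption about what such a selection could look like, not the original listing; the helper names is_json_object and json_candidate_columns are hypothetical.

```python
import json

import pandas as pd


def is_json_object(value):
    """True if value is a string that parses to a JSON object (a dict)."""
    if not isinstance(value, str):
        return False
    try:
        return isinstance(json.loads(value), dict)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def json_candidate_columns(frame):
    """Names of columns in which every non-null value is a JSON object string."""
    candidates = []
    for col in frame.columns:
        values = frame[col].dropna()
        if len(values) > 0 and values.map(is_json_object).all():
            candidates.append(col)
    return candidates


# Made-up sample frame in the shape described in the question.
df = pd.DataFrame(
    {
        "user_id": ["u1", "u2"],
        "timestamp": ["2021-01-01 00:00:00", "2021-01-02 00:00:00"],
        "foo_name1": ['{"foo_att1": "a1"}', '{"foo_att1": "b1"}'],
        "foo_name2": ['{"foo_att1": "c1"}', '{"foo_att1": "d1"}'],
    }
)

json_cols = json_candidate_columns(df)
print(json_cols)
```

If the naming convention is reliable, selecting by name, for example with df.filter(like="foo_name").columns, is a simpler alternative; the content-based check is useful when the dynamic column names are not known in advance.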

User contributions licensed under: CC BY-SA