
Pandas split list upon DataFrame creation

I have a JSON file coming in, which I am doing some operations/trimming on.

The result looks like this:

print("User:", user)
> User: {'id': 1, 'label': 'female', 'position': {'lat': 47.72485566, 'lon': 10.32219439}, 'confidence': 0.8}

When applying df = pd.DataFrame(user, index=[0]) I get the following DataFrame:

     id   label    position  confidence
0    1    female   NaN       0.8

When applying df = pd.DataFrame(user) I get:

      id   label    position     confidence
lat   1    female   47.72485566  0.8
lon   1    female   10.32219439  0.8

I understand why that happens, but neither result is what I want.

I’d like the following:

     id   label    lat          lon           confidence
0    1    female   47.72485566  10.32219439   0.8

However, I am not sure what the best way is to split the position field into separate columns.

Answer

You can just use pandas.json_normalize, which flattens the nested position dict into position.lat and position.lon columns, then rename those columns:

>>> df = pd.json_normalize({'id': 1, 'label': 'female', 'position': {'lat': 47.72485566, 'lon': 10.32219439}, 'confidence': 0.8})
>>> df = df.rename(columns={'position.lat': 'latitude', 'position.lon': 'longitude'})

OUTPUT

   id   label  confidence   latitude  longitude
0   1  female         0.8  47.724856  10.322194
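
If you want exactly the column names and order shown in the question (lat and lon sitting between label and confidence), a minimal sketch could look like the following. It assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize, and it reuses the user dict from the question. The column order is set by hand here, which is just one way to do it.

```python
import pandas as pd

# The dict from the question, with the coordinates nested under "position".
user = {
    "id": 1,
    "label": "female",
    "position": {"lat": 47.72485566, "lon": 10.32219439},
    "confidence": 0.8,
}

# Flatten the nested dict; nested keys become "position.lat" and "position.lon".
df = pd.json_normalize(user)

# Rename the flattened columns to the short names used in the question.
df = df.rename(columns={"position.lat": "lat", "position.lon": "lon"})

# json_normalize puts the flattened columns last, so reorder them explicitly.
df = df[["id", "label", "lat", "lon", "confidence"]]

print(df)
```

Selecting the columns with a list both reorders them and makes the expected layout explicit, so a missing key in the incoming JSON raises a KeyError instead of silently producing a different shape.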
User contributions licensed under: CC BY-SA