
Converting dictionary into dataframe

Hello, I am trying to convert a dictionary into a dataframe. The dictionary contains the results of a search on Amazon (I am using an API). I would like each product to be a row in the dataframe, with the keys as column headers. However, there are some keys at the beginning that I do not want in the table.

Below, I convert the JSON into a dictionary, which I would then like to convert into a dataframe.

JavaScript

Here is part of the data in the dictionary (2 out of 25 products in total).

JavaScript

The dataframe should look something like the one below, although with more columns:

JavaScript

I have already tried the answers from this question, but they did not work: Convert Python dict into a dataframe


Answer

  • json_normalize is no longer imported from pandas.io.json. It is now in the top-level namespace.
    • Update your pandas to the current version with pip or conda, depending on your environment.
  • Most of the required information is in the 'search_results' key, but 'search_term' is nested in 'request_parameters', so that key path must be passed as a list to the meta parameter of pandas.json_normalize.
  • The information in the 'prices' column seems to overlap with existing data in other columns.
    • The column has been normalized below, but it could just be dropped, since there is no new information in it.
  • Unneeded columns can be removed with pandas.DataFrame.drop, or use pandas.DataFrame.loc to select only the needed columns, as shown below.
  • As per the timing analysis for this question, df.join(pd.DataFrame(df.pop(col).values.tolist())) is the fastest way to normalize a single level dict from a column and join it back to the main dataframe, but this answer shows how to deal with columns that are problematic (e.g. result in errors when trying .values.tolist()).
JavaScript
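
The points above can be sketched roughly as follows. This is a minimal illustration, not the exact code from the answer: only the 'request_parameters', 'search_term', 'search_results' and 'prices' keys come from the post, while the sample values, the other keys ('request_info', 'title', 'asin', 'value', 'currency') and the assumption that 'prices' holds a list of dicts are made up for the example.

```python
import pandas as pd

# Hypothetical sample shaped like the API response described in the question.
# Only 'request_parameters', 'search_term', 'search_results' and 'prices'
# are taken from the post; every other key and value is invented.
data = {
    "request_info": {"success": True},
    "request_parameters": {"search_term": "memory cards"},
    "search_results": [
        {"title": "Card A", "asin": "A001",
         "prices": [{"value": 9.99, "currency": "USD"}]},
        {"title": "Card B", "asin": "A002",
         "prices": [{"value": 19.99, "currency": "USD"}]},
        {"title": "Card C", "asin": "A003"},  # no 'prices' key at all
    ],
}

# One row per product; the nested search term is pulled in through `meta`,
# given as a list of key paths.
df = pd.json_normalize(
    data,
    record_path="search_results",
    meta=[["request_parameters", "search_term"]],
)

# Assumption: 'prices' is a list of dicts (or NaN when the key is missing).
# explode() turns each list element into its own row.
df = df.explode("prices", ignore_index=True)

# Replace missing entries with empty dicts so the column can be expanded
# without errors, then join the expanded columns back to the dataframe.
prices = df.pop("prices").apply(lambda p: p if isinstance(p, dict) else {})
df = df.join(pd.DataFrame(prices.tolist()).add_prefix("prices."))

# Keep only the needed columns with .loc.
wanted = ["request_parameters.search_term", "title", "asin",
          "prices.value", "prices.currency"]
result = df.loc[:, wanted]
print(result)
```

If the 'prices' data really does duplicate other columns, the explode and join steps can be skipped and the column removed with df.drop(columns='prices') instead.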