How to convert a JSON DataFrame to a normal DataFrame?

Question

I have a DataFrame that contains a lot of JSON data. For example, there are two types of data: strain sensor and acceleration sensor. I want to parse this JSON data and convert it to a normal form. I only need the data part of the JSON objects. In the result I should have 4 columns, one for each value in Data. I tried json_n…

Accepted Answer

If the input data is in a JSON file with one JSON object per line, use:

    import pandas as pd

    cols = ['Date','x','y','z']
    # lines=True reads the file as one JSON object per line
    df = pd.DataFrame(pd.read_json('json.json', lines=True)['data'].tolist(), columns=cols)
    df['Date'] = pd.to_datetime(df['Date'], unit='s')
    print(df)

                               Date         x         y         z
    0 2020-10-21 06:18:43.328814030  0.171875 -0.960938  0.023438
    1 2020-10-21 06:18:45.060513735  0.085938 -0.984375  0.000000
    2 2020-10-21 06:18:46.353275299  0.964979       NaN       NaN
    3 2020-10-21 06:18:47.698888779  0.039062 -1.000000  0.125000
    4 2020-10-21 06:18:48.853050232  0.078125 -0.992188  0.000000

If the input is a DataFrame with a column named col:

    cols = ['Date','x','y','z']
    df = pd.DataFrame(pd.json_normalize(df['col'])['data'].tolist(), columns=cols)
    df['Date'] = pd.to_datetime(df['Date'], unit='s')
    print(df)

                               Date         x         y         z
    0 2020-10-21 06:18:43.328814030  0.171875 -0.960938  0.023438
    1 2020-10-21 06:18:45.060513735  0.085938 -0.984375  0.000000
    2 2020-10-21 06:18:46.353275299  0.964979       NaN       NaN
    3 2020-10-21 06:18:47.698888779  0.039062 -1.000000  0.125000
    4 2020-10-21 06:18:48.853050232  0.078125 -0.992188  0.000000

EDIT:

Personally, I think saving a CSV file with an .xls extension is not a good idea, because read_excel then raises a confusing error, but you can use:

    import ast

    # The file is really a CSV, so read it with read_csv despite the extension
    df = pd.read_csv('15-10-2020-OO.xls')
    cols = ['Date','x','y','z']
    # Parse each string in the Data column into a dict and keep its 'data' list
    data = [x['data'] for x in df['Data'].apply(ast.literal_eval)]
    df = pd.DataFrame(data, columns=cols)
    df['Date'] = pd.to_datetime(df['Date'], unit='s')
    print(df)

                                  Date         x         y         z
    0    2020-10-15 07:21:16.159236193  0.085938 -0.972656  0.003906
    1    2020-10-15 07:21:17.597931385  0.089844 -0.968750  0.003906
    2    2020-10-15 07:21:18.838171959  0.089844 -0.972656  0.003906
    3    2020-10-15 07:21:20.338105917  0.085938 -0.972656  0.003906
    4    2020-10-15 07:21:21.768864155  0.089844 -0.984375  0.003906
                                   ...       ...       ...       ...
    8457 2020-10-15 08:59:57.907007933  0.085938 -0.972656  0.003906
    8458 2020-10-15 08:59:58.371274233  0.089844 -0.976562  0.003906
    8459 2020-10-15 08:59:58.833237648  0.085938 -0.976562  0.003906
    8460 2020-10-15 08:59:59.313337088  1.517057       NaN       NaN
    8461 2020-10-15 08:59:59.863240004  0.089844 -0.968750  0.007812

    [8462 rows x 4 columns]
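
To try the approach without the original file, here is a minimal self-contained sketch. The two sample rows and the 'sensor' key are made up for illustration (the real file may be shaped differently); only the 'Data' column name and the 'data' key come from the answer above. It shows that a row with fewer values (one strain reading) ends up with NaN in the remaining columns.

```python
import ast

import pandas as pd

# Hypothetical sample rows: each cell is the string form of a dict whose
# 'data' list holds a Unix timestamp followed by one or three readings.
raw = pd.DataFrame({
    "Data": [
        "{'sensor': 'acceleration', 'data': [1603261123.5, 0.171875, -0.960938, 0.023438]}",
        "{'sensor': 'strain', 'data': [1603261126.25, 0.964979]}",
    ]
})

cols = ["Date", "x", "y", "z"]

# Parse each string into a dict, then keep only its 'data' list.
records = [d["data"] for d in raw["Data"].apply(ast.literal_eval)]

# Shorter lists are padded with NaN when the frame is built.
out = pd.DataFrame(records, columns=cols)

# The first value of each list is seconds since the Unix epoch.
out["Date"] = pd.to_datetime(out["Date"], unit="s")
print(out)
```

If the column already holds dicts rather than strings, the ast.literal_eval step can be dropped and the list comprehension applied to the column directly.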

Answer