Skip to content
Advertisement

Pandas To_Excel parsing problem – outputting only 1 file

Hello I have working code like this:

import pandas as pdfrom pandas.io.json import json_normalize
import json
import warnings
warnings.filterwarnings('ignore')

with open('yieldfull.json') as file:
    data = json.load(file)

df_json = json_normalize(data)
df_json_stripped = data[0]
platform_dict = df_json_stripped['result']
platform_names = []

for key in platform_dict:
    platform_names.append(key)

for name in platform_names:
    if name == 'Autofarm':
        vault_name_df = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.LPVaults.vaults'].items()]))[0])['name']
        current_token_0 = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.LPVaults.vaults'].items()]))[0])['LPInfo.currentToken0']
        current_token_1 = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.LPVaults.vaults'].items()]))[0])['LPInfo.currentToken1']
        df_json = pd.DataFrame({'Vault_Name':vault_name_df, 'Current_Token_0':current_token_0 , 'Current_Token_1':current_token_1})
        df_json.to_excel('Output_'+name+'.xlsx', index = False)
        platform_names.remove(name)
    elif name == 'Acryptos':
        vault_name_df = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.vaults.vaults'].items()]))[0])['name']
        price_USD = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.vaults.vaults'].items()]))[0])['priceInUSDDepositToken']
        current_token_0 = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.vaults.vaults'].items()]))[0])['currentTokens']
        deposited_token = json_normalize(pd.DataFrame(dict([(k, pd.Series(v)) for k,v in df_json['result.'+name+'.vaults.vaults'].items()]))[0])['depositedTokens']
        df_json = pd.DataFrame({'Vault_Name':vault_name_df, 'Price_USD':price_USD, 'Current_Token_0':current_token_0, 'Deposited_Token':deposited_token})
        df_json.to_excel('Output_'+name+'.xlsx', index = False)
    else:
        pass

Problem is: If I leave it like this it only outputs for first if. When I comment out that if section it will successfully output elif, but I can’t get it to output 2 files whatever I do. Any ideas?

Error I’m getting for Acryptos:

Traceback (most recent call last):
  File "C:UsersAdamPycharmProjectsScrapy_Thingsvenvlibsite-packagespandascoreindexesbase.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas_libsindex.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas_libsindex.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas_libshashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas_libshashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'result.Acryptos.vaults.vaults'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/Adam/PycharmProjects/Scrapy_Things/yieldwatch/yieldwatch/spiders/JsonExcel.py", line 27, in <module>
    vault_name_df = json_normalize(pd.DataFrame(dict([(k , pd.Series(v)) for k,v in df_json['result.'+name+'.vaults.vaults'].items()]))[0])['name']
  File "C:UsersAdamPycharmProjectsScrapy_Thingsvenvlibsite-packagespandascoreframe.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:UsersAdamPycharmProjectsScrapy_Thingsvenvlibsite-packagespandascoreindexesbase.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 'result.Acryptos.vaults.vaults'

But if I comment out Autofarm and just process if for Acryptos is outputs excel just fine.

Advertisement

Answer

please remove the below line from your code

platform_names.remove(name)

debug code:

platform_names=['Autofarm','Acryptos']
for name in platform_names:
    if name == 'Autofarm':
        print("Autofarm")
        #platform_names.remove(name) # remove this line
    elif name == "Acryptos":
        print("Acryptos")

you have initially created

df_json = json_normalize(data)

and also in loop, you are overwriting it –>

df_json = pd.DataFrame({'Vault_Name':vault_name_df, 'Current_Token_0':current_token_0 , 'Current_Token_1':current_token_1}) 
df_json.to_excel('Output_'+name+'.xlsx', index = False) 

so change the name in loop and it will be okay.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement