Create a DataFrame from a list of lists (Pandas)

I'm having trouble creating a DataFrame from my list.

Each row in the list has four values, but pandas reports that the data has only one column:

ValueError: 4 columns passed, passed data had 1 columns.

The list itself is presented in this way:

[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]

I know it has something to do with the double [], but I can't figure it out. Can someone help me?

Here is the code so far:

for i in range(6):
    excel_file = pd.read_excel(input_file, sheet_name=sheet[i])
    excel_file = excel_file.values.tolist()

    filtered = [x for x in excel_file if 'TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)' in x
                or 'TOTAL DAS DESPESAS DE CUSTEIO (A)' in x
                ]

    sheet_file = sheet[i]
    sheet_variable.append(sheet_file)
    wb_name.append(file_name)
    conab_data.append(filtered)

    print(filtered)

df_conab = pd.DataFrame(conab_data, columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
df_conab['Local/UF/Ano'] = sheet_variable
df_conab['Fonte'] = wb_name

print(df_conab)

Answer

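Each filtered is itself a list of rows, so conab_data.append(filtered) adds an extra level of nesting. pandas then treats every one-element sublist as a row with a single column, which is why it complains about 1 column instead of 4. You could fix this with a for loop that replaces each sublist with the row it contains:

import pandas as pd

overly_nested = [[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]],
                 [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]]

# Replace each one-element sublist with the single row it wraps (in place)
for i, sub_list in enumerate(overly_nested):
    overly_nested[i] = sub_list[0]

df = pd.DataFrame(overly_nested)
print(df)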
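
As an alternative to the loop above, here is a minimal sketch that removes one level of nesting with itertools.chain.from_iterable and passes the question's column names directly. The two-row conab_data literal is just a shortened stand-in for the real data, and the approach assumes every inner list holds complete four-value rows.

```python
from itertools import chain

import pandas as pd

# Shortened stand-in for the question's conab_data: one list of rows per sheet.
conab_data = [
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
]

# Remove exactly one level of nesting: list of lists of rows -> list of rows.
# Sheets with zero or several matching rows are handled as well.
rows = list(chain.from_iterable(conab_data))

df_conab = pd.DataFrame(rows, columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
print(df_conab)
```

In the question's loop, calling conab_data.extend(filtered) instead of conab_data.append(filtered) should avoid the nesting in the first place. Note that sheet_variable and wb_name would then need one entry per matched row for the later column assignments to line up.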

I'm sure there's a way to do this with zip(); let me experiment and I'll edit if I find it.
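
One possible zip() version, offered as a sketch rather than a recommendation: unpacking the outer list into zip() groups the sublists by position, and since each sublist here holds exactly one row, zip() yields a single tuple containing all the rows. This assumes every sublist has exactly one element. An empty sublist makes the unpacking fail, and sublists of mixed lengths are silently truncated to the shortest.

```python
# Shortened stand-in for the nested data shown in the question.
overly_nested = [
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
    [['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
]

# zip(*overly_nested) pairs up the first element of every sublist.
# With one element per sublist it yields exactly one tuple, unpacked here.
(first_elements,) = zip(*overly_nested)

# Convert the tuple of rows to a list; it can be passed to pd.DataFrame as is.
rows = list(first_elements)
print(rows)
```

The resulting rows list can be given to pd.DataFrame together with the four column names from the question.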
