I am trying to create a list of DataFrames in a for loop, and then concatenate them outside of the for loop and write to excel. I have gotten the code to be close, but for some reason the column created inside the for loop ends up being the final value rather each individual value as it loops through. I am sure this is extremely simple and I’m just missing something easy but cannot seem to figure it out tonight. Here is some dummy code to illustrate and help with problem solving. At the end, I’d like an excel output with wks 1 – 9. However, in my actual output all values in the wk column are equal to 9.
import pandas as pd sample = [] area = pd.Series({'California':423967, 'Texas':695662}) pop = pd.Series({'California':38332521, 'Texas':26448193}) data = pd.DataFrame({'area':area, 'pop':pop}) print(data) for i in range(10): data['wk'] = i sample.append(data) print(data) res = pd.concat(sample) res.to_excel("Concatenating DataFrames.xlsx", index=False)
Advertisement
Answer
The problem lies here that when you append the data to list it does not create the separate memory for it rather they all refer to the same memory that’s why after last updation in the loop it becomes 9 or final value.
just replace
sample.append(data)
with
sample.append(data.copy())
.copy() will enforce pandas to create new memory for each df in loop