I’m trying to build a DataFrame using a for loop. The starting point below works perfectly:
    import pandas as pd

    df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
    for i in range(2):
        this_column = df.columns[i]
        df[this_column] = [i, i+1]
    df
And I got the correct result.
Then I tried to write my implementation as below:
    import pandas as pd

    df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
    SLS = [58, 100]
    row = 0
    for _, slc in enumerate(SLS):
        for single_date in daterange(start_date, end_date):
            df[row] = [single_date.strftime("%Y-%m-%d"), slc]
            row = row + 1
    print(type(row), type(df))
    df
But the result I got was a horizontal DataFrame, not a vertical one. Even the values under the original headers showed up as NaN. Why?
I also tried declaring the column types explicitly, but it gave the same result:
    import pandas as pd
    import numpy as np

    # Create empty DataFrame with specific column names & types
    # Using NumPy
    dtypes = np.dtype(
        [
            ('DATE1', np.datetime64),
            ('SLS_CNTR_ID', int),
        ]
    )
    df = pd.DataFrame(np.empty(0, dtype=dtypes))
    #df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])
    print(df)

    SLS = [58, 100]
    row = 0
    for _, slc in enumerate(SLS):
        for single_date in daterange(start_date, end_date):
            df[row] = [single_date.strftime("%Y-%m-%d"), slc]
            row = row + 1
    print(type(row), type(df))
    df
Answer
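Use df.loc[row] instead of df[row] to set the rows. Assigning to df[row] creates a new column labelled row, which is why the frame grew horizontally and the original columns were filled with NaN. Assigning to df.loc[row] sets the row with that index label, adding it if it does not exist yet.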
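As a minimal sketch of that fix, here is the loop from the question with only the assignment changed. The question's daterange, start_date and end_date are not shown, so this version substitutes pd.date_range over a short, assumed three-day range purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(columns=["DATE1", "SLS_CNTR_ID"])
SLS = [58, 100]

# Assumption: a short sample range standing in for the question's
# daterange(start_date, end_date), which is not shown in the post.
dates = pd.date_range("2020-01-01", "2020-01-03")

row = 0
for slc in SLS:
    for single_date in dates:
        # .loc with a label that does not exist yet adds a new row
        df.loc[row] = [single_date.strftime("%Y-%m-%d"), slc]
        row += 1

print(df)
```

Growing a frame one row at a time copies data on every step, so this is mainly suitable for small inputs.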
Though I’d rather implement this using a merge instead of the loops:
    (pd.DataFrame({"DATE1": pd.date_range("2020-01-01", "2020-02-01")})
       .merge(pd.Series(SLS, name="SLS_CNTR_ID"), how="cross"))
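A self-contained variant of the cross merge might look like the sketch below. The three-day range and the names dates, centers, crossed and result are assumptions made for the example. Note that a cross merge pairs each date with every SLS value in turn, so the rows come out date by date; sorting afterwards reproduces the order the nested loops would give. how="cross" needs pandas 1.2 or newer.

```python
import pandas as pd

SLS = [58, 100]

# Assumption: a short sample range; substitute the real start and end dates.
dates = pd.DataFrame({"DATE1": pd.date_range("2020-01-01", "2020-01-03")})
centers = pd.DataFrame({"SLS_CNTR_ID": SLS})

# Every row of `dates` is paired with every row of `centers`
crossed = dates.merge(centers, how="cross")

# Optional: reorder so all dates for one SLS value are grouped together
result = crossed.sort_values(["SLS_CNTR_ID", "DATE1"], ignore_index=True)

print(result)
```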
Or leverage itertools to obtain the cross-product:
    import itertools

    dates = pd.date_range("2020-01-01", "2020-02-01")
    SLS = [58, 100]
    pd.DataFrame(itertools.product(SLS, dates), columns=["SLS_CNTR_ID", "DATE1"])

        SLS_CNTR_ID      DATE1
    0            58 2020-01-01
    1            58 2020-01-02
    2            58 2020-01-03
    3            58 2020-01-04
    4            58 2020-01-05
    ..          ...        ...
    59          100 2020-01-28
    60          100 2020-01-29
    61          100 2020-01-30
    62          100 2020-01-31
    63          100 2020-02-01

    [64 rows x 2 columns]