My dataframe is adding columns instead of rows

I’m trying to build a DataFrame using a for loop. The starting point below works perfectly:

import pandas as pd
df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])

for i in range(2):
  this_column = df.columns[i]
  df[this_column] = [i, i+1]

df

And I got the correct result.

Then I tried to write my actual implementation as below:

import pandas as pd
df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])

SLS = [58, 100]

row = 0
for _, slc in enumerate(SLS):
  for single_date in daterange(start_date, end_date):
    df[row] = [single_date.strftime("%Y-%m-%d"), slc]
    row = row + 1

print(type(row), type(df))
df

But the result I got was a horizontal DataFrame, not a vertical one: each iteration added a column instead of a row.

Even the original header columns (DATE1 and SLS_CNTR_ID) were filled with NaN. Why?

I also tried declaring the column types explicitly, but got the same result:

import pandas as pd
import numpy as np
#Create empty DataFrame with specific column names & types
# Using NumPy
dtypes = np.dtype(
    [
        ('DATE1',np.datetime64),
        ('SLS_CNTR_ID', int),     
    ]
)
df = pd.DataFrame(np.empty(0, dtype=dtypes))
#df = pd.DataFrame(columns=['DATE1', 'SLS_CNTR_ID'])

print(df)

SLS = [58, 100]

row = 0
for _, slc in enumerate(SLS):
  for single_date in daterange(start_date, end_date):
    df[row] = [single_date.strftime("%Y-%m-%d"), slc]
    row = row + 1

print(type(row), type(df))
df

Answer

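Use df.loc[row] instead of df[row] to set the rows. df[row] = ... assigns a column labelled row, which is why the frame grows sideways; the original DATE1 and SLS_CNTR_ID columns are never assigned, so they are filled with NaN.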
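The difference between the two indexers can be sketched as follows. The question does not show `daterange`, `start_date` or `end_date`, so the versions here are hypothetical stand-ins (an end-exclusive generator and three arbitrary days); only the `df.loc[row]` line is the actual fix.

```python
from datetime import date, timedelta

import pandas as pd


def daterange(start, end):
    """Hypothetical stand-in for the question's helper (assumed end-exclusive)."""
    for offset in range((end - start).days):
        yield start + timedelta(days=offset)


start_date = date(2020, 1, 1)  # assumed value, not from the question
end_date = date(2020, 1, 4)    # assumed value, not from the question
SLS = [58, 100]

# df[key] = values assigns a COLUMN labelled `key`.
wide = pd.DataFrame(columns=["DATE1", "SLS_CNTR_ID"])
wide[0] = ["2020-01-01", 58]  # new column 0; DATE1 and SLS_CNTR_ID stay NaN

# df.loc[key] = values assigns a ROW labelled `key`.
df = pd.DataFrame(columns=["DATE1", "SLS_CNTR_ID"])
row = 0
for slc in SLS:
    for single_date in daterange(start_date, end_date):
        df.loc[row] = [single_date.strftime("%Y-%m-%d"), slc]
        row += 1

print(wide.shape)  # two rows, three columns: the frame grew sideways
print(df)
```

Growing a frame one row at a time with df.loc works, but it gets slow for large ranges, which is one reason to prefer the vectorized alternatives below.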

That said, I’d rather build this with a cross merge than with loops (how="cross" requires pandas 1.2 or later):

import pandas as pd

SLS = [58, 100]

(pd.DataFrame({"DATE1": pd.date_range("2020-01-01", "2020-02-01")})
     .merge(pd.Series(SLS, name="SLS_CNTR_ID"), how="cross"))

Or use itertools to obtain the cross-product:

import itertools

import pandas as pd

dates = pd.date_range("2020-01-01", "2020-02-01")
SLS = [58, 100]

pd.DataFrame(itertools.product(SLS, dates), columns=["SLS_CNTR_ID", "DATE1"])

    SLS_CNTR_ID      DATE1
0            58 2020-01-01
1            58 2020-01-02
2            58 2020-01-03
3            58 2020-01-04
4            58 2020-01-05
..          ...        ...
59          100 2020-01-28
60          100 2020-01-29
61          100 2020-01-30
62          100 2020-01-31
63          100 2020-02-01

[64 rows x 2 columns]
