
Create pandas dataframe from multiple sources

I need to create a pandas dataframe using information from two different sources. For example,

for row in df.itertuples():
    c1, c2, c3 = row.c1, row.c2, row.c3
    returnedDict = function(row.c1, row.c2, row.c3)

The first three columns of the dataframe I want should contain c1, c2 and c3, and the remaining columns should come from the keys of returnedDict, which has 100 keys. How can I initialize such a dataframe and append one row to it on each iteration of the for loop?

Expected output

col1   col2   col3   key1                 ...   key100
-----------------------------------------------------------------
c1     c2     c3     returnedDict[key1]   ...   returnedDict[key100]


Answer

If I understand your question correctly, here is one way to do it. It needs Python 3.9 or later, because it merges dictionaries with the | operator:

import pandas as pd

# Toy dataframe
df = pd.DataFrame({"c1": [9, 4, 3], "c2": [2, 7, 1], "c3": [4, 6, 5]})

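def function(c1, c2, c3):
    # Stand-in for the real function, which returns a dict with 100 keys
    return {"key1": c1 * 10, "key2": c2 * 4, "key3": c3 * 2}


# Build one single-row dataframe per input row, then concatenate them once
dfs = []
for row in df.itertuples():
    c1, c2, c3 = row.c1, row.c2, row.c3
    returnedDict = function(c1, c2, c3)
    # dict | dict merges the fixed columns with the returned keys (Python >= 3.9)
    dfs.append(pd.DataFrame({"col1": [c1], "col2": [c2], "col3": [c3]} | returnedDict))

# reset_index gives the rows a clean 0..n-1 index instead of repeated zeros
df = pd.concat(dfs).reset_index(drop=True)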
print(df)
# Output
   col1  col2  col3  key1  key2  key3
0     9     2     4    90     8     8
1     4     7     6    40    28    12
2     3     1     5    30     4    10
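
As a possible variation, you could collect plain dicts in the loop and build the dataframe once at the end, which avoids creating a small dataframe for every row. This is only a sketch under the same toy assumptions: function is the stand-in from above, and the names source, records and result are just illustrative. It merges with {**a, **b}, so it should also run on Python versions older than 3.9.

```python
import pandas as pd

# Toy input with the same shape as the example above
source = pd.DataFrame({"c1": [9, 4, 3], "c2": [2, 7, 1], "c3": [4, 6, 5]})


def function(c1, c2, c3):
    # Stand-in for the real function, which returns a dict with 100 keys
    return {"key1": c1 * 10, "key2": c2 * 4, "key3": c3 * 2}


records = []
for row in source.itertuples(index=False):
    returned = function(row.c1, row.c2, row.c3)
    # One dict per output row: the three fixed columns first, then the returned keys
    records.append({"col1": row.c1, "col2": row.c2, "col3": row.c3, **returned})

# A list of dicts becomes one row per dict, with columns in key order
result = pd.DataFrame(records)
print(result)
```

With the toy data this should give the same six columns and three rows as the output above.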
User contributions licensed under: CC BY-SA