I need to create a pandas dataframe using information from two different sources. For example,
for row in df.itertuples():
    c1, c2, c3 = row.c1, row.c2, row.c3
    returnedDict = function(row.c1, row.c2, row.c3)
The first three columns of the DataFrame I want should contain c1, c2, and c3, and the rest of the columns come from the keys of returnedDict. The number of keys in returnedDict is 100. How can I initialize such a DataFrame and append a row to it on each iteration of the for loop?
Expected output
col1  col2  col3  key1                ...  key100
--------------------------------------------------------------
c1    c2    c3    returnedDict[key1]       returnedDict[key100]
Answer
If I understand your question correctly, here is one way to do it (requires Python >= 3.9.0 for the dict merge operator |):
import pandas as pd

# Toy dataframe
df = pd.DataFrame({"c1": [9, 4, 3], "c2": [2, 7, 1], "c3": [4, 6, 5]})

def function(c1, c2, c3):
    return {"key1": c1 * 10, "key2": c2 * 4, "key3": c3 * 2}

# Build one single-row dataframe per iteration, then concatenate them all
dfs = []
for row in df.itertuples():
    c1, c2, c3 = row.c1, row.c2, row.c3
    returnedDict = function(row.c1, row.c2, row.c3)
    dfs.append(pd.DataFrame({"col1": [c1], "col2": [c2], "col3": [c3]} | returnedDict))

df = pd.concat(dfs).reset_index(drop=True)

print(df)
# Output
#    col1  col2  col3  key1  key2  key3
# 0     9     2     4    90     8     8
# 1     4     7     6    40    28    12
# 2     3     1     5    30     4    10
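As a possible variation on the approach above, you could collect one plain dict per row and build the DataFrame once at the end, which avoids creating a separate one-row DataFrame on every iteration. This is only a sketch: the names rows and result are illustrative, and function is the same toy stand-in used above, not your real function. It also works on Python versions older than 3.9, since it uses ** unpacking instead of the | operator.

```python
import pandas as pd

# Toy dataframe and toy function, same as in the example above
df = pd.DataFrame({"c1": [9, 4, 3], "c2": [2, 7, 1], "c3": [4, 6, 5]})

def function(c1, c2, c3):
    return {"key1": c1 * 10, "key2": c2 * 4, "key3": c3 * 2}

# Collect one dict per row (illustrative name: rows)
rows = []
for row in df.itertuples(index=False):
    returned_dict = function(row.c1, row.c2, row.c3)
    # Fixed columns first, then every key returned by the function
    rows.append({"col1": row.c1, "col2": row.c2, "col3": row.c3, **returned_dict})

# Build the final dataframe in a single call
result = pd.DataFrame(rows)
print(result)
```

With 100 keys in the returned dict, the same loop should yield 103 columns, provided every call returns the same keys; any key missing from a given row would show up as NaN in that row.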