I am working with the pandas library and I want to add two new columns to a dataframe df with n columns (n > 0).
These new columns result from the application of a function to one of the columns in the dataframe.
The function to apply looks like this:
def calculate(x):
    # ...operate...
    return z, y
One method for creating a new column from a function that returns a single value is:
df['new_col'] = df['column_A'].map(a_function)
So, what I want, and tried unsuccessfully (*), is something like:
(df['new_col_zetas'], df['new_col_ys']) = df['column_A'].map(calculate)
What would be the best way to accomplish this? I searched the documentation without finding a clue.
(*) df['column_A'].map(calculate) returns a pandas Series in which each item is a tuple (z, y). Trying to assign this to two dataframe columns raises a ValueError.
Answer
I’d just use zip:
In [1]: from pandas import *
In [2]: def calculate(x):
   ...:     return x*2, x*3
   ...:
In [3]: df = DataFrame({'a': [1,2,3], 'b': [2,3,4]})
In [4]: df
Out[4]:
   a  b
0  1  2
1  2  3
2  3  4
In [5]: df["A1"], df["A2"] = zip(*df["a"].map(calculate))
In [6]: df
Out[6]:
   a  b  A1  A2
0  1  2   2   3
1  2  3   4   6
2  3  4   6   9
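
For reference, here is the same idea as a standalone script, with a short explanation of why the zip call works. This is only a sketch under a few assumptions: the toy calculate function is the one from the session above, and the column names B1 and B2 as well as the second approach (building a small DataFrame from the tuples and joining it) are illustrative additions, not part of the original answer.

```python
import pandas as pd


def calculate(x):
    # Same toy function as in the session above: returns a pair of values.
    return x * 2, x * 3


df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 4]})

# map() yields a Series whose items are (z, y) tuples, one tuple per row.
pairs = df["a"].map(calculate)

# zip(*pairs) transposes the row-wise tuples into one tuple per output
# column, so the result can be unpacked into two column assignments.
df["A1"], df["A2"] = zip(*pairs)

# Alternative sketch (hypothetical column names B1/B2): expand the tuples
# into a two-column frame that shares df's index, then join it on.
expanded = pd.DataFrame(pairs.tolist(), index=df.index, columns=["B1", "B2"])
df = df.join(expanded)

print(df)
```

One thing to keep in mind with the zip form: if the Series is empty, zip(*pairs) produces nothing to unpack and the two-target assignment raises a ValueError, so the join variant may be the safer choice when the dataframe can be empty.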