Pandas: create two new columns in a dataframe with values calculated from a pre-existing column

I am working with the pandas library and I want to add two new columns to a dataframe df with n columns (n > 0).
These new columns result from the application of a function to one of the columns in the dataframe.

The function to apply is like:

def calculate(x):
    ...operate...
    return z, y

One way to create a new column from a function that returns a single value is:

df['new_col'] = df['column_A'].map(a_function)

So what I want, and tried unsuccessfully (*), is something like:

(df['new_col_zetas'], df['new_col_ys']) = df['column_A'].map(calculate)

What is the best way to accomplish this? I searched the documentation but found nothing that helped.

(*) df['column_A'].map(calculate) returns a pandas Series in which each item is a tuple (z, y). Trying to assign this to two dataframe columns produces a ValueError.
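To make the footnote concrete, here is a minimal sketch of the failure. The body of calculate and the three-row dataframe are stand-ins chosen for illustration, not the real function or data. Tuple unpacking iterates over the rows of the Series, so with three rows and two targets Python raises a ValueError.

```python
import pandas as pd

def calculate(x):
    # Hypothetical stand-in for the real function: returns two values.
    return x * 2, x * 3

# Illustrative data only; the real dataframe has n columns.
df = pd.DataFrame({"column_A": [1, 2, 3]})

result = df["column_A"].map(calculate)
first = result.iloc[0]  # each element of the Series is a tuple

try:
    # Unpacking walks over the rows (3 items), not over tuple positions (2 items).
    left, right = result
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(result.dtype)
print(unpack_error)
```

Note that if the Series happened to contain exactly two rows, the unpacking would presumably succeed without an error but put one row tuple in each target, which is not the intended result either.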


Answer

I’d just use zip:

In [1]: from pandas import DataFrame

In [2]: def calculate(x):
   ...:     return x*2, x*3
   ...: 

In [3]: df = DataFrame({'a': [1,2,3], 'b': [2,3,4]})

In [4]: df
Out[4]: 
   a  b
0  1  2
1  2  3
2  3  4

In [5]: df["A1"], df["A2"] = zip(*df["a"].map(calculate))

In [6]: df
Out[6]: 
   a  b  A1  A2
0  1  2   2   3
1  2  3   4   6
2  3  4   6   9
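
As a rough sketch of why this works: zip(*...) transposes the row tuples into one tuple per output column, which is the shape the two-target assignment expects. The second half shows one possible alternative, building a small dataframe from the tuples and joining it on the original index. The variable names pairs, expanded and out are only illustrative.

```python
import pandas as pd

def calculate(x):
    # Same toy function as in the session above.
    return x * 2, x * 3

df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 4]})

# A Series whose elements are (x*2, x*3) tuples, one per row.
pairs = df["a"].map(calculate)

# zip(*pairs) turns [(2, 3), (4, 6), (6, 9)] into (2, 4, 6) and (3, 6, 9).
a1, a2 = zip(*pairs)

# Alternative sketch: expand the tuples into their own dataframe,
# reusing the original index so rows stay aligned, then join.
expanded = pd.DataFrame(pairs.tolist(), index=df.index, columns=["A1", "A2"])
out = df.join(expanded)

print(out)
```

The zip version is shorter; the join version may be easier to extend if calculate returns more than two values, since only the columns list has to change.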
User contributions licensed under: CC BY-SA