# Factor out the DataFrame name in pandas to make mathematical expressions easier to read

When you do mathematical operations on the columns of a pandas DataFrame (call it `data`), you have to write `data` repeatedly to access the columns, which gets in the way of writing readable formulas. I am looking for a way to “factor out” the `data` name. Consider this simple example:

```
import numpy as np
import pandas as pd

# Load the sample data shown below (assumed to be a plain comma-separated file)
data = pd.read_csv('data.dat')
k = 3

data['a4'] = data.a1 + data.a2
data['a5'] = np.sqrt(data.a3) * k

## Imagine much more complex mathematical operations

## Instead of this, I want something like this pseudocode:

## cd data
## a4 = a1 + a2
## a5 = sqrt(a3)*k
## end cd data
```

where `data.dat` contains:

```
a1,a2,a3
1,2,3
4,5,6
7,8,9
```

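You can use `pandas.DataFrame.eval`, which evaluates an expression string with the column names in scope. Prefix local Python variables such as `k` with `@`: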

```
>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df = df.eval('a4 = a1 + a2')

>>> df = df.eval('a5 = a3**2 * @k')

>>> df
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
```
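The session above squares `a3`, whereas the question asked for a square root. Here is a minimal sketch of the question's own formulas, assuming `sqrt` is among the math functions that `eval` accepts in your pandas version. The sample data is inlined through `io.StringIO` instead of being read from `data.dat`, purely so the snippet runs on its own:

```python
import io

import numpy as np
import pandas as pd

# Same content as data.dat, inlined so no file is needed (illustrative choice).
csv_text = "a1,a2,a3\n1,2,3\n4,5,6\n7,8,9\n"
data = pd.read_csv(io.StringIO(csv_text))
k = 3

# Column names are resolved inside the expression; @k refers to the local variable k.
data = data.eval("a4 = a1 + a2")
data = data.eval("a5 = sqrt(a3) * @k")

print(data)
```

If `sqrt` is rejected in your environment, `a3 ** 0.5` is an equivalent expression that relies only on arithmetic operators.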

If you want to do all the assignments in a single call, pass a multi-line string:

```
>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df.eval('''
a4 = a1 + a2
a5 = a3**2 * @k
''')
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243

# Alternatively you can also store the expr in a string and then pass the string:
>>> expr = '''
a4 = a1 + a2
a5 = a3**2 * @k
'''
>>> df.eval(expr)
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
```
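Note that the calls above return a new DataFrame and leave `df` itself untouched unless you assign the result back. The sketch below contrasts that with the `inplace=True` option of `DataFrame.eval`. The names `new_df` and `returned` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"a1": [1, 4, 7], "a2": [2, 5, 8], "a3": [3, 6, 9]})
k = 3

expr = """
a4 = a1 + a2
a5 = a3**2 * @k
"""

# Default behaviour: a new DataFrame is returned, df keeps its three columns.
new_df = df.eval(expr)
print(list(df.columns))

# With inplace=True the columns are added to df itself and None is returned.
returned = df.eval(expr, inplace=True)
print(list(df.columns))
```

Reassigning (`df = df.eval(expr)`) and `inplace=True` end up with the same columns; which one reads better is mostly a matter of taste.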