If you do mathematical operations on the columns of a Python pandas DataFrame (call it data), you repeatedly have to write data to access the columns, which is very annoying if you want mathematical formulas that are easy to read. So I am looking for a way to "factor out" the data keyword. Consider this simple example:
import pandas as pd
from numpy import *

k = 3
data = pd.read_csv('data.dat',sep=',')

data['a4'] = data.a1 + data.a2
data['a5'] = sqrt(data.a3)*k

## Imagine much more complex mathematical operations


## instead of this I want something like this pseudocode:

## cd data
## a4 = a1 + a2
## a5 = sqrt(a3)*k
## end cd data
where data.dat is:

a1,a2,a3
1,2,3
4,5,6
7,8,9
Answer
You can use pandas.DataFrame.eval:

>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df = df.eval('a4 = a1 + a2')

>>> df = df.eval('a5 = a3**2 * @k')

>>> df
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
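As a self-contained sketch of the same approach applied to the formulas from the question: the CSV contents are read from an in-memory string instead of data.dat (an assumption made only so the snippet runs without a file on disk), and the square root is written as a3 ** 0.5, whereas the answer above squares a3.

```python
import io

import pandas as pd

# Same contents as data.dat from the question, kept in memory so the
# sketch does not depend on a file (assumption for illustration only).
csv_text = "a1,a2,a3\n1,2,3\n4,5,6\n7,8,9\n"
data = pd.read_csv(io.StringIO(csv_text))

k = 3

# Inside the expression, bare names refer to columns of the DataFrame;
# the @ prefix refers to a Python variable from the calling scope.
data = data.eval("a4 = a1 + a2")
data = data.eval("a5 = a3 ** 0.5 * @k")

print(data)
```

Each call returns a new DataFrame with the extra column, which is why the result is assigned back to data.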
If you want to do all the assignments in a single call, you can pass a multi-line expression:
>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df.eval('''
a4 = a1 + a2
a5 = a3**2 * @k
''')
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243

# Alternatively, you can store the expression in a string and then pass the string:
>>> expr = '''
a4 = a1 + a2
a5 = a3**2 * @k
'''
>>> df.eval(expr)
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
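Note that the multi-line calls above return a new DataFrame and leave df itself unchanged. If you would rather have the columns added to df directly, eval accepts inplace=True. A minimal sketch, with the example frame built by hand rather than read from a file:

```python
import pandas as pd

# Hand-built copy of the example data (assumption: same values as data.dat).
df = pd.DataFrame({"a1": [1, 4, 7], "a2": [2, 5, 8], "a3": [3, 6, 9]})
k = 3

expr = """
a4 = a1 + a2
a5 = a3**2 * @k
"""

# With inplace=True the new columns are written into df and the call
# returns None instead of a new DataFrame.
returned = df.eval(expr, inplace=True)

print(returned)
print(df)
```

Whether to reassign (df = df.eval(...)) or use inplace=True is mostly a matter of taste; reassigning keeps the original frame available if you still need it.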