I have a DataFrame defined like this:
from collections import OrderedDict
from pandas import DataFrame
import pandas as pd
import numpy as np

table = OrderedDict((
    ('year', [1900, 1900, 1900, 1900, 1901, 1901, 1901, 1901]),
    ('variable', ['PRCP', 'PRCP', 'TAVG', 'TAVG', 'PRCP', 'PRCP', 'TAVG', 'TAVG']),
    ('month', [1, 2, 1, 2, 1, 2, 1, 2]),
    ('first_day', [5, 8, 7, 3, 9, 2, 4, 1]),
    ('second_day', [5, 8, 7, 3, 9, 2, 5, 8]),
    ('third_day', [1, 7, 5, 7, 3, 5, 8, 9])
))
df = DataFrame(table)
The DataFrame is now this:
   year variable  month  first_day  second_day  third_day
0  1900     PRCP      1          5           5          1
1  1900     PRCP      2          8           8          7
2  1900     TAVG      1          7           7          5
3  1900     TAVG      2          3           3          7
4  1901     PRCP      1          9           9          3
5  1901     PRCP      2          2           2          5
6  1901     TAVG      1          4           5          8
7  1901     TAVG      2          1           8          9
I want to pivot the DataFrame so that it then looks like this:
   variable  year  month  day  value
0      PRCP  1900      1    1      5
1      PRCP  1900      1    2      5
2      PRCP  1900      1    3      1
3      PRCP  1900      2    1      8
4      PRCP  1900      2    2      8
5      PRCP  1900      2    3      7
6      PRCP  1901      1    1      9
7      PRCP  1901      1    2      9
8      PRCP  1901      1    3      3
9      PRCP  1901      2    1      2
10     PRCP  1901      2    2      2
11     PRCP  1901      2    3      5
12     TAVG  1900      1    1      7
13     TAVG  1900      1    2      7
14     TAVG  1900      1    3      5
15     TAVG  1900      2    1      3
16     TAVG  1900      2    2      3
17     TAVG  1900      2    3      7
18     TAVG  1901      1    1      4
19     TAVG  1901      1    2      5
20     TAVG  1901      1    3      8
21     TAVG  1901      2    1      1
22     TAVG  1901      2    2      8
23     TAVG  1901      2    3      9
I think I want to do this via pivoting, but I haven't yet worked out how to do it with the pivot() or pivot_table() functions. How can I do this, with or without using a pivot?
Answer
You can use melt, but first rename the day columns to their day numbers with a dict:
# Map each wide column name to its day number
d = {'first_day':1,'second_day':2,'third_day':3}
# Rename the columns, then unpivot them into 'day' / 'value' pairs
df = pd.melt(df.rename(columns=d), id_vars=['variable','year','month'], var_name='day')
df = df.sort_values(['variable','year','month', 'day']).reset_index(drop=True)
print(df)
   variable  year  month  day  value
0      PRCP  1900      1    1      5
1      PRCP  1900      1    2      5
2      PRCP  1900      1    3      1
3      PRCP  1900      2    1      8
4      PRCP  1900      2    2      8
5      PRCP  1900      2    3      7
6      PRCP  1901      1    1      9
7      PRCP  1901      1    2      9
8      PRCP  1901      1    3      3
9      PRCP  1901      2    1      2
10     PRCP  1901      2    2      2
11     PRCP  1901      2    3      5
12     TAVG  1900      1    1      7
13     TAVG  1900      1    2      7
14     TAVG  1900      1    3      5
15     TAVG  1900      2    1      3
16     TAVG  1900      2    2      3
17     TAVG  1900      2    3      7
18     TAVG  1901      1    1      4
19     TAVG  1901      1    2      5
20     TAVG  1901      1    3      8
21     TAVG  1901      2    1      1
22     TAVG  1901      2    2      8
23     TAVG  1901      2    3      9
Or melt first and then map the day column with the dict:
# Start again from the original wide df
d = {'first_day':1,'second_day':2,'third_day':3}
df = pd.melt(df, id_vars=['variable','year','month'], var_name='day')
# Replace the old column names in 'day' with their day numbers
df['day'] = df['day'].map(d)
df = df.sort_values(['variable','year','month', 'day']).reset_index(drop=True)
print(df)
   variable  year  month  day  value
0      PRCP  1900      1    1      5
1      PRCP  1900      1    2      5
2      PRCP  1900      1    3      1
3      PRCP  1900      2    1      8
4      PRCP  1900      2    2      8
5      PRCP  1900      2    3      7
6      PRCP  1901      1    1      9
7      PRCP  1901      1    2      9
8      PRCP  1901      1    3      3
9      PRCP  1901      2    1      2
10     PRCP  1901      2    2      2
11     PRCP  1901      2    3      5
12     TAVG  1900      1    1      7
13     TAVG  1900      1    2      7
14     TAVG  1900      1    3      5
15     TAVG  1900      2    1      3
16     TAVG  1900      2    2      3
17     TAVG  1900      2    3      7
18     TAVG  1901      1    1      4
19     TAVG  1901      1    2      5
20     TAVG  1901      1    3      8
21     TAVG  1901      2    1      1
22     TAVG  1901      2    2      8
23     TAVG  1901      2    3      9
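On the pivot() part of the question: pivot() and pivot_table() reshape from long to wide, which is the reverse of what is asked here, so melt is the closer fit. Here is a rough sketch, assuming the sample data from the question, that melts to the long layout and then uses pivot_table to get back to the wide one. The names keys, long_df, inverse and wide are only illustrative, and aggfunc='first' is my own choice to avoid the default averaging.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [1900, 1900, 1900, 1900, 1901, 1901, 1901, 1901],
    'variable': ['PRCP', 'PRCP', 'TAVG', 'TAVG', 'PRCP', 'PRCP', 'TAVG', 'TAVG'],
    'month': [1, 2, 1, 2, 1, 2, 1, 2],
    'first_day': [5, 8, 7, 3, 9, 2, 4, 1],
    'second_day': [5, 8, 7, 3, 9, 2, 5, 8],
    'third_day': [1, 7, 5, 7, 3, 5, 8, 9],
})
d = {'first_day': 1, 'second_day': 2, 'third_day': 3}
keys = ['variable', 'year', 'month']  # illustrative name for the id columns

# Wide -> long: one row per (variable, year, month, day)
long_df = pd.melt(df, id_vars=keys, var_name='day')
long_df['day'] = long_df['day'].map(d)

# Long -> wide: this is the direction pivot_table works in
inverse = {number: name for name, number in d.items()}
wide = (
    long_df.pivot_table(index=keys, columns='day', values='value', aggfunc='first')
           .reset_index()
           .rename(columns=inverse)
           .rename_axis(columns=None)
)
print(wide)
```

The round trip returns the original columns, with rows sorted by variable, year and month rather than in their original order.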
Another solution uses stack; here the rows keep the order of the original DataFrame unless you sort them afterwards:
# Start again from the original wide df
d = {'first_day':1,'second_day':2,'third_day':3}
# Parentheses let the method chain span several lines
df = (df.rename(columns=d)
        .set_index(['variable','year','month'])
        .stack()
        .reset_index(name='value')
        .rename(columns={'level_3':'day'}))
print(df)
   variable  year  month  day  value
0      PRCP  1900      1    1      5
1      PRCP  1900      1    2      5
2      PRCP  1900      1    3      1
3      PRCP  1900      2    1      8
4      PRCP  1900      2    2      8
5      PRCP  1900      2    3      7
6      TAVG  1900      1    1      7
7      TAVG  1900      1    2      7
8      TAVG  1900      1    3      5
9      TAVG  1900      2    1      3
10     TAVG  1900      2    2      3
11     TAVG  1900      2    3      7
12     PRCP  1901      1    1      9
13     PRCP  1901      1    2      9
14     PRCP  1901      1    3      3
15     PRCP  1901      2    1      2
16     PRCP  1901      2    2      2
17     PRCP  1901      2    3      5
18     TAVG  1901      1    1      4
19     TAVG  1901      1    2      5
20     TAVG  1901      1    3      8
21     TAVG  1901      2    1      1
22     TAVG  1901      2    2      8
23     TAVG  1901      2    3      9
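To line the stack result up with the melt result, the same sort_values call can be appended. This is a small sketch under the assumption that the sample data from the question is used; by_melt, by_stack and keys are illustrative names, and rename_axis is used here in place of renaming 'level_3' afterwards.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [1900, 1900, 1900, 1900, 1901, 1901, 1901, 1901],
    'variable': ['PRCP', 'PRCP', 'TAVG', 'TAVG', 'PRCP', 'PRCP', 'TAVG', 'TAVG'],
    'month': [1, 2, 1, 2, 1, 2, 1, 2],
    'first_day': [5, 8, 7, 3, 9, 2, 4, 1],
    'second_day': [5, 8, 7, 3, 9, 2, 5, 8],
    'third_day': [1, 7, 5, 7, 3, 5, 8, 9],
})
d = {'first_day': 1, 'second_day': 2, 'third_day': 3}
keys = ['variable', 'year', 'month']  # illustrative name for the id columns

# melt, then map the old column names to day numbers
by_melt = pd.melt(df, id_vars=keys, var_name='day')
by_melt['day'] = by_melt['day'].map(d)
by_melt = by_melt.sort_values(keys + ['day']).reset_index(drop=True)

# stack, naming the new innermost index level 'day' before resetting the index
by_stack = (
    df.rename(columns=d)
      .set_index(keys)
      .stack()
      .rename_axis(keys + ['day'])
      .reset_index(name='value')
      .sort_values(keys + ['day'])
      .reset_index(drop=True)
)

# After sorting, both approaches list the values in the same order
print(by_melt['value'].tolist() == by_stack['value'].tolist())
```

Which one to use is mostly a matter of taste: melt keeps the id columns as ordinary columns throughout, while stack goes through a MultiIndex first.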