I’m trying to get the unique Available values for each site. The original pandas DataFrame has three columns:
Site | Available | Capacity |
---|---|---|
A | 7 | 20 |
A | 7 | 20 |
A | 8 | 20 |
B | 15 | 35 |
B | 15 | 35 |
C | 12 | 25 |
C | 12 | 25 |
C | 11 | 25 |
I want to get the unique Available values for each site. The desired table looks like this:
Site | Unique Available |
---|---|
A | 7 |
 | 8 |
B | 15 |
C | 12 |
 | 11 |
Answer
You can get the unique Available values per site, as one array per group, with GroupBy.unique():
```python
>>> df.groupby('Site')['Available'].unique()
Site
A      [7, 8]
B        [15]
C    [12, 11]
Name: Available, dtype: object
```
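For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the DataFrame from the table in the question; the variable names `unique_per_site` and `as_dict` are only illustrative, and converting to a dict is an optional extra step, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Site": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "Available": [7, 7, 8, 15, 15, 12, 12, 11],
    "Capacity": [20, 20, 20, 35, 35, 25, 25, 25],
})

# One array of distinct values per site, in order of first appearance.
unique_per_site = df.groupby("Site")["Available"].unique()

# Optional: turn the arrays into plain Python lists keyed by site.
as_dict = {site: values.tolist() for site, values in unique_per_site.items()}
print(as_dict)  # {'A': [7, 8], 'B': [15], 'C': [12, 11]}
```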
Then with explode() you can expand these lists into one row per value, and with reset_index() turn the index back into a column:
```python
>>> df.groupby('Site')['Available'].unique().explode().reset_index()
  Site Available
0    A         7
1    A         8
2    B        15
3    C        12
4    C        11
```
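One detail to be aware of: after explode() the Available column typically has object dtype, because it was built from a Series of arrays. The sketch below rebuilds the question's data and casts the column back to integers; the cast to `int64` is my assumption about what you want, so adjust it if your real data contains missing values or non-integers.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Site": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "Available": [7, 7, 8, 15, 15, 12, 12, 11],
    "Capacity": [20, 20, 20, 35, 35, 25, 25, 25],
})

exploded = (
    df.groupby("Site")["Available"]
    .unique()        # one array of distinct values per site
    .explode()       # one row per value, Site stays in the index
    .reset_index()   # move Site back into a regular column
)

# Restore an integer dtype (assumes every value is a valid integer).
exploded["Available"] = exploded["Available"].astype("int64")
print(exploded.dtypes)
```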
Alternatively, simply select both columns and drop the duplicate rows:
```python
>>> df[['Site', 'Available']].drop_duplicates()
  Site  Available
0    A          7
2    A          8
3    B         15
5    C         12
7    C         11
```
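Note that drop_duplicates() keeps the original row labels (0, 2, 3, 5, 7) and the original row order, whereas groupby() sorts by Site by default. Here is a rough sketch that renumbers the index and checks that both approaches give the same rows for the question's data; `deduped`, `via_groupby`, and `same_rows` are just illustrative names.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Site": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "Available": [7, 7, 8, 15, 15, 12, 12, 11],
    "Capacity": [20, 20, 20, 35, 35, 25, 25, 25],
})

# drop=True discards the old labels instead of adding them as a column.
deduped = df[["Site", "Available"]].drop_duplicates().reset_index(drop=True)

via_groupby = df.groupby("Site")["Available"].unique().explode().reset_index()

# Compare the two results row by row as plain (Site, Available) tuples.
same_rows = (
    list(deduped.itertuples(index=False, name=None))
    == list(via_groupby.itertuples(index=False, name=None))
)
print(same_rows)
```

The rows match here because the sample data is already ordered by Site; with unsorted input the two results could list the sites in a different order.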