I’m trying to get the unique available value for each site. The original pandas dataframe is with three columns:
| Site | Available | Capacity | 
|---|---|---|
| A | 7 | 20 | 
| A | 7 | 20 | 
| A | 8 | 20 | 
| B | 15 | 35 | 
| B | 15 | 35 | 
| C | 12 | 25 | 
| C | 12 | 25 | 
| C | 11 | 25 | 
and I want to get the unique available of each site. The desired table is like below:
| Site | Unique Available | 
|---|---|
| A | 7 | 
| 8 | |
| B | 15 | 
| C | 12 | 
| 11 | 
Advertisement
Answer
You can get the lists of unique available per site with GroupBy.unique()
>>> df.groupby('Site')['Available'].unique()
Site
A      [7, 8]
B        [15]
C    [12, 11]
Name: Available, dtype: object
Then with explode() you can expand these lists and with reset_index() get the index back to a column:
>>> df.groupby('Site')['Available'].unique().explode().reset_index()
  Site Available
0    A         7
1    A         8
2    B        15
3    C        12
4    C        11
Otherwise simply get both columns and remove duplicates:
>>> df[['Site', 'Available']].drop_duplicates() Site Available 0 A 7 2 A 8 3 B 15 5 C 12 7 C 11