I am using pandas groupby and want to apply a function that builds a set from the items in each group.
The following raises TypeError: 'type' object is not iterable:

    df = df.groupby('col1')['col2'].agg({'size': len, 'set': set})

But the following works:

    def to_set(x):
        return set(x)

    df = df.groupby('col1')['col2'].agg({'size': len, 'set': to_set})

As I understand it, the two expressions are equivalent. Why does the first one not work?
Answer
Update
- As late as pandas version 0.22, this is an issue.
- As of pandas version 1.1.2, this is not an issue. Aggregating with set doesn't result in TypeError: 'type' object is not iterable.
- Not certain when the functionality was updated.
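
For reference, here is a minimal sketch of the same aggregation on a recent pandas. It assumes pandas 1.1.2 or later, uses made-up sample data, and swaps the dict of names from the question for named aggregation (keyword arguments), since as far as I know the renaming-dict form on a single column was removed in pandas 1.0. The output column name items is my own choice.

```python
import pandas as pd

# Made-up sample data; the column names follow the question.
df = pd.DataFrame({"col1": ["a", "a", "a", "b"], "col2": [1, 2, 2, 3]})

# Named aggregation: each keyword is an output column name,
# each value is the aggregation to apply to col2 within the group.
# Passing the built-in set directly is assumed to work on pandas >= 1.1.2.
result = df.groupby("col1")["col2"].agg(size="size", items=set)

print(result)
```

The result has one row per value of col1, with the group size and the set of distinct col2 values side by side.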
Original Answer
It's because set is of type type, whereas to_set is of type function:

    >>> type(set)
    <class 'type'>
    >>> def to_set(x):
    ...     return set(x)
    ...
    >>> type(to_set)
    <class 'function'>

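
To make the distinction concrete, the sketch below uses only the standard library to show that set and to_set behave identically when called and differ only in what kind of object they are. The sample list values and the variable names are mine, chosen for illustration.

```python
import inspect


def to_set(x):
    return set(x)


values = [1, 2, 2, 3]  # made-up sample data

# Calling either object produces an equal result.
same_result = set(values) == to_set(values)

# Both are callable, so either can be used wherever a callable is accepted.
both_callable = callable(set) and callable(to_set)

# The difference is the kind of object: set is a class, to_set is a function.
set_is_class = isinstance(set, type)
to_set_is_function = inspect.isfunction(to_set)
set_is_function = inspect.isfunction(set)

print(same_result, both_callable, set_is_class, to_set_is_function, set_is_function)
```

So any code that inspects its argument rather than simply calling it can end up treating the two differently.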
According to the docs, .agg() expects:

arg : function or dict

Function to use for aggregating groups.
- If a function, must either work when passed a DataFrame or when passed to DataFrame.apply.
- If passed a dict, the keys must be DataFrame column names.

Accepted Combinations are:
- string cythonized function name
- function
- list of functions
- dict of columns -> functions
- nested dict of names -> dicts of functions
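
Because the argument can be either a single function or a list of functions, a library has to inspect it before deciding what to do. The sketch below shows one way such an inspection could misfire for set. The helper looks_like_list_of_functions is hypothetical and is not taken from the pandas source; it only illustrates a duck-typing check that a class such as set passes, while a plain function does not.

```python
def to_set(x):
    return set(x)


def looks_like_list_of_functions(arg):
    # Hypothetical check, NOT pandas code: treat anything that exposes
    # an __iter__ attribute as a collection of functions.
    return hasattr(arg, "__iter__")


# The class set exposes __iter__ (the method its instances use),
# so the check flags it; a plain function has no such attribute.
set_flagged = looks_like_list_of_functions(set)
to_set_flagged = looks_like_list_of_functions(to_set)

# Actually iterating over the class itself fails.
try:
    iter(set)
    message = None
except TypeError as exc:
    message = str(exc)

print(set_flagged, to_set_flagged, message)
```

Under that assumption, set would be routed down the "list of functions" path and fail with the error from the question, while to_set would be called as a single function.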