OK, I have a dataframe that looks like the following:
>> df id trip_id segment_id session_id start_timestamp lat_start lon_start lat_end lon_end travelmode 563097015 563097 15 128618 2017-05-20 17:47:12+01 41.1783308 -8.5949878 41.1784478 -8.5948463 0 563097013 563097 13 128618 2017-05-20 17:45:29+01 41.1781344 -8.5951169 41.1782919 -8.5950689 0 563097011 563097 11 128618 2017-05-20 17:43:41+01 41.1781196 -8.5954075 41.1782139 -8.5950689 0 563097009 563097 9 128618 2017-05-20 17:41:48+01 41.1782497 -8.595197 41.1781101 -8.5954124 0 563097003 563097 3 128618 2017-05-20 17:10:29+01 41.1832512 -8.6081606 41.1782561 -8.5950259 0
In SQL, to filter unique segments (segment_id
) by travelmode
I will do:
SELECT travelmode, COUNT(DISTINCT segment_id) AS NumOfSegments FROM df_table GROUP BY travelmode
What is the pandas equivalent of this expression?
Advertisement
Answer
Maybe:
df.groupby('travelmode').segment_id.nunique()
as suggested in this post.