Pandas group by unique ID and Distinct date per unique ID

Question

Title may be confusing: I have a dataframe that displays user_id sign-ins during the week. My goal is to display the de-duped ID along with the de-duped dates per employee, in order to get a count of the number of days each user uniquely signed in for the week. So I've been trying to enforce a rule to make su…

Accepted Answer

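Calculate the start of the week; then it's a simple use of count().

    import io

    import pandas as pd

    # Columns are separated by two or more whitespace characters, so the
    # header "# days signed in for week" stays a single column.
    df = pd.read_csv(io.StringIO("""ID      date    # days signed in for week
    10301  1/4/2021    6
    10301  1/4/2021    6
    10301  1/5/2021    6
    10301  1/6/2021    6
    10301  1/7/2021    6
    10301  1/8/2021    6
    10302  1/4/2021    5
    10302  1/5/2021    5
    10302  1/6/2021    5
    10302  1/7/2021    5
    10302  1/8/2021    5"""), sep=r"\s\s+", engine="python")

    df.date = pd.to_datetime(df.date)

    # Monday of the week each sign-in falls in (dayofweek: Monday=0)
    df["weekStart"] = df['date'] - pd.to_timedelta(df['date'].dt.dayofweek, unit='d')

    # count() counts rows, so a date repeated for the same ID is counted each time
    df.groupby(["ID","weekStart"])["date"].count().reset_index().rename(columns={"weekStart":"date","date":"# days signed in for week"})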
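If the goal is the number of distinct days per user, as the question describes, one possible variation is to swap count() for nunique(), which counts each date once within a group. The following is a minimal sketch, not part of the accepted answer: the shortened sample data, the comma-separated layout, and the `unique_days` column name are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical sample: ID 10301 signs in twice on 1/4/2021.
raw = """ID,date
10301,1/4/2021
10301,1/4/2021
10301,1/5/2021
10302,1/4/2021
10302,1/5/2021
10302,1/6/2021
"""
df = pd.read_csv(io.StringIO(raw))
df["date"] = pd.to_datetime(df["date"])

# Monday of the week each sign-in falls in (dayofweek: Monday=0).
df["weekStart"] = df["date"] - pd.to_timedelta(df["date"].dt.dayofweek, unit="D")

# nunique() counts each distinct date once per (ID, weekStart) group,
# so the repeated 1/4/2021 row for 10301 is not double counted.
unique_days = (
    df.groupby(["ID", "weekStart"])["date"]
    .nunique()
    .reset_index(name="unique_days")
)
print(unique_days)
```

This assumes the date column holds dates without a time component. If it held full timestamps, two sign-ins on the same day would count as distinct values unless the column were first normalized to midnight, for example with `df["date"].dt.normalize()`.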


Answer