This is probably a silly question for most of you.
I have this list:
          In      DT_INI      DT_FIM     Status         Description jobName
    0  IN100  01/01/2022  01/02/2022  Encerrado  Abend no job XX_01   XX_01
    1  IN200  01/02/2022  01/03/2022  Encerrado  Abend no job XX_01   XX_01
    2  IN300  01/03/2022  01/04/2022  Encerrado  Abend no job XX_02   XX_02
I need to count how many Ins each jobName has in this list and store that count in a new column named Qt_Ins.
It should look like this:
          In jobName      DT_INI      DT_FIM     Status         Description  Qt_Ins
    0  IN100   XX_01  01/01/2022  01/02/2022  Encerrado  Abend no job XX_01       2
    1  IN200   XX_01  01/02/2022  01/03/2022  Encerrado  Abend no job XX_01       2
    2  IN300   XX_02  01/03/2022  01/04/2022  Encerrado  Abend no job XX_02       1
Could you guys help me again?
Thanks
Answer
Input:
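    import pandas as pd
    from io import StringIO

    # Columns are separated by two or more spaces, so the single spaces
    # inside "Description" stay within one field.
    s = """
          In      DT_INI      DT_FIM     Status         Description jobName
    0  IN100  01/01/2022  01/02/2022  Encerrado  Abend no job XX_01   XX_01
    1  IN200  01/02/2022  01/03/2022  Encerrado  Abend no job XX_01   XX_01
    2  IN300  01/03/2022  01/04/2022  Encerrado  Abend no job XX_02   XX_02"""

    # sep must be passed by keyword; the regex needs engine="python"
    df = pd.read_table(StringIO(s), sep=r"\s\s+", engine="python")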
You can use groupby with nunique:
    # Store columns
    cols = list(df.columns)
    df.set_index("jobName", inplace=True)

    # Do the group by
    df["Qt_Ins"] = df.groupby("jobName")["In"].nunique()

    # Re-order columns
    df = df.reset_index()[cols + ["Qt_Ins"]]
    print(df.to_string())
Output:
          In      DT_INI      DT_FIM     Status         Description jobName  Qt_Ins
    0  IN100  01/01/2022  01/02/2022  Encerrado  Abend no job XX_01   XX_01       2
    1  IN200  01/02/2022  01/03/2022  Encerrado  Abend no job XX_01   XX_01       2
    2  IN300  01/03/2022  01/04/2022  Encerrado  Abend no job XX_02   XX_02       1
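As a possible shortcut (not part of the answer above), the same count can probably be obtained without touching the index by using groupby with transform, which returns one value per original row. This is only a sketch: the DataFrame below is rebuilt by hand from the table in the question, and the column order is left as it is in the input.

```python
import pandas as pd

# Sample data rebuilt by hand from the question's table (an assumption,
# the real data presumably comes from elsewhere).
df = pd.DataFrame({
    "In": ["IN100", "IN200", "IN300"],
    "DT_INI": ["01/01/2022", "01/02/2022", "01/03/2022"],
    "DT_FIM": ["01/02/2022", "01/03/2022", "01/04/2022"],
    "Status": ["Encerrado"] * 3,
    "Description": ["Abend no job XX_01", "Abend no job XX_01", "Abend no job XX_02"],
    "jobName": ["XX_01", "XX_01", "XX_02"],
})

# transform("nunique") broadcasts each group's distinct-In count back to
# every row of that group, so it can be assigned as a column directly.
df["Qt_Ins"] = df.groupby("jobName")["In"].transform("nunique")

print(df[["In", "jobName", "Qt_Ins"]].to_string())
```

Because transform keeps the original row index, no set_index / reset_index round trip is needed. If every In is already unique within a job, transform("count") would give the same numbers.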