I have a DataFrame with four columns: “date”, “time_gap”, “count” and “average_speed”.
I’d like to set values in the “count” column for rows that meet conditions on the “date” and “time_gap” columns.
So, for example, if I’m running this query:
random_row = df.query("date == '2018-12-07' & time_gap == 86")
It’s returning this as output:
           date  time_gap  count  average_speed
282  2018-12-07        86      0              0
Let’s say I want to change the value in the “count” column to 12. How could I do it?
I’ve tried this:
random_row = df.query("date == '2018-12-07' & time_gap == 86")["count"].replace(0, 12)
Which returns this:
282    12
Name: count, dtype: int64
But when I look at the df:
df.iloc[282]
I still have my row where the “count” is equal to 0:
date             2018-12-07 00:00:00
time_gap                          86
count                              0
average_speed                      0
Name: 282, dtype: object
How can I do it?
Answer
You can do it with loc, if you don’t want to use NumPy:
df.loc[df.date.eq('07/12/2018') & df.time_gap.eq(86), 'count'] = 12
prints:
         date  time_gap  count  average_speed
0  07/12/2018        86     12              0
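For reference, here is a minimal, self-contained sketch of the loc approach. The sample rows are made up for illustration, and it assumes the “date” column has a datetime dtype (as the df.iloc[282] output in the question suggests), so the comparison uses a pd.Timestamp rather than a string. The original attempt left df unchanged because query returns a new object and replace returns a new Series; assigning through loc writes into df itself.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's four columns.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-06", "2018-12-07", "2018-12-07"]),
    "time_gap": [86, 85, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# Boolean mask: True only for rows that satisfy both conditions.
mask = df["date"].eq(pd.Timestamp("2018-12-07")) & df["time_gap"].eq(86)

# Assigning through .loc modifies df in place.
df.loc[mask, "count"] = 12

print(df)
```

Note that the column is accessed as df["count"] rather than df.count, since count is also the name of a DataFrame method.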
You can also reuse the query string, but in order to do that you have to use eval, which takes the expression you would pass to query and evaluates it:
qr = "date == '07/12/2018' & time_gap == 86"
df.loc[df.eval(qr), 'count'] = 12
prints:
         date  time_gap  count  average_speed
0  07/12/2018        86     12              0
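The eval variant can be sketched the same way. Again, the sample rows are hypothetical and the “date” column is assumed to be datetime, so the expression uses the ISO date string from the question's own query. With a comparison expression like this one, df.eval returns a boolean Series rather than the filtered rows, which is what makes it usable as a loc mask.

```python
import pandas as pd

# Hypothetical sample data; the "date" column is parsed to datetime64.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-06", "2018-12-07", "2018-12-07"]),
    "time_gap": [86, 85, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# The same expression string that would be passed to df.query().
qr = "date == '2018-12-07' & time_gap == 86"

# Evaluate the expression against df's columns to get a boolean mask.
mask = df.eval(qr)

# Use the mask with .loc to update only the matching rows in place.
df.loc[mask, "count"] = 12

print(df)
```

This keeps the condition in one string, which can be handy if the same filter is also used with query elsewhere.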
You can see practical applications of eval here.