I have a DataFrame with four columns: “date”, “time_gap”, “count” and “average_speed”.
I’d like to set values in the “count” column for the rows that meet conditions on the “date” and “time_gap” columns.
So, for example, if I’m running this query:
random_row = df.query("date == '2018-12-07' & time_gap == 86")
It returns this output:
          date  time_gap  count  average_speed
282 2018-12-07        86      0              0
Let’s say I want to change the value in the “count” column to 12. How could I do it?
I’ve tried this:
random_row = df.query("date == '2018-12-07' & time_gap == 86")["count"].replace(0, 12)
Which returns this:
282    12
Name: count, dtype: int64
But when I look at the df:
df.iloc[282]
I still have my row where the “count” is equal to 0:
date             2018-12-07 00:00:00
time_gap                          86
count                              0
average_speed                      0
Name: 282, dtype: object
How can I do it?
Answer
You can do it with loc, if you don’t want to use NumPy:
df.loc[df.date.eq('07/12/2018') & df.time_gap.eq(86), 'count'] = 12
prints:
         date  time_gap  count  average_speed
0  07/12/2018        86     12              0
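As a self-contained sketch of the loc approach, the snippet below builds a small DataFrame shaped like the one in the question. The three sample rows are made up for illustration, and the date column is assumed to hold datetimes, as the df.iloc[282] output suggests. It also shows that replace returns a new Series and leaves df untouched, which is why the original attempt had no effect.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-12-06", "2018-12-07", "2018-12-07"]),
    "time_gap": [86, 85, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# Boolean mask selecting the rows that satisfy both conditions.
mask = df["date"].eq(pd.Timestamp("2018-12-07")) & df["time_gap"].eq(86)

# replace() returns a new Series, so df itself is not modified here.
replaced = df.loc[mask, "count"].replace(0, 12)
count_before = df["count"].tolist()

# Assigning through .loc writes the value into df directly.
df.loc[mask, "count"] = 12
count_after = df["count"].tolist()
```

Note that the column is selected as df["count"] rather than df.count, because count is also the name of a DataFrame method.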
If you want to keep the query-style expression, you have to use eval, which takes the same expression you would pass to query and evaluates it to a boolean mask:
qr = "date == '07/12/2018' & time_gap == 86"
df.loc[df.eval(qr), 'count'] = 12
prints:
         date  time_gap  count  average_speed
0  07/12/2018        86     12              0
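The eval variant can be sketched the same way. The sample rows below are invented for illustration, and the date column is assumed to hold plain strings, as in this answer's example. The expression is evaluated once into a boolean Series, which is then used as the row selector for loc.

```python
import pandas as pd

# Hypothetical sample rows with dates stored as strings, as in the answer.
df = pd.DataFrame({
    "date": ["07/12/2018", "07/12/2018", "08/12/2018"],
    "time_gap": [86, 90, 86],
    "count": [0, 0, 0],
    "average_speed": [0, 0, 0],
})

# The same expression that would be passed to df.query().
qr = "date == '07/12/2018' & time_gap == 86"

# eval() returns a boolean Series: True where the expression holds.
mask = df.eval(qr)

# Use the mask as the row selector and assign to the "count" column.
df.loc[mask, "count"] = 12
```

Keeping the mask in its own variable makes it easy to inspect which rows will be changed before assigning.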
You can see practical applications of eval here.