Skip to content
Advertisement

Pandasql Exception with OVER

I tried to use this line of code :

JavaScript

NB : delete ‘locals()’ or replace by ‘globals()’ don’t solve the problem.

But I have this error :

JavaScript

http://sqlalche.me/e/14/e3q8 (Same error with Average)

I load my dataframe like this :

JavaScript

My file is in the same directory as my Jupiter notebook file. I follow the link, but I still don’t understand what’s wrong.

And this line work :

JavaScript

So I guess is not a problem with my pandasql

My initial goal with this code is :

I want to know how many transactions a card number do for every 30 min interval, and also calculate in average how many transactions a client do per 30 minutes.

For example, I want to have something like (order by Card Number) :

Card Number interval #transactions_30min AVG_transactions_30min
536518******2108 2015-05-11 00:00:01 – 00:30:00 2 5
536518******2108 2015-05-11 00:30:01 – 01:00:00 8 5
536518******2197 2015-05-11 00:00:01 – 00:30:00 6 8
536518******2197 2015-05-11 00:30:01 – 01:00:00 10 8

My database look like this :

id Card Number Date Amount Fraud
0 536518******2108 2015-05-01 00:01:54 36.54 No
1 536518******2108 2015-05-01 00:03:46 36.54 No
2 453211******1239 2015-05-01 00:08:50 69.00 No
3 548827******1705 2015-05-01 00:27:00 193.43 No
4 531681******9778 2015-05-01 01:32:46 132.00 No
5 515117******4107 2015-05-01 02:10:26 161.00 No
6 432032******9111 2015-05-01 08:09:15 110.00 No
7 544540******7141 2015-05-01 08:30:16 159.50 No
8 554906******0358 2015-05-01 09:13:51 126.50 Yes
9 554906******0358 2015-05-01 09:15:28 126.50 Yes

So is it possible to use pandasql in my case ? If yes, how can I correct my request to make it work ?

Advertisement

Answer

If you don’t mind using pandas for all calculations, here is one approach:

JavaScript

Result:

JavaScript
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement