How can I increase the maximum query time?

I ran a query that will eventually return roughly 17M rows, reading it in chunks of 500,000. Everything seemed to be going just fine, but then I ran into the following error:

Traceback (most recent call last):
  File "sql_csv.py", line 22, in <module>
    for chunk in pd.read_sql_query(hours_query, db.conn, chunksize = 500000):
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/pandas/io/sql.py", line 1424, in _query_iterator
    data = cursor.fetchmany(chunksize)
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 546, in fetchmany
    row = self.fetchone()
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 526, in fetchone
    if not self._rs.next():
jpype._jexception.SQLExceptionPyRaisable: java.sql.SQLException: Query failed (#20171013_015410_01255_8pff8):
Query exceeded maximum time limit of 60.00m

Obviously such a query can be expected to take some time; I’m fine with this (and chunking means I know I won’t be breaking any RAM limitations — in fact the file output I was running shows the query finished 16M of the 17M rows before crashing!).
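For reference, the export loop is essentially the following (the CSV-writing details are approximate; hours_query and db.conn are the objects shown in the traceback above):

import pandas as pd

hours_query = "SELECT ..."   # the ~17M-row query
# db.conn is the existing jaydebeapi connection used in sql_csv.py
with open("hours.csv", "w") as out:   # placeholder output file
    for i, chunk in enumerate(pd.read_sql_query(hours_query, db.conn, chunksize=500000)):
        # Append each 500,000-row chunk as it arrives; only the first chunk
        # writes the header, so memory stays bounded by one chunk at a time.
        chunk.to_csv(out, header=(i == 0), index=False)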

But I don’t see any relevant option on read_sql_query for this. params seems like a decent candidate, but the jaydebeapi documentation gives no hint of what the right parameter to pass through to execute might be.

How can I overcome this and run my full query?

Answer

When executing queries, Presto restricts each query by CPU, memory, execution time, and other constraints. You have hit the execution-time limit. Before raising it, make sure the query itself is sound; otherwise you can crash the cluster.

To increase the allowed query execution time, set a higher value for the corresponding session property, for example:

SET SESSION query_max_execution_time = '2h';
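If you are driving the query from Python via jaydebeapi, one way to apply this is to execute the SET SESSION statement on the same connection before issuing the long-running query. The sketch below is only a starting point: the driver class, JDBC URL, credentials, and jar path are placeholders, and whether a SET SESSION issued through execute() takes effect can depend on the Presto JDBC driver version in use.

import jaydebeapi
import pandas as pd

# Placeholder connection details: substitute the real driver class, URL,
# credentials, and path to the Presto JDBC jar.
conn = jaydebeapi.connect(
    "com.facebook.presto.jdbc.PrestoDriver",
    "jdbc:presto://presto-host:8080/hive/default",
    ["my_user", ""],
    "/path/to/presto-jdbc.jar",
)

cur = conn.cursor()
# Raise the per-query execution-time limit for this session only; pick a
# value comfortably above the 60m limit that was hit.
cur.execute("SET SESSION query_max_execution_time = '4h'")

hours_query = "SELECT ..."   # the original 17M-row query
for chunk in pd.read_sql_query(hours_query, conn, chunksize=500000):
    ...  # process / write each 500,000-row chunk as before

If the session property is rejected or appears to have no effect, the limit may be enforced by the cluster configuration or resource groups, in which case an administrator has to raise it.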