I want to remove duplicate data from the database, This statement can be executed in navicat.
delete from Proxy_Main where (Proxy_Main.ip,Proxy_Main.port) in (select ip,port from Proxy_Main group by ip,port having count(*) > 1) and rowid not in (select min(rowid) from Proxy_Main group by ip,port having count(*)>1)
error info:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near “,”: syntax error [SQL: ‘delete from Proxy_Main where (Proxy_Main.ip,Proxy_Main.port) in (select ip,port from Proxy_Main group by ip,port having count() > 1) and rowid not in (select min(rowid) from Proxy_Main group by ip,port having count()>1)’] (Background on this error at: http://sqlalche.me/e/e3q8)
@staticmethod def execute(sql): conn = engine.connect() conn.execute(sql) conn.close() @staticmethod def deduplication(): SqlHelper.execute('delete from Proxy_Main where (Proxy_Main.ip,Proxy_Main.port) in (select ip,port from Proxy_Main group by ip,port having count(*) > 1) and rowid not in (select min(rowid) from Proxy_Main group by ip,port having count(*)>1)')
Advertisement
Answer
Row value comparison as being attempted in IN
clause is only available in SQLite 3.15+. As mentioned at very bottom of linked docs page:
Row values were added to SQLite version 3.15.0 (2016-10-14). Attempts to use row values in prior versions of SQLite will generate syntax errors.
Check your version (SELECT sqlite_version();
). Upgrade as needed since your query does work with an updated version (run the example SQL Fiddle).
Alternatively, consider a more ANSI-SQL solution (i.e., portable across RDBMS’s) using an aggregate join subquery:
DELETE FROM Proxy_Main WHERE rowid IN (SELECT p.rowid FROM Proxy_Main p INNER JOIN (SELECT ip, port, MIN(rowid) As min_id FROM Proxy_Main GROUP BY ip, port HAVING COUNT(*) > 1) AS agg ON p.ip = agg.ip AND p.port = agg.port AND p.rowid <> agg.min_id);
Fiddle Demo (press Run at top)
Do note you can pass multi-line queries in Python with triple quote strings.
@staticmethod def deduplication(): sql = """DELETE FROM Proxy_Main WHERE rowid IN (SELECT p.rowid FROM Proxy_Main p INNER JOIN (SELECT ip, port, MIN(rowid) As min_id FROM Proxy_Main GROUP BY ip, port HAVING COUNT(*) > 1) AS agg ON p.ip = agg.ip AND p.port = agg.port AND p.rowid <> agg.min_id);""" SqlHelper.execute(sql)