Skip to content
Advertisement

INSERT INTO SELECT based on a dataframe

I have a dataframe df and I want to to execute a query to insert into a table all the values from the dataframe. Basically I am trying to load as the following query:

INSERT INTO mytable
SELECT *
FROM mydataframe

For that I have the following code:

import pyodbc
import pandas as pd

connection = pyodbc.connect('Driver={' + driver + '} ;'
                            'Server=' + server + ';'
                            'UID=' + user + ';'
                            'PWD=' + pass + ';')

cursor = connection.cursor()

query = 'SELECT * FROM [myDB].[dbo].[myTable]'
df = pd.read_sql_query(query, connection)

sql = 'INSERT INTO [dbo].[new_date] SELECT * FROM :x'
cursor.execute(sql, x=df)
connection.commit()

However, I am getting the following error:

TypeError: execute() takes no keyword arguments

Does anyone know what I am doing wrong?

Advertisement

Answer

See below my favourite solution, with UPSERT statement included.

df_columns = list(df)
columns = ','.join(df_columns)
values = 'VALUES({})'.format(','.join(['%s' for col in df_columns]))
update_list = ['{} = EXCLUDED.{}'.format(col, col) for col in df_columns]
update_str = ','.join(update_list)
insert_stmt = "INSERT INTO {} ({}) {} ON CONFLICT ([your_pkey_here]) DO UPDATE SET {}".format(table, columns, values, update_str)
Advertisement