Write Large Pandas DataFrames to SQL Server database

I have 74 relatively large Pandas DataFrames (about 34,600 rows and 8 columns each) that I am trying to insert into a SQL Server database as quickly as possible. After doing some research, I learned that the good ole pandas.to_sql function is not good for such large inserts into a SQL Server database, which was the initial approach that I took (very slow: almost an hour for the application to complete, versus about 4 minutes when using a MySQL database).
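
For reference, the baseline being compared here would be a plain to_sql call per DataFrame, roughly like the sketch below; the connection string and table name are placeholders, not values from the original question:

    import pandas as pd
    import sqlalchemy as sa

    engine = sa.create_engine("mssql+pyodbc://user:password@my_dsn")  # placeholder connection
    df = pd.DataFrame({"reading_time": ["2014-01-01 00:00"], "value": [1.5]})

    # Baseline approach: to_sql ends up issuing individual INSERT statements,
    # which is what made it slow against SQL Server.
    df.to_sql("hourly_data", engine, if_exists="append", index=False)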

This article and many other StackOverflow posts have been helpful in pointing me in the right direction; however, I’ve hit a roadblock:

I am trying to use SQLAlchemy’s Core rather than the ORM, for the reasons explained in the link above. So I am converting the DataFrame to a dictionary using pandas.to_dict and then doing an execute() with an insert():

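The original snippet was not preserved on this page; a minimal sketch of that pattern, with a placeholder connection string, table, and columns, might look like this:

    import pandas as pd
    import sqlalchemy as sa

    engine = sa.create_engine("mssql+pyodbc://user:password@my_dsn")  # placeholder
    metadata = sa.MetaData()
    table = sa.Table(
        "hourly_data", metadata,                 # placeholder for the real table
        sa.Column("reading_time", sa.String(20)),
        sa.Column("value", sa.Float),
    )

    df = pd.DataFrame({
        "reading_time": ["2014-01-01 00:00", "2014-01-01 01:00"],
        "value": [1.5, 2.5],
    })

    # Core's executemany form expects a list of per-row dicts,
    # which is what orient="records" produces.
    records = df.to_dict(orient="records")

    with engine.begin() as conn:
        conn.execute(table.insert(), records)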

The problem is that the insert is not getting any values; they appear as a bunch of empty parentheses and I get an error.


There are values in the list of dictionaries that I passed in, so I can’t figure out why the values aren’t showing up.

EDIT:

Here’s the example I’m going off of:

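That example also was not preserved here; the Core bulk-insert pattern articles like the one linked typically demonstrate boils down to a single execute() call given an insert() construct plus a list of dictionaries, roughly as follows (reusing the placeholder engine and table from the sketch above, with made-up data):

    # One execute() call with table.insert() and a list of row dictionaries,
    # so the DBAPI can use executemany instead of one round trip per row.
    with engine.begin() as conn:
        conn.execute(
            table.insert(),
            [{"reading_time": "2014-01-01 00:00", "value": float(i)} for i in range(10_000)],
        )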


Answer

I’ve got some sad news for you: SQLAlchemy doesn’t actually implement bulk imports for SQL Server; it’s just going to issue the same slow individual INSERT statements that to_sql is doing. I would say that your best bet is to try to script something up using the bcp command-line tool. Here is a script that I’ve used in the past, but no guarantees:

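The answerer’s script was not preserved on this page either; a minimal sketch of the bcp approach (dump the DataFrame to a delimited file, then shell out to bcp) might look like the following, where the server, database, table name, and credentials are all placeholders:

    import os
    import subprocess
    import tempfile
    import pandas as pd

    def bcp_insert(df: pd.DataFrame, table: str, server: str, database: str,
                   user: str, password: str) -> None:
        """Bulk-load a DataFrame into SQL Server with the bcp utility (sketch)."""
        # Dump the frame to a tab-delimited file with no header or index,
        # which lines up with bcp's character mode (-c) defaults.
        fd, data_file = tempfile.mkstemp(suffix=".tsv")
        os.close(fd)
        df.to_csv(data_file, sep="\t", index=False, header=False)

        cmd = [
            "bcp", f"{database}.dbo.{table}", "in", data_file,
            "-S", server, "-U", user, "-P", password,
            "-c",  # character mode: tab-separated fields
        ]
        try:
            subprocess.run(cmd, check=True)
        finally:
            os.remove(data_file)

    # Example call with placeholder values:
    # bcp_insert(df, "hourly_data", "myserver", "mydb", "loader", "secret")
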
User contributions licensed under: CC BY-SA