How to upsert pandas DataFrame to PostgreSQL table?


I’ve scraped some data from web sources and stored it all in a pandas DataFrame. Now, to harness the powerful database tools that SQLAlchemy provides, I want to convert the DataFrame into a Table() object and eventually upsert all of its data into a PostgreSQL table. Is this practical, and if so, what is a workable way to do it?

Answer

If you are using PostgreSQL 9.5 or later, you can perform the UPSERT by uploading the DataFrame to a temporary table and then merging it into the target table with an INSERT ... ON CONFLICT statement:

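import pandas as pd
import sqlalchemy as sa

# the connection URL below is a placeholder; substitute your own
engine = sa.create_engine("postgresql+psycopg2://user:password@localhost/mydb")

with engine.begin() as conn: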
    # step 0.0 - create test environment
    conn.execute(sa.text("DROP TABLE IF EXISTS main_table"))
    conn.execute(
        sa.text(
            "CREATE TABLE main_table (id int primary key, txt varchar(50))"
        )
    )
    conn.execute(
        sa.text(
            "INSERT INTO main_table (id, txt) VALUES (1, 'row 1 old text')"
        )
    )
    # step 0.1 - create DataFrame to UPSERT
    df = pd.DataFrame(
        [(2, "new row 2 text"), (1, "row 1 new text")], columns=["id", "txt"]
    )
    
    # step 1 - create temporary table and upload DataFrame
    conn.execute(
        sa.text(
            "CREATE TEMPORARY TABLE temp_table (id int primary key, txt varchar(50))"
        )
    )
    df.to_sql("temp_table", conn, index=False, if_exists="append")

    # step 2 - merge temp_table into main_table
    conn.execute(
        sa.text("""
            INSERT INTO main_table (id, txt) 
            SELECT id, txt FROM temp_table
            ON CONFLICT (id) DO
                UPDATE SET txt = EXCLUDED.txt
            """
        )
    )

    # step 3 - confirm results
    result = conn.execute(sa.text("SELECT * FROM main_table ORDER BY id")).fetchall()
    print(result)  # [(1, 'row 1 new text'), (2, 'new row 2 text')]
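
The merge statement in step 2 is written by hand for the columns id and txt. For other tables, the same statement could be generated from the DataFrame's column list. The following is only a minimal sketch: build_upsert_sql is a hypothetical helper (it is not part of pandas or SQLAlchemy), and it assumes the key columns are covered by a primary key or unique constraint on the target table, and that table and column names come from trusted code rather than user input.

```python
def build_upsert_sql(main_table, temp_table, columns, key_columns):
    """Build an INSERT ... ON CONFLICT statement that merges temp_table into main_table.

    Hypothetical helper for illustration only.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    missing = [k for k in key_columns if k not in columns]
    if missing:
        raise ValueError(f"key columns not in column list: {missing}")

    def quote(name):
        # double-quote identifiers, escaping any embedded double quotes
        return '"' + name.replace('"', '""') + '"'

    col_list = ", ".join(quote(c) for c in columns)
    key_list = ", ".join(quote(k) for k in key_columns)

    # every non-key column is overwritten with the incoming (EXCLUDED) value
    update_cols = [c for c in columns if c not in key_columns]
    if update_cols:
        assignments = ", ".join(
            f"{quote(c)} = EXCLUDED.{quote(c)}" for c in update_cols
        )
        action = f"DO UPDATE SET {assignments}"
    else:
        # nothing to update when the table consists of key columns only
        action = "DO NOTHING"

    return (
        f"INSERT INTO {quote(main_table)} ({col_list}) "
        f"SELECT {col_list} FROM {quote(temp_table)} "
        f"ON CONFLICT ({key_list}) {action}"
    )


sql = build_upsert_sql("main_table", "temp_table", ["id", "txt"], ["id"])
print(sql)
```

In the example above, step 2 could then become conn.execute(sa.text(sql)), with the column list taken from list(df.columns). Note that quoted identifiers are case-sensitive in PostgreSQL, so the names passed in must match the table definition exactly.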


Source: stackoverflow