Inserting into a Postgres table with a foreign key using Python

I am trying to insert data from a JSON file into my Postgres table with Python (psycopg2). The JSON is huge, and I am wondering what the proper way is to insert all of this data.

What makes it difficult for me is that the table references another table: I want to look up the id in that other table based on one of the values in my JSON.

Details of my question:

CREATE TABLE market.entities (
    id serial PRIMARY KEY,
    code TEXT NOT NULL,
    name TEXT NOT NULL,
    country TEXT NOT NULL,
    exchange TEXT,
    currency_code TEXT NOT NULL,
    "type" TEXT,
    isin TEXT,
    api_source int REFERENCES market.source_api(id) NOT NULL,
    imported_at date NOT NULL
);

This entities table is already filled with data like this:

id|code|name|country|exchange|currency_code|type|isin|api_source|imported_at
----------------------------------------------------------------------------
1|A|Agilent Technologies, Inc|USA|NYSE|USD|Common Stock|US00846U1016|1|2021-07-17
----------------------------------------------------------------------------

The table that I want to fill with the data from the json:

CREATE TABLE IF NOT EXISTS market.end_of_days(
    id serial PRIMARY KEY,
    entity_id int REFERENCES market.entities(id),
    import_type int REFERENCES market.import_types(id),
    source_api int REFERENCES market.source_api(id),
    date date,
    open int,
    high int,
    low int,
    close int,
    adjusted_close int,
    volume int,
    imported_at date,
    UNIQUE (date, source_api, entity_id)
    );

Part of the JSON:

[
    {
        "code": "ONTRF",
        "exchange_short_name": "US",
        "date": "2021-07-08",
        "open": 0.1393,
        "high": 0.1393,
        "low": 0.1393,
        "close": 0.1393,
        "adjusted_close": 0.1393,
        "volume": 0
    },
    {
        "code": "ONTX",
        "exchange_short_name": "US",
        "date": "2021-07-08",
        "open": 5.72,
        "high": 6.19,
        "low": 5.7,
        "close": 6.07,
        "adjusted_close": 6.07,
        "volume": 324700
    }
]

The “code” key in the JSON is already present in the entities table (as entities.code), so for end_of_days.entity_id I want to refer to the id from the entities table based on this “code” value.

Is there a pretty solution for this with psycopg2? Normally I would just use psycopg2.extras.execute_values(), but I think that won't work in this specific situation.
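
To make it concrete: the best I have come up with so far is a row-by-row insert that looks up entity_id with a subquery on entities.code, roughly like the sketch below (file name and connection string are placeholders). That does not feel like the right approach for a JSON this big.

import json
import psycopg2

with open("eod_prices.json") as f:   # placeholder file name
    records = json.load(f)

con = psycopg2.connect("...")        # connection details omitted
cur = con.cursor()

# One INSERT per record; entity_id is looked up from market.entities by code,
# and imported_at is simply set to today.
sql = """
    INSERT INTO market.end_of_days
        (entity_id, date, open, high, low, close, adjusted_close, volume, imported_at)
    VALUES ((SELECT id FROM market.entities WHERE code = %(code)s),
            %(date)s, %(open)s, %(high)s, %(low)s, %(close)s,
            %(adjusted_close)s, %(volume)s, CURRENT_DATE)
"""

for rec in records:
    cur.execute(sql, rec)
con.commit()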

(P.S. This is my first time posting on Stack Overflow, so if I need to specify something or structure my question differently, please let me know.)


Answer

This is not really an answer, but a suggestion for a possible way to handle this. While figuring this out, break the JSON data into smaller chunks for testing purposes. The idea:

Create a staging table into which you INSERT the JSON from the file(s) directly, as jsonb fields, using the JSON support in psycopg2. Then you can use the JSON processing functions from “Table 9.47. JSON Processing Functions” in the Postgres docs to transform the data and move it to market.end_of_days.
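
Roughly, that could look like the sketch below. The staging table name json_staging, the file name, and the choice of jsonb_to_recordset() are mine, not from the question; the connection string is a placeholder.

import json
import psycopg2
from psycopg2.extras import Json

con = psycopg2.connect("...")
cur = con.cursor()

# Staging table: one jsonb document per row.
cur.execute("CREATE TABLE IF NOT EXISTS json_staging (doc jsonb)")

with open("eod_prices.json") as f:   # placeholder file name
    cur.execute("INSERT INTO json_staging (doc) VALUES (%s)", [Json(json.load(f))])

# Expand the array with jsonb_to_recordset() (one of the Table 9.47 functions)
# and resolve entity_id by joining on market.entities.code.
# Note: end_of_days declares the price columns as int, so values such as
# 0.1393 will be rounded unless those columns are changed to numeric.
cur.execute("""
    INSERT INTO market.end_of_days
        (entity_id, date, open, high, low, close, adjusted_close, volume, imported_at)
    SELECT e.id, r.date, r.open, r.high, r.low, r.close,
           r.adjusted_close, r.volume, CURRENT_DATE
    FROM json_staging s
    CROSS JOIN LATERAL jsonb_to_recordset(s.doc) AS r(
        code text, exchange_short_name text, "date" date,
        open numeric, high numeric, low numeric, close numeric,
        adjusted_close numeric, volume numeric)
    JOIN market.entities e ON e.code = r.code
""")
con.commit()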

UPDATE

I found it easier to do something like:

CREATE TABLE symbol_import
    (code varchar, 
     exchange_short_name varchar, 
     "date" date, 
     open numeric, 
     high numeric, 
     low numeric, 
     close numeric, 
     adjusted_close numeric, 
     volume numeric);

import json
import psycopg2
from psycopg2.extras import execute_batch
json_val = """[
    {
        "code": "ONTRF",
        "exchange_short_name": "US",
        "date": "2021-07-08",
        "open": 0.1393,
        "high": 0.1393,
        "low": 0.1393,
        "close": 0.1393,
        "adjusted_close": 0.1393,
        "volume": 0
    },
    {
        "code": "ONTX",
        "exchange_short_name": "US",
        "date": "2021-07-08",
        "open": 5.72,
        "high": 6.19,
        "low": 5.7,
        "close": 6.07,
        "adjusted_close": 6.07,
        "volume": 324700
    }
]"""

data_dict = json.loads(json_val)

con = psycopg2.connect("...")
cur = con.cursor()

# One row per dict; execute_batch() runs the INSERT once per record,
# filling the named placeholders from each dict's keys.
data_insert = "insert into symbol_import values(%(code)s, %(exchange_short_name)s, %(date)s, %(open)s, %(high)s, %(low)s, %(close)s, %(adjusted_close)s, %(volume)s)"

execute_batch(cur, data_insert, data_dict)
con.commit()
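
From there the staged rows still have to be moved into market.end_of_days. Something along these lines should do it, resolving entity_id with a join on market.entities.code (a sketch based on the schemas in the question):

# Final step (sketch): copy the staged rows into market.end_of_days,
# looking up entity_id via entities.code. import_type and source_api are
# left NULL here and would need to be filled in for the real import.
move_sql = """
    INSERT INTO market.end_of_days
        (entity_id, date, open, high, low, close, adjusted_close, volume, imported_at)
    SELECT e.id, s.date, s.open, s.high, s.low, s.close,
           s.adjusted_close, s.volume, CURRENT_DATE
    FROM symbol_import s
    JOIN market.entities e ON e.code = s.code
"""
cur.execute(move_sql)
con.commit()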