I am using SQLAlchemy to read data from a SQL Server database and then turning the table data to a csv file to later hand off.
However, I noticed when there is a 1 or 0 in a SQL Server table field, the csv output has True or False instead. I know to Python, it’s still a number since True == 1
returns True
Is it possible to not have this happen from the get go and have the csv keep 1 or 0?
This my current code:
import pandas as pd from sqlalchemy import create_engine pd.set_option('display.max_columns', None) pd.set_option('max_colwidth', None) df = pd.read_csv(r'''C:UsersusernameDownloadspy_tma_tables.csv''') tma_table = df['table_name'].tolist() servername = 'SERVER' dbname = 'DATABASE' sqlcon = create_engine('mssql+pyodbc://@' + servername + '/' + dbname + '?driver=ODBC+Driver+17+for+SQL+Server') df_list = [] count = 0 while count < 1: df1 = pd.read_sql_query("SELECT * FROM " + tma_table[count], sqlcon) df_list.append(df1) df_list[count].to_csv(tma_table[count] + ".csv", index=False, header=None, encoding='utf-8') count += 1
I have about 450 tables that this would be applied to so single table solutions aren’t going to work as I need an automated way to have every table follow this rule I’m aiming for.
I started to go down the route of trying to loop through each column and change the dtype of the column but it seems easier to not have to replace True or False to 1 or 0 in the first place.
dtypes output:
cst_pk int64 cst_code object cst_name object cst_clnt_fk int64 cst_active bool cst_encumbered object cst_purgeDate object cst_splitBill bool cst_subLabor_fk object cst_subParts_fk float64 cst_subOther_fk object cst_subContract bool cst_subContractLabor_fk object cst_subContractParts_fk object cst_subContractOther_fk object cst_balanceType object cst_normalBalance object cst_ay_fk int64 cst_header bool cst_beginningBalance object cst_alias object cst_modifier_fk float64 cst_modifiedDate datetime64[ns] cst_creator_fk float64 cst_createddate datetime64[ns] cst_curr_fk object cst_exch_fk object cst_exch_date object cst_ag_fk object cst_dp_fk float64 cst_alternateAccount object dtype: object
SQL CREATE TABLE query:
CREATE TABLE [dbo].[f_account]( [cst_pk] [int] NOT NULL, [cst_code] [nvarchar](100) NOT NULL, [cst_name] [nvarchar](35) NOT NULL, [cst_clnt_fk] [int] NOT NULL, [cst_active] [bit] NOT NULL, [cst_encumbered] [decimal](10, 2) NULL, [cst_purgeDate] [datetime] NULL, [cst_splitBill] [bit] NOT NULL, [cst_subLabor_fk] [int] NULL, [cst_subParts_fk] [int] NULL, [cst_subOther_fk] [int] NULL, [cst_subContract] [bit] NOT NULL, [cst_subContractLabor_fk] [int] NULL, [cst_subContractParts_fk] [int] NULL, [cst_subContractOther_fk] [int] NULL, [cst_balanceType] [nvarchar](20) NULL, [cst_normalBalance] [nvarchar](20) NULL, [cst_ay_fk] [int] NULL, [cst_header] [bit] NOT NULL, [cst_beginningBalance] [decimal](10, 2) NULL, [cst_alias] [nvarchar](32) NULL, [cst_modifier_fk] [int] NULL, [cst_modifiedDate] [datetime] NULL, [cst_creator_fk] [int] NULL, [cst_createddate] [datetime] NULL, [cst_curr_fk] [int] NULL, [cst_exch_fk] [int] NULL, [cst_exch_date] [datetime] NULL, [cst_ag_fk] [int] NULL, [cst_dp_fk] [int] NULL, [cst_alternateAccount] [nvarchar](100) NULL
Advertisement
Answer
As explained in
https://github.com/mkleehammer/pyodbc/issues/383
pyodbc returns bit
columns as boolean values because that is what most people use bit
columns for.
import pandas as pd import sqlalchemy as sa engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199") table_data = """ SELECT -1 AS id, CAST(NULL AS bit) AS bit_col UNION ALL SELECT 0 AS id, CAST(0 AS bit) AS bit_col UNION ALL SELECT 1 AS id, CAST(1 AS bit) AS bit_col """ df = pd.read_sql_query(table_data, engine) print(df) """ id bit_col 0 -1 None 1 0 False 2 1 True """
If you want pyodbc to return bit
columns as some other type you can use an output converter function as illustrated in the GitHub issue. The trick is getting SQLAlchemy to use it. That is done using an event listener.
import pandas as pd import pyodbc import sqlalchemy as sa engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199") table_data = """ SELECT -1 AS id, CAST(NULL AS bit) AS bit_col UNION ALL SELECT 0 AS id, CAST(0 AS bit) AS bit_col UNION ALL SELECT 1 AS id, CAST(1 AS bit) AS bit_col """ def handle_bit_type(bit_value): return bit_value @sa.event.listens_for(engine, "connect") def connect(conn, rec): conn.add_output_converter(pyodbc.SQL_BIT, handle_bit_type) df = pd.read_sql_query(table_data, engine) print(df) """ id bit_col 0 -1 None 1 0 b'x00' 2 1 b'x01' """
Edit: Or, if you use
def handle_bit_type(bit_value): if bit_value is None: rtn = None elif bit_value == b"x00": rtn = "0" else: rtn = "1" return rtn
you’ll get
df = pd.read_sql_query(table_data, engine) print(df) """ id bit_col 0 -1 None 1 0 0 2 1 1 """