I am using SQLAlchemy to read data from a SQL Server database and then turning the table data to a csv file to later hand off.
However, I noticed when there is a 1 or 0 in a SQL Server table field, the csv output has True or False instead. I know to Python, it’s still a number since True == 1 returns True
Is it possible to not have this happen from the get go and have the csv keep 1 or 0?
This my current code:
import pandas as pd
from sqlalchemy import create_engine
pd.set_option('display.max_columns', None)
pd.set_option('max_colwidth', None)
df = pd.read_csv(r'''C:UsersusernameDownloadspy_tma_tables.csv''')
tma_table = df['table_name'].tolist()
servername = 'SERVER'
dbname = 'DATABASE'
sqlcon = create_engine('mssql+pyodbc://@' + servername + '/' + dbname + '?driver=ODBC+Driver+17+for+SQL+Server')
df_list = []
count = 0
while count < 1:
df1 = pd.read_sql_query("SELECT * FROM " + tma_table[count], sqlcon)
df_list.append(df1)
df_list[count].to_csv(tma_table[count] + ".csv", index=False, header=None, encoding='utf-8')
count += 1
I have about 450 tables that this would be applied to so single table solutions aren’t going to work as I need an automated way to have every table follow this rule I’m aiming for.
I started to go down the route of trying to loop through each column and change the dtype of the column but it seems easier to not have to replace True or False to 1 or 0 in the first place.
dtypes output:
cst_pk int64 cst_code object cst_name object cst_clnt_fk int64 cst_active bool cst_encumbered object cst_purgeDate object cst_splitBill bool cst_subLabor_fk object cst_subParts_fk float64 cst_subOther_fk object cst_subContract bool cst_subContractLabor_fk object cst_subContractParts_fk object cst_subContractOther_fk object cst_balanceType object cst_normalBalance object cst_ay_fk int64 cst_header bool cst_beginningBalance object cst_alias object cst_modifier_fk float64 cst_modifiedDate datetime64[ns] cst_creator_fk float64 cst_createddate datetime64[ns] cst_curr_fk object cst_exch_fk object cst_exch_date object cst_ag_fk object cst_dp_fk float64 cst_alternateAccount object dtype: object
SQL CREATE TABLE query:
CREATE TABLE [dbo].[f_account](
[cst_pk] [int] NOT NULL,
[cst_code] [nvarchar](100) NOT NULL,
[cst_name] [nvarchar](35) NOT NULL,
[cst_clnt_fk] [int] NOT NULL,
[cst_active] [bit] NOT NULL,
[cst_encumbered] [decimal](10, 2) NULL,
[cst_purgeDate] [datetime] NULL,
[cst_splitBill] [bit] NOT NULL,
[cst_subLabor_fk] [int] NULL,
[cst_subParts_fk] [int] NULL,
[cst_subOther_fk] [int] NULL,
[cst_subContract] [bit] NOT NULL,
[cst_subContractLabor_fk] [int] NULL,
[cst_subContractParts_fk] [int] NULL,
[cst_subContractOther_fk] [int] NULL,
[cst_balanceType] [nvarchar](20) NULL,
[cst_normalBalance] [nvarchar](20) NULL,
[cst_ay_fk] [int] NULL,
[cst_header] [bit] NOT NULL,
[cst_beginningBalance] [decimal](10, 2) NULL,
[cst_alias] [nvarchar](32) NULL,
[cst_modifier_fk] [int] NULL,
[cst_modifiedDate] [datetime] NULL,
[cst_creator_fk] [int] NULL,
[cst_createddate] [datetime] NULL,
[cst_curr_fk] [int] NULL,
[cst_exch_fk] [int] NULL,
[cst_exch_date] [datetime] NULL,
[cst_ag_fk] [int] NULL,
[cst_dp_fk] [int] NULL,
[cst_alternateAccount] [nvarchar](100) NULL
Advertisement
Answer
As explained in
https://github.com/mkleehammer/pyodbc/issues/383
pyodbc returns bit columns as boolean values because that is what most people use bit columns for.
import pandas as pd
import sqlalchemy as sa
engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 False
2 1 True
"""
If you want pyodbc to return bit columns as some other type you can use an output converter function as illustrated in the GitHub issue. The trick is getting SQLAlchemy to use it. That is done using an event listener.
import pandas as pd
import pyodbc
import sqlalchemy as sa
engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
def handle_bit_type(bit_value):
return bit_value
@sa.event.listens_for(engine, "connect")
def connect(conn, rec):
conn.add_output_converter(pyodbc.SQL_BIT, handle_bit_type)
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 b'x00'
2 1 b'x01'
"""
Edit: Or, if you use
def handle_bit_type(bit_value):
if bit_value is None:
rtn = None
elif bit_value == b"x00":
rtn = "0"
else:
rtn = "1"
return rtn
you’ll get
df = pd.read_sql_query(table_data, engine) print(df) """ id bit_col 0 -1 None 1 0 0 2 1 1 """