Import mssql spatial fields into geopandas/shapely geometry

Question

I cannot seem to be able to directly import mssql spatial fields into geopandas. I can import normal mssql tables into pandas with Pymssql without problems, but I cannot figure out a way to import the spatial fields into shapely geometry. I know that the OGR driver for mssql should be able to handle it, but I'm not skilled enough

Accepted Answer

I figured it out by properly querying the sql database table and converting the wkt string to shapely geometry via the loads function in shapely.wkt.I&#8217;m no programmer, so bear that in mind with the organization of the function. The function can import mssql tables with or without GIS geometry.from pymssql import connectfrom pandas import read_sqlfrom shapely.wkt import loadsfrom geopandas import GeoDataFramedef rd_sql(server, database, table, col_names=None, where_col=None, where_val=None, geo_col=False, epsg=2193, export=False, path='save.csv'):    """    Imports data from MSSQL database, returns GeoDataFrame. Specific columns can be selected and specific queries within columns can be selected. Requires the pymssql package, which must be separately installed.    Arguments:    server -- The server name (str). e.g.: 'SQL2012PROD03'    database -- The specific database within the server (str). e.g.: 'LowFlows'    table -- The specific table within the database (str). e.g.: 'LowFlowSiteRestrictionDaily'    col_names -- The column names that should be retrieved (list). e.g.: ['SiteID', 'BandNo', 'RecordNo']    where_col -- The sql statement related to a specific column for selection (must be formated according to the example). e.g.: 'SnapshotType'    where_val -- The WHERE query values for the where_col (list). e.g. ['value1', 'value2']    geo_col -- Is there a geometry column in the table?    epsg -- The coordinate system (int)    export -- Should the data be exported    path -- The path and csv name for the export if 'export' is True (str)    """    if col_names is None and where_col is None:        stmt1 = 'SELECT * FROM ' + table    elif where_col is None:        stmt1 = 'SELECT ' + str(col_names).replace(''', '"')[1:-1] + ' FROM ' + table    else:        stmt1 = 'SELECT ' + str(col_names).replace(''', '"')[1:-1] + ' FROM ' + table + ' WHERE ' + str([where_col]).replace(''', '"')[1:-1] + ' IN (' + str(where_val)[1:-1] + ')'    conn = connect(server, database=database)    df = read_sql(stmt1, conn)    ## Read in geometry if required    if geo_col:        geo_col_stmt = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=" + "'" + table + "'" + " AND DATA_TYPE='geometry'"        geo_col = str(read_sql(geo_col_stmt, conn).iloc[0,0])        if where_col is None:            stmt2 = 'SELECT ' + geo_col + '.STGeometryN(1).ToString()' + ' FROM ' + table        else:            stmt2 = 'SELECT ' + geo_col + '.STGeometryN(1).ToString()' + ' FROM ' + table + ' WHERE ' + str([where_col]).replace(''', '"')[1:-1] + ' IN (' + str(where_val)[1:-1] + ')'        df2 = read_sql(stmt2, conn)        df2.columns = ['geometry']        geometry = [loads(x) for x in df2.geometry]        df = GeoDataFrame(df, geometry=geometry, crs={'init' :'epsg:' + str(epsg)})    if export:        df.to_csv(path, index=False)    conn.close()    return(df)EDIT: Made the function automatically find the geometry field if one exists.

Advertisement

Answer