Tag: sql

Pyspark: How to flatten nested arrays by merging values in spark

apache-spark apache-spark-sql pyspark python sql

I have 10000 JSON files with different ids, and each has 10000 names. How do I flatten nested arrays by merging values by int or str in PySpark? EDIT: I have added the column name_10000_xvz to better explain the data structure. I have also updated the Notes, input df, required output df and input JSON files. Notes: Input dataframe has more than 10000 columns name_1_a,

How to clean data so that the correct arrival code is there for the city pair?

dataframe pandas python sql

How to clean data so that the correct arrival code is there for the city pair? From the picture, the CSV is like column 1: City Pair (Departure – Arrival), column 2 is meant to be the Departure Code, and column 3 is meant to be the Arrival Code. As you can see for row 319 in the first column,

How to create new table with first name only in table

apache-spark-sql mysql python sql

I have some data that looks like this: I’d like to create a new table with the name column but with the first name only. Answer: This gets the first substring before the space character in name as first_name. Resulting first_name column: Arizona, Emerald
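The answer's idea (keep everything before the first space) can be sketched as follows. This is a minimal illustration, not the original answer's query: it uses Python's built-in sqlite3 module instead of MySQL or Spark SQL, and the table names `people` and `first_names` and the sample rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

# Hypothetical sample rows; the surnames are invented for illustration.
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Arizona Smith",), ("Emerald Jones",), ("Cher",)],
)

# Appending a space before calling instr() means a name without any space
# is returned whole instead of as an empty string.
conn.execute(
    """
    CREATE TABLE first_names AS
    SELECT substr(name, 1, instr(name || ' ', ' ') - 1) AS first_name
    FROM people
    """
)

first_names = [
    row[0]
    for row in conn.execute("SELECT first_name FROM first_names ORDER BY rowid")
]
print(first_names)
```

In MySQL the same effect is usually written with SUBSTRING_INDEX(name, ' ', 1); the substr/instr form above is just the SQLite way of expressing it.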

DataFrame comparison with SQL Server table and upload just the differences

dataframe pandas python sql sql-server

I have an SQL table (table_1) that contains data, and I have a Python script that reads a CSV and creates a dataframe. I want to compare the dataframe with the SQL table data and then insert the missing data from the dataframe into the SQL table. I looked around and read this: comparing pandas dataframe with sqlite table via

Iterating SQL query inside python loop and changing the value of date function in SQL query with every loop

python sql vertica vertica-python

I have a SQL query which I want to iterate using a Python for loop. Is there a way I can define a variable inside the SQL query and update its value with each Python loop? date1 = datetime.date(2017, 1, 1) date2 = datetime.date(2017, 12, 31) for d in daterange(date1, date2): SQL = "SELECT * FROM table WHERE TABLE.CREATED_AT =
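One common way to approach this is to keep the SQL text fixed and pass the date as a bound parameter on each iteration. The sketch below assumes an in-memory SQLite database in place of Vertica, a hypothetical `events` table with made-up rows, and a `daterange` helper that is inclusive of both ends (the question's own helper is not shown, so that is a guess).

```python
import datetime
import sqlite3


def daterange(start, end):
    """Yield every date from start to end, inclusive (assumed behaviour)."""
    for offset in range((end - start).days + 1):
        yield start + datetime.timedelta(days=offset)


# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2017-01-01"), (2, "2017-01-02"), (3, "2017-01-02")],
)

# The query text never changes; only the bound value does.
sql = "SELECT id FROM events WHERE created_at = ? ORDER BY id"

rows_per_day = {}
for d in daterange(datetime.date(2017, 1, 1), datetime.date(2017, 1, 3)):
    day = d.isoformat()
    rows_per_day[day] = [row[0] for row in conn.execute(sql, (day,))]

print(rows_per_day)
```

The placeholder syntax (`?` here) varies by driver, so check the driver's documentation for the exact form. Binding the value rather than formatting it into the string also avoids quoting problems with dates.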

Snowflake table created with SQLAlchemy requires quotes (“”) to query

pandas python snowflake-cloud-data-platform sql sqlalchemy

I am ingesting data into Snowflake tables using Python and SQLAlchemy. The tables I have created all require quotation marks around both the table name and the column names in queries. For example, select * from "database"."schema"."table" where "column" = 2; will run, while select * from database.schema.table where column = 2; will not run. The difference being the quotes. I

Python and SQL: Getting rows from csv results in ERROR: “There are more columns in the INSERT statement than values specified in the VALUES clause.”

pyodbc python sql

I have a CSV file with several records that I am trying to import into a SQL table via a Python script. My CSV file is now reduced to just one row of 1s. Here is what I am trying to do (after successfully connecting to the database etc etc…): No matter how I format the data in the csv (right
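The question's code is not shown, so the cause of the error is unknown. One frequent source of this kind of column/value mismatch is an INSERT whose column list and placeholder list are built separately. A minimal sketch that derives both from the CSV header is given below; it assumes an in-memory SQLite database instead of SQL Server, and the table `target`, its columns and the CSV text are made up.

```python
import csv
import io
import sqlite3

# Made-up CSV content: a header line and one row of 1s.
csv_text = "a,b,c\n1,1,1\n"
reader = csv.reader(io.StringIO(csv_text))
header = next(reader)

# In-memory SQLite stands in for the real SQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (a INTEGER, b INTEGER, c INTEGER)")

# Build the column list and the placeholders from the same header so the
# two counts cannot drift apart. The header is trusted input here; column
# names cannot be bound as parameters.
columns = ", ".join(header)
placeholders = ", ".join("?" for _ in header)
insert_sql = f"INSERT INTO target ({columns}) VALUES ({placeholders})"

for row in reader:
    if len(row) != len(header):
        raise ValueError(f"expected {len(header)} values, got {len(row)}")
    conn.execute(insert_sql, row)

inserted = conn.execute("SELECT a, b, c FROM target").fetchall()
print(insert_sql)
print(inserted)
```

pyodbc also uses `?` as its parameter marker, so the same INSERT string shape should carry over, with the connection and table swapped for the real ones.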

Python – Store cryptography keys in SQL database

mariadb python python-3.x sql

I am working on a “Password Saver” and will be using the module “cryptography” to encrypt the passwords. I need to save the key generated by cryptography in the database as well, but I am not sure how to actually do this. I did some Google searches myself and it seems to be called a “byte string”? Not really sure what it
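A key of this kind is a Python bytes object, and a bytes object can be stored in a binary column and read back unchanged. The sketch below is only an illustration under several assumptions: it uses the built-in sqlite3 module instead of MariaDB, a hypothetical `app_keys` table, and a stand-in key built with `os.urandom` so that it runs without the cryptography package installed.

```python
import base64
import os
import sqlite3

# Stand-in for a key from the cryptography package: Fernet.generate_key()
# returns 32 random bytes encoded as URL-safe base64, i.e. a bytes object.
key = base64.urlsafe_b64encode(os.urandom(32))

# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_keys (id INTEGER PRIMARY KEY, key_bytes BLOB NOT NULL)"
)

# Bind the bytes as a parameter; the driver stores them as binary data.
conn.execute("INSERT INTO app_keys (id, key_bytes) VALUES (?, ?)", (1, key))

stored = conn.execute(
    "SELECT key_bytes FROM app_keys WHERE id = ?", (1,)
).fetchone()[0]

print(type(stored).__name__, len(stored), stored == key)
```

In MariaDB a BLOB or VARBINARY column would play the same role as the BLOB column here. Where the key is kept relative to the data it protects is a separate design decision that this sketch does not address.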

Import data from python (problem with where condition)

database import python sql where-clause

I work in Python and have code that imports a dataset, which works fine. However, my dataset contains 3 different patients and I would like to import only the patient that interests me (possibly by adding a WHERE clause to the SQL query). So the following code works: It returns the patient 14 data. But