How to store CSV file in database?

Question

There is a output file from Python Pandas with a lot of columns with headers. I need to be able handle this file by script and get CSV files in different columns positions. For example, initial file has columns As variation I need to get it in different sequence: I wonder what is the best way to store this file

Accepted Answer

When you need to load data into the table from CSV file in which the column ordering is not definite then you may use the code similar to below one:SET @name1 = NULL;SET @name2 = NULL;SET @name3 = NULL;LOAD DATA INFILE 'file_path_name'INTO TABLE table_name{loading options, but no line ignoring}(@column1, @column2, @column3)SET    @name1 = COALESCE(@name1, @column1),    @name2 = COALESCE(@name2, @column2),    @name3 = COALESCE(@name3, @column3),    Name = CASE WHEN (@name1, @name2, @name3) = (@column1,@column2,@column3) THEN NULL                WHEN @name1 = 'Name' Then @column1                WHEN @name2 = 'Name' Then @column2                WHEN @name3 = 'Name' Then @column3                ELSE NULL                END,    Age = CASE WHEN (@name1, @name2, @name3) = (@column1,@column2,@column3) THEN NULL               WHEN @name1 = 'Age' Then @column1               WHEN @name2 = 'Age' Then @column2               WHEN @name3 = 'Age' Then @column3               ELSE NULL               END,    Country = CASE WHEN (@name1, @name2, @name3) = (@column1,@column2,@column3) THEN NULL                   WHEN @name1 = 'Country' Then @column1                   WHEN @name2 = 'Country' Then @column2                   WHEN @name3 = 'Country' Then @column3                   ELSE NULL                   END;DELETE FROM table_name WHERE Name IS NULL AND Age IS NULL AND Country IS NULL;                   Three variables are cleared.Then the file is loaded and parsed with input preprocessing line by line.First line of CSV contains columns names. First 3 assiginings stores the columns ordering in the variables. For all next rows these value won&#8217;t be altered.Then we fill the table columns values by selecting the value which matches the column from correct variable with according CASEs. The most first condition in each CASE will fire again for the first line only. We cannot skip line in SET clause but we can set the values to some incorrect predefined values for future detection. These values shouldn&#8217;t be NULL, for example, you may set Name to &#8216;This row must be deleted&#8217; and/or Age to 999. Also you may use not all but only some columns values for the detection (but you must assign correct values to NOT NULL columns, so this condition can be removed for the CASE of the column only when the column is nullable).After all lines loaded we delete excess line inserted while the header line is loaded.

Advertisement

Answer