My situation is: having .csv file with some columns e.g. name, surname, age, activity, and others
I want to create objects in my relational
database with row
from this file being single object. Column names correspond to model fields and with others
being multiple columns defined by user who creates .csv (those others
land in model’s filed comment
separated with commas). Having relational db, base on model_1's
object I create model_2
and model_3
, all with the info from this .csv file.
My project is based on django rest framework
and react
.
Up till now my solution for this was to analize .csv file on with react using FileReader
and readAsArrayBuffer
which did it’s job most of the time.
With react
I not only have to analyze the file, but for every row make at least 3 x POSTing with axios. Sequentional posting is not ideal all the time.
My question is: > should files be analyzed on front- or back-end side ?
Heaving it done on the backend side with python
seems a lot easier, but there might be a better solution of which I can’t think of.
EDIT: The database I use is
postgresql
.
Csv file size could go up to a 6000 rows. With this many rows I could just completely ram up on posts from react. Also making a single post with file to handle the work to django could also result in a overtime for response from server.
EDIT2:
Example .csv file, might be a 6k rows long or more.
name,surname,activity,comment,Add_1,Add_2 franko1,romano2,asd1,smth1,1,2 franko2,romano3,asd2,smth2,2,3 franko3,romano4,asd3,smth3,3,4 franko4,romano5,asd4,smth4,4,5 franko5,romano6,asd5,smth5,5,6 franko6,romano7,asd6,smth6,6,7 franko7,romano8,asd7,smth7,7,8 franko8,romano9,asd8,smth8,8,9 franko9,romano10,asd9,smth9,9,10
Example models:
class Users(models.Model): name = models.CharField() surname = models.CharField() class Clients(models.Model): user = models.ForeignKey(Users) activity = models.CharField() class Returns(models.Model): client = models.ForeignKey(Clients) comment = models.CharField() <-- Add_1, Add_2 appended here
These models and .csv file are made up, as I am not allowed to provide those right ones, but dependencies between them are similar.
Users to Clients is OneToOne, Returns to Clients is ManyToOne relation. My .csv contain a info to make all of those models sequentially: Users, Clients and Returns. Unfortunetally I’m not able to cut it down even to two models.
I have found a great node package for .csv files loadup – papaparse, opensource, great documentation and unbelivable functionality!
With this package I end up with already .json
format for my upload.
Advertisement
Answer
I would use pandas, as it’s easiest to deal with CSV files quickly
import pandas as pd from my_app.models import Users,Clients,Returns def convert_csv_to_models(csv_abs_path): df = pd.read_csv(csv_abs_path) df['users'] = df.apply(lambda row: Users(name=row['name'], surname=row['surname']), axis=1) # axis = 1 means the apply will run per row and per columns df['clients'] = df.apply(lambda row: Clients(user=row['users'], activity=row['activity']), axis=1) df['returns'] = df.apply( lambda row: Returns(client=row['clients'], comment=f'{row.comment} {row.Add_1} {row.Add_2}'), axis=1) cols_to_models = {'users': Users, 'clients': Clients, 'returns': Returns} for model, col in cols_to_models.items(): # convert the column of objs to a list and then bulk create it objs = df[col].tolist() # ignore conflicts will ignore skip an object if it already exist model.objects.bulk_create(objs, ignore_conflicts=True)
I didn’t quite understand what You wanted to achieve with the Returns and comments column, so I joined the comment, Add_1, and Add_2 columns to one string as the comment, you can modify as you need.
Also, it’s not common practice to call Django Models in plural, you should do:
class User(models.Model): name = models.CharField() surname = models.CharField() class Client(models.Model): user = models.ForeignKey(Users) activity = models.CharField() class Return(models.Model): client = models.ForeignKey(Clients) comment = models.CharField()