Creating a single-column dataframe in Pandas from a Django queryset

Question

I'm trying to create a dataframe containing the values from field_1 and field_2 in a single column. I haven't used pandas much before, so I'm sure this is naive. If I'm working with a fairly large dataset, is there a way I can make this more efficient? In particular, I would like to eliminate the step that creates the CSV.

Accepted Answer

You really don't need:

- the order objects and getattr; use .values_list() to get an iterable of 2-tuples (assuming field_names are actual fields on the model).
- the CSV; now that you have an iterable of 2-tuples, pass them to the DataFrame constructor along with the respective column names:

    import pandas as pd

    field_names = ["description", "comments"]
    df = pd.DataFrame.from_records(
        # values_list() takes field names as separate positional arguments,
        # so the list has to be unpacked with *.
        Order.objects.all().values_list(*field_names),
        columns=field_names,
    )

- necessarily even Pandas; the database can concatenate the two fields for you:

    from django.db.models import F, Value
    from django.db.models.functions import Concat
    # ...
    my_data = list(
        Order.objects.annotate(
            x=Concat(
                F("description"),
                Value(", "),
                F("comments"),
            )
        ).values_list("x", flat=True)
    )

and you have a list of "description, comments" strings, just like the series you'd get with Pandas.
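If you do want the combined values as a single-column dataframe, as the question asks, the two columns can also be joined on the Pandas side. The following is only a minimal sketch: the rows list is made-up sample data standing in for what .values_list("description", "comments") would yield, and the column name x is borrowed from the annotation above.

```python
import pandas as pd

# Hypothetical stand-in for Order.objects.values_list("description", "comments");
# these rows are illustrative sample data, not taken from the question.
rows = [
    ("Widget", "fragile"),
    ("Gadget", "rush order"),
]
field_names = ["description", "comments"]

# Build the two-column frame directly from the 2-tuples, no CSV involved.
df = pd.DataFrame.from_records(rows, columns=field_names)

# Join the two string columns element-wise with ", " as the separator,
# then turn the resulting Series into a one-column DataFrame.
combined = df["description"].str.cat(df["comments"], sep=", ")
single_column_df = combined.to_frame(name="x")
```

If either field can be NULL, passing na_rep="" to str.cat keeps those rows as strings instead of missing values. For a large table, doing the concatenation in the database with Concat is likely the cheaper option, since only one column has to be transferred.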
