How to order queryset based on best match in django-rest-framework?

I am trying to order the results of a filtered query by the number of matches each result has.

For example, let’s say we have a Model:

class Template(models.Model):
    headline = CharField(max_length=300)
    text = TextField()
    image_text = TextField(max_length=500, blank=True, null=True)
    tags = TaggableManager(through=TaggedItem)
    ...

With a Serializer:

class TemplateSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Template
        fields = (...)

And a ViewSet:

from django.db.models import Q
from rest_framework import viewsets


class TemplateViewSet(viewsets.ModelViewSet):
    """
    API endpoint that allows Templates to be viewed or edited.
    """
    queryset = Template.objects.all()
    serializer_class = TemplateSerializer

    def get_queryset(self):
        queryset = Template.objects.all()

        # getlist() returns an empty list when the parameter is absent
        tags = self.request.query_params.getlist('tags')
        search_text = self.request.query_params.getlist('search_text')

        if tags:
            # OR together one case-insensitive name match per tag
            queries = [Q(groost_tags__name__iexact=tag) for tag in tags]
            query = queries.pop()
            for item in queries:
                query |= item

            queryset = queryset.filter(query).distinct()

        if search_text:
            # OR together a match in any of the three text fields per string
            queries = [Q(image_text__icontains=string) |
                       Q(text__icontains=string) |
                       Q(headline__icontains=string) for string in search_text]
            query = queries.pop()
            for item in queries:
                query |= item

            queryset = queryset.filter(query).distinct()

        return queryset

What I need to do is count every match the filter finds and then order the queryset by that number of matches for each template. For example:

I want to find all the templates that have the strings "hello" and "world" in their text, image_text or headline. So I set the query parameter "search_text" to hello,world. A template with headline="World" and text="Hello, everyone." would have 2 matches. Another one with headline="Hello" would have 1 match. The template with 2 matches would come first in the queryset. The same behaviour should work for tags, and for tags combined with search_text.
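To make the intended scoring concrete, here is a plain-Python sketch of the example above. It runs without Django; `FakeTemplate` and `count_matches` are hypothetical names used only for illustration, and it assumes that one match means one search term found (case-insensitively) in one field.

```python
from dataclasses import dataclass


@dataclass
class FakeTemplate:
    # Stand-in for the Template model, illustration only
    headline: str = ""
    text: str = ""
    image_text: str = ""


def count_matches(template, terms):
    """Count (term, field) pairs where the term occurs in the field."""
    fields = (template.headline, template.text, template.image_text)
    return sum(
        1
        for term in terms
        for value in fields
        if term.lower() in value.lower()
    )


terms = ["hello", "world"]
a = FakeTemplate(headline="World", text="Hello, everyone.")
b = FakeTemplate(headline="Hello")

# Highest number of matches first
ranked = sorted([b, a], key=lambda t: count_matches(t, terms), reverse=True)
```

Here `a` scores 2 and `b` scores 1, so `a` is first in `ranked`.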

I tried to calculate these numbers right in the ViewSet and then return sorted(queryset, key=attrgetter('matches')), but ran into several issues with DRF, such as "Template has no attribute 'matches'", or a 404 when directly accessing a Template instance through the API.

Any ideas?

Answer

Try an annotation in which each comparison of a field against the search string yields 1 or 0, and the values are summed into a rank:

from django.db.models import Case, F, FloatField, Value, When

# `string` is a single search term, e.g. one value of the search_text parameter
Template.objects.annotate(
    k1=Case(
        When(image_text__icontains=string, then=Value(1.0)),
        default=Value(0.0),
        output_field=FloatField(),
    ),
    k2=Case(
        When(text__icontains=string, then=Value(1.0)),
        default=Value(0.0),
        output_field=FloatField(),
    ),
    k3=Case(
        When(headline__icontains=string, then=Value(1.0)),
        default=Value(0.0),
        output_field=FloatField(),
    ),
    rank=F("k1") + F("k2") + F("k3"),
).order_by("-rank")


Source: stackoverflow