Django ORM efficient way for annotating True on first occurrences of field

I have a situation where we have some logic that sorts a patient_journey queryset (table) by some heuristics. A patient_journey has a FK to a patient.

I now need an efficient way of setting True for the first occurrence of a patient_journey for a given patient, false otherwise.

The first heuristinc is patient_id so the queryset will already by grouped by patient.

The algorithm is v straight forward & should be fast, yet I'm stuck at trying to get something <1s.

I've tried using distinct and checking for existence, however that's adding 1-2s.

I've tried using a subquery with [:1] & test on id, but that's even worse around 3-5s extra.

def annotate_primary(
*, qs: 'PatientJourneyQuerySet'  # noqa
) -> 'PatientJourneyQuerySet':  # noqa
    """
    Constraints:
    --------------
    Exactly One non-global primary journey per patient

    Annotations:
    ------------
    is_primary_:Bool, is the primary_journey for a patient
    """
    from patient_journey.models import PatientJourney
    qs = get_sorted_for_primary_journey_qs(qs=qs)
    # cost until now is around 0.17s
    # using distinct on patient_id & checking if id is in there adds around 1.5-2s
    # looking for something faster, i.e. this shoiuld be a straight forward scan.
    qs = qs.annotate(
        # True for first occurrence of a `patient_id` false otherwise
        primary=
    )
    return qs

Assuming the models are like this

from django.db import models

class Patient(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
  
class PatientJourney(models.Model):
    patient = models.ForeignKey(Patient, on_delete=models.CASCADE, related_name='journeys')
    visit_date = models.DateTimeField()
    diagnosis = models.TextField(blank=True, null=True)
    treatment = models.TextField(blank=True, null=True)
    

Using Window funstions you can get only the first journeys

from django.db.models import Window, F
from django.db.models.functions import RowNumber
from .models import PatientJourney

# Annotate each journey with its row number partitioned by patient and ordered by visit_date in ascendin order
patient_journeys_with_row_number = PatientJourney.objects.annotate(
    row_number=Window(
        expression=RowNumber(),
        partition_by=[F('patient')],
        order_by=F('visit_date').asc()
    )
).filter(row_number=1)

for journey in patient_journeys_with_row_number:
    print(f"{journey.patient.first_name} {journey.patient.last_name} - First Journey on {journey.visit_date}")
Back to Top