Conditional aggregate and distinct

Using StringAgg with the distinct=True argument works under normal circumstances, e.g.:

entities = entities.annotate(roles=StringAgg(
    "credits__role__name",
    delimiter=", ",
    distinct=True,
    ordering="credits__role__name"
))

But when used with a conditional expression, it throws the exception django.db.utils.ProgrammingError: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list, e.g.:

releases = releases.annotate(credits_primary=StringAgg(
    Case(When(credits__type="primary", then="credits__entity__name")),
    delimiter=", ",
    distinct=True,
    ordering="credits__entity__name"
))

Why is this, and is there a way to make the second example work?

The error occurs because PostgreSQL requires that ORDER BY columns appear in DISTINCT.

ensure that both the ORDER BY expression and the DISTINCT argument are aligned. You can use Filter to pre-filter before aggregation, like this:

from django.db.models import StringAgg, Case, When, Value
from django.db.models.functions import Cast

releases = releases.annotate(
    credits_primary=StringAgg(
        Cast(Case(When(credits__type="primary", then="credits__entity__name"), default=Value("")), output_field=models.TextField()),
        delimiter=", ",
        distinct=True,
        ordering="credits__entity__name"
    )
)

Alternative: Use Filter Instead of Case If your Case expression is acting as a filter, another approach is to use Filter directly in the aggregation:

from django.db.models import Q, F

releases = releases.annotate(
    credits_primary=StringAgg(
        F("credits__entity__name"),
        delimiter=", ",
        distinct=True,
        ordering="credits__entity__name"
    ).filter(credits__type="primary")
)

Tell me if it helped.

Try the following query, it should give the expected results:

from django.db import models  
from django.contrib.postgres.aggregates import StringAgg  
  
releases.annotate(  
    credits_primary=StringAgg(  
        'credits__entity__name',  
        #  or `models.Q(credits__type='secondary')` in elif condition  
        filter=models.Q(credits__type='primary'),  
        delimiter=', ',  
        distinct=True,  
        ordering='credits__entity__name',  
    ),  
)

You should use filter inside StringAgg, I didn't find such an example in the documentation for StringAgg, but you can see it here by looking at the signature. On this page you can see the use of filter in the Count aggregation function.

...
from django.db.models import Q
above_5 = Count("book", filter=Q(book__rating__gt=5))
...

Edit: see this answer for a more graceful and readable solution.


For reasons I frankly don't understand at all, it appears to sort by credits__entity__name even when ordering="credits__entity__name" is omitted. Perhaps by default it orders alphabetically, or by the field whose value is passed in the conditional expression. The Django docs don't specify the default behavior. (note that the Credit model's default ordering is not by entity)

In other words, this produces the desired result:

releases = releases.annotate(credits_primary=StringAgg(
    Case(When(credits__type="primary", then="credits__entity__name")),
    delimiter=", ",
    distinct=True,
))

Hopefully someone with a deeper understanding of Django and/or PostgreSQL can produce an answer that explains why this is the case.

Вернуться на верх