Conditional aggregate and distinct
Using StringAgg with the distinct=True
argument works under normal circumstances, e.g.:
entities = entities.annotate(roles=StringAgg(
"credits__role__name",
delimiter=", ",
distinct=True,
ordering="credits__role__name"
))
But when used with a conditional expression
, it throws the exception django.db.utils.ProgrammingError: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
, e.g.:
releases = releases.annotate(credits_primary=StringAgg(
Case(When(credits__type="primary", then="credits__entity__name")),
delimiter=", ",
distinct=True,
ordering="credits__entity__name"
))
Why is this, and is there a way to make the second example work?
The error occurs because PostgreSQL requires that ORDER BY
columns appear in DISTINCT
.
ensure that both the ORDER BY
expression and the DISTINCT
argument are aligned. You can use Filter
to pre-filter before aggregation, like this:
from django.db.models import StringAgg, Case, When, Value
from django.db.models.functions import Cast
releases = releases.annotate(
credits_primary=StringAgg(
Cast(Case(When(credits__type="primary", then="credits__entity__name"), default=Value("")), output_field=models.TextField()),
delimiter=", ",
distinct=True,
ordering="credits__entity__name"
)
)
Alternative: Use Filter
Instead of Case
If your Case
expression is acting as a filter
, another approach is to use Filter directly in the aggregation:
from django.db.models import Q, F
releases = releases.annotate(
credits_primary=StringAgg(
F("credits__entity__name"),
delimiter=", ",
distinct=True,
ordering="credits__entity__name"
).filter(credits__type="primary")
)
Tell me if it helped.
Try the following query, it should give the expected results:
from django.db import models
from django.contrib.postgres.aggregates import StringAgg
releases.annotate(
credits_primary=StringAgg(
'credits__entity__name',
# or `models.Q(credits__type='secondary')` in elif condition
filter=models.Q(credits__type='primary'),
delimiter=', ',
distinct=True,
ordering='credits__entity__name',
),
)
You should use filter
inside StringAgg
, I didn't find such an example in the documentation for StringAgg
, but you can see it here by looking at the signature. On this page you can see the use of filter
in the Count
aggregation function.
...
from django.db.models import Q
above_5 = Count("book", filter=Q(book__rating__gt=5))
...
Edit: see this answer for a more graceful and readable solution.
For reasons I frankly don't understand at all, it appears to sort by credits__entity__name
even when ordering="credits__entity__name"
is omitted. Perhaps by default it orders alphabetically, or by the field whose value is passed in the conditional expression
. The Django docs don't specify the default behavior.
(note that the Credit
model's default ordering is not by entity)
In other words, this produces the desired result:
releases = releases.annotate(credits_primary=StringAgg(
Case(When(credits__type="primary", then="credits__entity__name")),
delimiter=", ",
distinct=True,
))
Hopefully someone with a deeper understanding of Django and/or PostgreSQL can produce an answer that explains why this is the case.