Django with MySQL: 'Subquery returns more than 1 row'

Using django with a MySQL DB and given these models:

ModelB   ---FK--->   ModelA
 - c_id
 
ModelC

I want to get all the ModelC for each ModelA via an annotation.

I tried many options looking at existing solutions but could not make it work. The following code works when there is just one ModelC for each ModelA but as soon as there is more than one, I get the Subquery returns more than 1 row error and I don't know how to get a list of the ModelC models instead. I tried to build a list of JSON objects of the ModelC without success.

qs = ModelA.objects.all()

c_ids = (
    ModelB.objects \
        .filter(modela_id=OuterRef(OuterRef('id'))) \
        .values('c_id')
)
all_c = (
    ModelC.objects \
        .filter(id__in=Subquery(c_ids)) \
        .values('id')
)

qs1 = qs.annotate(all_c=Subquery(all_c ))
for p in qs1:
    print(p, p.all_c)

ModelB looks like a junction table. Having an id pointing to A and C

Django supports junction tables.

But when it comes to annotation the objects with the list of ids, I'm not entire sure if that is possible purely by the ORM.

class ModelA(models.Model):
    model_c_objects = models.ManyToManyField("ModelC", through="ModelB") 

class ModelB(models.Model):
    model_a = models.ForeignKey(ModelA, on_delete=models.CASCADE)
    model_b = models.ForeignKey(ModelB, on_delete=models.CASCADE)

class ModelC(models.Model):
    ...


# This one here I have no idea if it would work or not
ModalA.objects.prefetch_related("models_c_objects").annotate(model_c_object_ids=ArrayAgg("model_c_objects__id")

# If it doesn't:
class ModelA(models.Model):
    model_c_objects = models.ManyToManyField("ModelC", through="ModelB") 
    
    @property
    def model_c_object_ids(self):
        return list(self.model_c_objects.values("id", flat=True))

# And you can then use it like you wished
for model_a_object in ModelA.objects.prefetch_related("models_c_objects"):
    model_a_object.model_c_object_ids # list of model_c ids like: [1,4,12,63]

I'm feeling a bit lazy but either of the two solutions should work and they both use a single query.

I come from the assumption that Model B is indeed a through table for M2M relationship between Model A and Model C as Işık Kaplan suggested.

In Postgres you could use ArrayAgg like Işık Kaplan suggested. Equivalent in MySQL in GROUP_CONCAT but it is not present in the ORM out of the box. Also from personal experience I wouldn't recommend it as it performed terribly in my use case.

What I ended up doing was combining 2 queries using Python which was way faster then 1 complicated query with GROUP_CONCAT (around 60K records of "Model A" and 20K of "Model B" in my case). In your case it would look like this:

a_qs = ModelA.objects.all()
c_ids_dict = defaultdict(list)
c_ids = a_qs.values("id", "models_c_objects__id")
for item in c_ids:
    if item["models_c_objects__id"]:
        c_ids_dict[item["id"]].append(item["models_c_objects__id"])
for p in a_qs:
    print(p, c_ids_dict.get(p.id, []))

The following should do

from django.db.models.aggregates import Aggregate

class JSONArrayAgg(Aggregate):
   function = "JSON_ARRAYAGG"
   
ModelA.objects.annotate(
    all_c=Subquery(
        ModelB.objects.filter(
            ref_type="c",
            modela_id=OuterRef("id"),
        ).values(
            "modela_id"
        ).values_list(
            JSONArrayAgg("ref_id")
        )
    )
)

which translates to

SELECT
    model_a.*,
    (SELECT JSON_ARRAYAGG(model_b.ref_id)
     FROM model_b
     WHERE model_b.ref_type = "c" AND model_b.modela_id = model_a.id
     GROUP BY model_b.modela_id
    ) all_c
FROM model_a

But it would be much easier if you provided your exact model definition as it's likely only a matter of doing something along the lines of

ModelA.objects.annotate(
   all_c=JSONArrayAgg(
      "modelb_set__ref_id", filter=Q(modelb_set__ref_type="c")
   )
)

which translate to

SELECT
    model_a.*,
    JSON_ARRAYAGG(
       CASE WHEN model_b.ref_type = "c" THEN model_b.ref_id END
    )
FROM model_a
LEFT JOIN model_b ON (model_b.modela_id = model_a.id)
GROUP BY model_a.id
Back to Top