Django ORM gives duplicates in filtered queryset
I have a django app. I use the ORM to run some queries. It appears I have some duplicates in my result.
While I can simply add a distinct()
I would like to understand what is going on.
Here are my models:
class Person(models.Model):
created = models.DateTimeField(auto_now_add=True)
active_stuffs = models.ManyToManyField(Stuff, related_name="persons")
waiting_stuffs = models.ManyToManyField(Stuff, related_name="persons_waiting")
cancelled_stuffs = models.ManyToManyField(Stuff, related_name="persons_cancelled")
# ... other fields
class Stuff(models.Model):
name = models.CharField(null=False, blank=False, max_length=150,)
# ... other fields
Here is the query:
queryset = Person.objects.filter(
Q(active_stuffs__id=some_id)
| Q(cancelled_stuffs__id=some_id)
| Q(waiting_stuffs__id=some_id)
)
What I don't understand, is the following results:
queryset.count()
-> 23Person.objects.filter(Q(active_stuffs__id=some_id)).count()
-> 16Person.objects.filter(Q(cancelled_stuffs__id=some_id)).count()
-> 0Person.objects.filter(Q(waiting_stuffs__id=some_id)).count()
-> 6
An instance of Stuff
can only be in either active_stuffs
, cancelled_stuffs
or waiting_stuffs
. I checked the Person
instance that is duplicated, and the Stuff
instance I'm looking for is only in the waiting_stuffs
field... So, where could this duplicate come from?
The query makes LEFT OUTER JOIN
s, so:
SELECT person.*
FROM person
LEFT OUTER JOIN person_active_stuffs ON person_active_stuffs.person_id = person.id
LEFT OUTER JOIN person_waiting_stuffs ON person_waiting_stuffs.person_id = person.id
LEFT OUTER JOIN person_cancelled_stuffs ON person_cancelled_stuffs.person_id = person.id
WHERE person_active_stuffs.stuff_id IN some_ids
OR person_waiting_stuffs.stuff_id IN some_ids
OR person_cancelled_stuffs.stuff_id IN some_ids
so it makes three LEFT OUTER JOIN
s, and these can act as "multipliers" of each other.
If the same id occurs in active_stuffs
and waiting_stuffs
, it can be counted twice.