Django `bulk_update()` : Update Different Fields for Each Record in a Single Query
I have two model instances, rec1 and rec2, but each has different fields updated:
rec1 = MyModel(id=1, name="John") # Only 'name' is changed
rec2 = MyModel(id=2, age=30) # Only 'age' is changed
Now, I want to update both records efficiently using bulk_update(). However, bulk_update() requires a fixed list of fields, meaning:
updated_fields = ['name', 'age']
model.objects.bulk_update([rec1, rec2], fields=updated_fields)
This updates both fields in both records, even though:
- rec1 only needs to update name
- rec2 only needs to update age
Goal: Update only the changed fields for each record in a single query. Avoid unnecessary updates to unchanged fields.
Issue: bulk_update() applies the same field list to all records, leading to redundant updates.
Question: Is there a way to update only the modified fields per record efficiently in a single query? Or is there an alternative approach to handle this in Django ORM?
model.objects.bulk_update(instances_to_update, fields=updated_fields)
where updated_fields is a list.
i was expecting updated_fields as a iteration type function.
There is no way to do that with a single SQL statement - and therefore the ORM also requires separate calls. Updating the same fields leverage on the execute_many functionality which is implemented by all Python DB connectors - and that call has one SQL statement "body" which is the same that is executed with changing parameters.
The SQL UPDATE statement for changing different fields has to change, so there is no bulk-update possible.
Just use a regular Python for loop, and call one update for each of your records. If you do that inside the same transaction, the difference towards calling bulk_update should be negligible.
You can do this, in fact the Django ORM will, for .bulk_update(..) essentially make a query that looks like:
from django.db.models import Case, Value, When
model.objects.filter(pk__in=[rec1.pk, rec2.pk]).update(
name=Case(
When(pk=rec1.pk, then=Value(rec1.name)),
When(pk=rec2.pk, then=Value(rec2.name)),
default=F('name'),
),
age=Case(
When(pk=rec1.pk, then=Value(rec1.age)),
When(pk=rec2.pk, then=Value(rec2.age)),
default=F('age'),
),
)
This is not very efficient: most databases will fall back to a "linear" search through the CASE-WHENs, and thus updating the records might be very slow.
so we can limit the cases to only the ones we want to update with:
model.objects.filter(pk__in=[rec1.pk, rec2.pk]).update(
name=Case(When(pk=rec1.pk, then=Value(rec1.name)), default=F('name')),
age=Case(When(pk=rec2.pk, then=Value(rec2.age)), default=F('age')),
)
TLDR;
We want to update different fields for different records in a single bulk_update() operation. This is indeed a limitation with Django's current implementation.
The key issue is that Django's bulk_update() method requires you to specify a fixed list of fields that will be applied to ALL objects in your update batch. There's no built-in way to say "only update the 'name' field for record 1, but only update the 'age' field for record 2" in a single operation.
When we use something like
rec1 = MyModel(id=1, name="John") # Only 'name' is changed
rec2 = MyModel(id=2, age=30) # Only 'age' is changed
model.objects.bulk_update([rec1, rec2], fields=['name', 'age'])
Django will generate SQL that updates both fields for both records, which isn't what you want.
It looks something like
UPDATE my_table SET
name = CASE WHEN id=1 THEN 'John' WHEN id=2 THEN name END,
age = CASE WHEN id=1 THEN age WHEN id=2 THEN 30 END
WHERE id IN (1,2)
It's updating both fields for both records, even though you only wanted to update one field per record. This is the inherent limitation.
As of now, there are several approaches to handle this scenario:
Group objects by the fields that need updating and perform separate bulk_update() calls:
name_updates = [obj for obj in all_objects if obj.name_changed]
age_updates = [obj for obj in all_objects if obj.age_changed]
MyModel.objects.bulk_update(name_updates, ['name'])
MyModel.objects.bulk_update(age_updates, ['age'])
Unfortunately, Django just doesn't have built-in support for this use case. According to multiple discussions, including those on Reddit, "Django and most ORMs don't bother with this. They either use all fields in the query or just the ones you specify"
There have been one discussion Reduce the "Case-When" sequence for a bulk_update when the values for a certain field are the same. an year ago about the same. You are free to use this approach with a custom manager/queryset.
I think the most efficient way to update the objects is to use bulk_update twice. Since the sets in your example are disjoint, each object will be updated only once.
Running a single bulk_update would require an if condition to handle different cases, which introduces additional processing overhead
.