Django `squashmigrations` leaving lots of RemoveField
I am trying to use Django's `manage.py squashmigrations` to reduce the number of migration files and migration operations after years of accumulation.
Django is supposed to optimize the migrations, and it does so to some extent. Other cases that should be obvious optimizations are missed, though. In the example below, a simple `AddField` + `RemoveField` pair is not eliminated: only the `AddField` gets inlined into the `CreateModel`, while the `RemoveField` still remains, instead of the field being left out of `CreateModel` completely.
Ideally, there shouldn't be a single `RemoveField` left after a squash, if I'm not mistaken.
Migration 1: Add the model
class Migration(migrations.Migration):
    dependencies = [("data", "0104")]
    operations = [
        migrations.CreateModel(
            name="MyModel",
            fields=[... ]
        )
    ]
Migration 2: Add a field to the model
class Migration(migrations.Migration):
    dependencies = [("data", "0107")]
    operations = [
        migrations.AddField(
            model_name="mymodel",
            name="process_name",
            field=models.CharField(default=None, max_length=300, null=True),
        ),
    ]
Migration 3: Remove the same field from the model
class Migration(migrations.Migration):
    dependencies = [("data", "0121")]
    operations = [migrations.RemoveField(model_name="mymodel", name="process_name")]
Resulting squashed migration:
class Migration(migrations.Migration):
    replaces = [...]
    initial = True
    operations = [
        migrations.CreateModel(
            name="MyModel",
            fields=[
                (
                    "process_name",  # Should be possible to leave out ???
                    models.CharField(default=None, max_length=300, null=True),
                ),
            ],
        ),
        migrations.RemoveField(model_name="mymodel", name="process_name"),  # ???
    ]
Is this expected, or what could cause this?
I am not using any advanced migration features such as `RunSQL` or `RunPython`. No other tables or migrations depend on the `process_name` field.
Is there any way to give hints to `squashmigrations`? Can I run the squash again to optimize further?
(Django version 5.1)
I guess it is due to how the optimizer is implemented.
Your migrations contain three operations:
- `CreateModel`
- `AddField`
- `RemoveField`
When the optimizer runs:
for i, operation in enumerate(operations):
    right = True  # Should we reduce on the right or on the left.
    # Compare it to each operation after it
    for j, other in enumerate(operations[i + 1 :]):
        result = operation.reduce(other, app_label)
        ....
The iteration starts with `i=0, operation=CreateModel` and looks for anything it can `.reduce`. `j=0, other=AddField` is reducible, so the two get rolled into a single operation:
if isinstance(operation, AddField):
    return [
        CreateModel(
            self.name,
            fields=self.fields + [(operation.name, operation.field)],
            options=self.options,
            bases=self.bases,
            managers=self.managers,
        ),
    ]
During the `i=0` iteration, `j=1, other=RemoveField` is not reducible, because the field `process_name` didn't exist in the original `i=0, operation=CreateModel` operation.
During the second iteration, `i=1, operation=AddField` does look optimizable with `RemoveField`, but I guess that optimized result was not the one the optimizer chose:
if isinstance(result, list):
    in_between = operations[i + 1 : i + j + 1]
    if right:
        new_operations.extend(in_between)
        new_operations.extend(result)
    elif all(op.reduce(other, app_label) is True for op in in_between):
        # Perform a left reduction if all of the in-between
        # operations can optimize through other.
        new_operations.extend(result)
        new_operations.extend(in_between)
    else:
        # Otherwise keep trying.
        new_operations.append(operation)
        break
    new_operations.extend(operations[i + j + 2 :])
    return new_operations
I've only taken a cursory look at the source code, so I am very likely interpreting things wrongly.
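
To experiment with this kind of scan outside Django, here is a toy model in plain Python. It is only a sketch: `CreateModel`, `AddField`, `RemoveField` and `Opaque` below are simplified stand-ins made up for illustration (they are not Django's classes), their `reduce` rules are assumptions, and `optimize_pass` leaves out the left-reduction branch. What it keeps is that a pass returns right after the first successful merge and, as far as I can tell from `MigrationOptimizer.optimize`, the pass is then repeated until the list stops changing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateModel:
    name: str
    fields: tuple = ()

    def reduce(self, other):
        # Assumed rule: field operations on the same model fold into CreateModel.
        if isinstance(other, AddField) and other.model == self.name:
            return [CreateModel(self.name, self.fields + (other.field,))]
        if isinstance(other, RemoveField) and other.model == self.name:
            kept = tuple(f for f in self.fields if f != other.field)
            return [CreateModel(self.name, kept)]
        return False  # anything else blocks the scan in this toy


@dataclass(frozen=True)
class AddField:
    model: str
    field: str

    def reduce(self, other):
        # Assumed rule: adding and then removing the same field cancels out.
        same = (other.model, other.field) == (self.model, self.field) if isinstance(other, RemoveField) else False
        return [] if same else False


@dataclass(frozen=True)
class RemoveField:
    model: str
    field: str

    def reduce(self, other):
        return False


@dataclass(frozen=True)
class Opaque:
    """Hypothetical operation that nothing can be reduced with or across."""

    def reduce(self, other):
        return False


def optimize_pass(operations):
    """One scan: merge the first reducible pair found, then stop."""
    for i, operation in enumerate(operations):
        for j, other in enumerate(operations[i + 1 :], start=i + 1):
            result = operation.reduce(other)
            if isinstance(result, list):
                # Replace both operations with the merged result and keep
                # everything else in its original order.
                return operations[:i] + operations[i + 1 : j] + result + operations[j + 1 :]
            if result is False:
                break  # `other` blocks; stop looking past it
    return operations


def optimize(operations):
    """Repeat the pass until the list no longer changes."""
    while True:
        result = optimize_pass(operations)
        if result == operations:
            return result
        operations = result


plain = [
    CreateModel("MyModel"),
    AddField("MyModel", "process_name"),
    RemoveField("MyModel", "process_name"),
]
blocked = plain[:2] + [Opaque()] + plain[2:]

print(optimize(plain))    # the three operations collapse into one CreateModel
print(optimize(blocked))  # the RemoveField survives behind the Opaque operation
```

In this toy, the plain `CreateModel`, `AddField`, `RemoveField` sequence collapses completely once the pass is repeated, while a single operation that nothing can be reduced across (the made-up `Opaque`) reproduces the shape from the question: the `AddField` is inlined and the `RemoveField` stays. So it may be worth checking what sits between the `AddField` and the `RemoveField` in the real history.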
If your migration history is simple and all machines (production, dev, etc.) are under your control, it might suffice to:
- first reverse your migrations to the one before the squashing: `migrate data 0103`
- delete the migrations you want to squash (0107 to 0121) and generate a new set by running `makemigrations data` again
- migrate normally
Or, with a bit of manual work:
- run `squashmigrations data 0107 0121` first
- transition the squashed migration into a normal migration (doc), meaning:
  - remove the `replaces` attribute
  - update the `dependencies` attribute in other migration files so they no longer reference the original unsquashed migrations (0107 to 0121)
  - delete the replaced original migrations
- run `squashmigrations data 0104 0107_squash_0121` again
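
For the "update the `dependencies`" step, a small script can help spot files that still point at the migrations you are about to delete. This is a rough sketch with hypothetical names (`find_stale_dependencies` is not part of Django), and it assumes dependencies are written as plain `("app", "name")` tuples. Feed it only the files you intend to keep, after the `replaces` attribute has been removed, since that list uses the same tuple syntax.

```python
import re


def find_stale_dependencies(sources, app_label, replaced):
    """Map each file name to the replaced migrations it still references.

    sources:   dict of file name -> file contents (files you intend to keep)
    app_label: app whose migrations were squashed, e.g. "data"
    replaced:  set of migration names that are going to be deleted
    """
    quote = "[\"']"
    name = "([^\"']+)"
    # Matches tuples such as ("data", "0107") with either quote style.
    pattern = re.compile(
        r"\(\s*" + quote + re.escape(app_label) + quote
        + r"\s*,\s*" + quote + name + quote + r"\s*\)"
    )
    stale = {}
    for filename, text in sources.items():
        hits = [found for found in pattern.findall(text) if found in replaced]
        if hits:
            stale[filename] = hits
    return stale


# Example with made-up file names and contents:
sources = {
    "other/0005.py": 'dependencies = [("data", "0110"), ("auth", "0001")]',
    "other/0006.py": 'dependencies = [("data", "0104")]',
}
print(find_stale_dependencies(sources, "data", {"0107", "0110", "0121"}))
```

Only the first file is reported here, because it is the only one that depends on a migration in the replaced set. In practice you would read the contents from disk, for example with `pathlib.Path.read_text()`.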
Depending on your motivation, the whole manual wrangling of the migration files might not be worth it, as another comment pointed out.
There are other concerns in squashing migrations that might not be obvious. I wrote a note here