Django `squashmigrations` leaving lots of RemoveField

I am trying to use Django manage.py squashmigrations to reduce the number of migration files and migration operations, after years of accumulation.

Django is supposed to optimize the migrations, and it does so to some extent. Other cases that look like obvious optimizations are missed, though. In the example below, a simple AddField followed by a RemoveField on the same field is not eliminated. Only the AddField gets inlined into the CreateModel, and the RemoveField remains, instead of the field being left out of CreateModel entirely.

Ideally, there shouldn't be a single RemoveField left after a squash, if I'm not mistaken.

Migration 1: Add the model

class Migration(migrations.Migration):
    dependencies = [("data", "0104")]

    operations = [
        migrations.CreateModel(
            name="MyModel",
            fields=[... ]
        )
    ]

Migration 2: Add a field to the model

class Migration(migrations.Migration):
    dependencies = [("data", "0107")]

    operations = [
        migrations.AddField(
            model_name="mymodel",
            name="process_name",
            field=models.CharField(default=None, max_length=300, null=True),
        ),
    ]

Migration 3: Remove the same field from the model

class Migration(migrations.Migration):
    dependencies = [("data", "0121")]

    operations = [migrations.RemoveField(model_name="mymodel", name="process_name")]

Resulting squashed migration:

class Migration(migrations.Migration):
    replaces = [...]

    initial = True

    operations = [
        migrations.CreateModel(
            name="MyModel",
            fields=[
                (
                    "process_name",                  # Should be possible to leave out ???
                    models.CharField(default=None, max_length=300, null=True),
                ),
            ],
        ),
        migrations.RemoveField(model_name="mymodel", name="process_name"),           # ???
    ]

Is this expected or what could cause this?

I am not using any advanced migration features such as RunSQL or RunPython. There are no other tables or migrations that depend on the process_name field.

Is there any way to give hints to squashmigrations? Can I run the squash again to optimize further?

(Django version 5.1)

I guess it is due to how their optimizer is implemented.

Your migration has three operations

  • CreateModel
  • AddField
  • RemoveField

When the optimizer is run

    for i, operation in enumerate(operations):
        right = True  # Should we reduce on the right or on the left.
        # Compare it to each operation after it
        for j, other in enumerate(operations[i + 1 :]):
            result = operation.reduce(other, app_label)
            ....

The iteration starts with i=0, operation=CreateModel and looks for anything to .reduce

j=0, other=AddField is reducible so it got rolled into a single operation

        if isinstance(operation, AddField):
            return [
                CreateModel(
                    self.name,
                    fields=self.fields + [(operation.name, operation.field)],
                    options=self.options,
                    bases=self.bases,
                    managers=self.managers,
                ),
            ]

During the i=0 iteration, j=1, other=RemoveField is not reducible, because the field process_name did not exist in the original i=0, operation=CreateModel operation.

During the second iteration, i=1, operation=AddField does look optimizable with RemoveField, but I guess that optimized result was not the one chosen by the optimizer

            if isinstance(result, list):
                in_between = operations[i + 1 : i + j + 1]
                if right:
                    new_operations.extend(in_between)
                    new_operations.extend(result)
                elif all(op.reduce(other, app_label) is True for op in in_between):
                    # Perform a left reduction if all of the in-between
                    # operations can optimize through other.
                    new_operations.extend(result)
                    new_operations.extend(in_between)
                else:
                    # Otherwise keep trying.
                    new_operations.append(operation)
                    break
                new_operations.extend(operations[i + j + 2 :])
                return new_operations

I'm only taking a cursory look at the source code so am very likely interpreting things wrongly.
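
To make the scan-and-restart pattern in the excerpts above easier to follow, here is a toy model in plain Python. It is not Django's code: the classes, the `model`/`field` attributes, and the reduce rules (including the RemoveField rule on the toy CreateModel) are simplified assumptions for illustration, and the sketch skips the left-reduction branch. It only shows the shape of the loop: one pass returns as soon as a pair of operations merges, and the pass is repeated until nothing changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateModel:  # toy stand-in, not django.db.migrations.CreateModel
    model: str
    fields: tuple = ()

    def reduce(self, other):
        if other.model != self.model:
            return True  # unrelated model: safe to look past it
        if isinstance(other, AddField):
            return [CreateModel(self.model, self.fields + (other.field,))]
        if isinstance(other, RemoveField) and other.field in self.fields:
            kept = tuple(f for f in self.fields if f != other.field)
            return [CreateModel(self.model, kept)]
        return False  # same model, but no merge rule in this toy


@dataclass(frozen=True)
class AddField:  # toy stand-in
    model: str
    field: str

    def reduce(self, other):
        if other.model != self.model:
            return True
        if isinstance(other, RemoveField) and other.field == self.field:
            return []  # add followed by remove cancels out
        return False


@dataclass(frozen=True)
class RemoveField:  # toy stand-in
    model: str
    field: str

    def reduce(self, other):
        return other.model != self.model


def optimize_once(ops):
    """One pass: return as soon as any pair of operations merges."""
    for i, op in enumerate(ops):
        for j, other in enumerate(ops[i + 1:]):
            result = op.reduce(other)
            if isinstance(result, list):
                in_between = ops[i + 1 : i + j + 1]
                return ops[:i] + in_between + result + ops[i + j + 2 :]
            if result is not True:
                break  # blocked; this toy does not try a left reduction
    return ops


def optimize(ops):
    """Repeat single passes until the list stops changing."""
    while True:
        result = optimize_once(ops)
        if result == ops:
            return result
        ops = result


ops = [
    CreateModel("MyModel"),
    AddField("MyModel", "process_name"),
    RemoveField("MyModel", "process_name"),
]
print(optimize_once(ops))  # after one pass: CreateModel with the field, plus RemoveField
print(optimize(ops))       # after repeating passes until stable
```

In this toy, the first pass produces exactly the shape seen in the squashed migration from the question (the field inlined into CreateModel, RemoveField still present), and repeating the pass collapses it to a bare CreateModel. Whether the real optimizer gets that far for a given history depends on the operations sitting between these three, which the sketch does not model.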

If your migration history is simple and all machines (production, dev, etc.) are under your control, it might suffice to

  • first reverse your migrations to the one before the squash: migrate data 0103
  • delete the migrations you want to squash (0107 to 0121) and generate a new set by running makemigrations data again
  • migrate normally

or with a bit of manual work

  • squashmigrations data 0107 0121 first
  • transition the squashed migration into a normal migration (doc), meaning
    • remove the replaces attribute
    • update the dependencies attribute of other migration files so they no longer point at the original unsquashed migrations (0107 to 0121)
    • delete the replaced original migrations
  • squashmigrations data 0104 0107_squash_0121 again

Depending on your motivation, the whole manual wrangling of the migration files might not be worth it, as another comment pointed out.

There are other concerns in squashing migration that might not be obvious. I wrote a note here
