Celery memory leak in Django — worker memory keeps increasing and not released after tasks complete

I’m using Django + Celery for data crawling tasks, but the memory usage of the Celery worker keeps increasing over time and never goes down after each task is completed.

I’m using:

    celery==5.5.3
    Django==5.2.6

Here’s my Celery configuration:

    from kombu import Queue

    # `app` (the Celery instance) and `env` (the settings reader) are defined
    # earlier in this module.

    # ---------- Broker/Backend ----------
    app.conf.broker_url = "sqs://"
    app.conf.result_backend = "rpc://"
    app.conf.task_ignore_result = True

    # ---------- Queue (FIFO) ----------
    QUEUE_NAME = env("AWS_SQS_CELERY_NAME")
    app.conf.task_default_queue = QUEUE_NAME
    app.conf.task_queues = (Queue(QUEUE_NAME),)

    # ---------- SQS transport ----------
    app.conf.broker_transport_options = {
        "region": env.str("AWS_REGION"),
        "predefined_queues": {
            QUEUE_NAME: {
                "url": env.str("AWS_CELERY_SQS_URL"),
                "access_key_id": env.str("AWS_ACCESS_KEY_ID"),
                "secret_access_key": env.str("AWS_SECRET_ACCESS_KEY"),
            },
        },
        # long-poll settings
        "wait_time_seconds": int(env("SQS_WAIT_TIME_SECONDS", default=10)),
        "polling_interval": float(env("SQS_POLLING_INTERVAL", default=0)),
        "visibility_timeout": int(env("SQS_VISIBILITY_TIMEOUT", default=900)),
        "create_missing_queues": False,  # do not create queues automatically
    }

    # ---------- Worker behavior ----------
    app.conf.worker_prefetch_multiplier = 1  # prefetch one message at a time
    app.conf.task_acks_late = True           # ack only after task completion
    app.conf.task_time_limit = int(env("CELERY_HARD_TIME_LIMIT", default=900))
    app.conf.task_soft_time_limit = int(env("CELERY_SOFT_TIME_LIMIT", default=600))
    app.conf.worker_send_task_events = False
    app.conf.task_send_sent_event = False
    app.autodiscover_tasks()


Problem: After each crawling task completes, the worker memory does not drop back — it only increases gradually. Restarting the Celery worker releases memory, so I believe it’s a leak or a cleanup issue.

What I’ve tried:

  • Set task_ignore_result=True
  • Added the --max-tasks-per-child=200 option to the worker command

For one, --max-tasks-per-child=200 is very high for crawling tasks that touch big responses, DOMs, or ORM rows. A range of 10–30 tasks per child is a saner starting point, and that alone should fix 90% of these "leaks", since recycling a child returns its entire heap to the OS.
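
For example (a sketch only; "myproject" is a placeholder for your Django project, and the exact number is something to tune against your crawl sizes):

    # Recycle each pool child after ~20 tasks so its whole heap goes back to the OS.
    # The equivalent config setting is app.conf.worker_max_tasks_per_child = 20.
    celery -A myproject worker --loglevel=info --max-tasks-per-child=20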

Then, if you haven't already, you can help the worker release memory by adding these settings to its environment:

    MALLOC_ARENA_MAX=2
    MALLOC_TRIM_THRESHOLD_=131072

These glibc allocator tunables cap the number of malloc arenas (less fragmentation) and pin the trim threshold so freed memory is handed back to the OS more readily, which is something a long-running Python process doesn't do well on its own. For the most part, that should take care of the memory growth.
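
How you set them depends on how the worker is launched; the point is that they must be in the process environment before Celery starts. A minimal sketch for a shell entrypoint ("myproject" again being a placeholder):

    # glibc reads these at process startup: cap the number of malloc arenas and pin
    # the trim threshold so freed memory is returned to the OS instead of retained.
    export MALLOC_ARENA_MAX=2
    export MALLOC_TRIM_THRESHOLD_=131072
    exec celery -A myproject worker --loglevel=info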

I've run into this issue before on my system. You can also set --max-memory-per-child=[...]to ensure the child process memory is contained. Basically it kills the child process once the memory limit is hit, frees the memory, spawns a new one, and it stays running as the master worker is still alive.

We use this command to start our Celery container in Docker Compose, and since making this change we haven't seen the steady memory growth. Hosted on Ubuntu.

    celery -A YOURSYSTEM worker --loglevel=info --max-memory-per-child=200000

This caps each child process at roughly 200 MB of resident memory (the value is in kilobytes). Here are the docs for the option: https://docs.celeryq.dev/en/stable/reference/cli.html

(Yes, docs found using AI, it's faster)
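
Putting the suggestions above together, a worker start line might look like this (a sketch only; YOURSYSTEM is the placeholder app name from the command above, and both limits are starting points to tune rather than recommendations for every workload):

    # Recycle children by task count and by resident memory (the value is in kilobytes,
    # so 200000 is roughly 200 MB), with the allocator tunables in the same environment.
    MALLOC_ARENA_MAX=2 \
    MALLOC_TRIM_THRESHOLD_=131072 \
    celery -A YOURSYSTEM worker \
        --loglevel=info \
        --max-tasks-per-child=20 \
        --max-memory-per-child=200000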
