Celery memory leak in Django — worker memory keeps increasing and is not released after tasks complete

I’m using Django + Celery for data crawling tasks, but the memory usage of the Celery worker keeps increasing over time and never goes down after each task is completed.

I’m using:

    celery==5.5.3
    Django==5.2.6

Here’s my Celery configuration:

    # ---------- Broker/Backend ----------
    app.conf.broker_url = "sqs://"
    app.conf.result_backend = "rpc://"
    app.conf.task_ignore_result = True

    # ---------- Queue (FIFO) ----------
    QUEUE_NAME = env("AWS_SQS_CELERY_NAME")
    app.conf.task_default_queue = QUEUE_NAME
    app.conf.task_queues = (Queue(QUEUE_NAME),)

    # ---------- SQS transport ----------
    app.conf.broker_transport_options = {
        "region": env.str("AWS_REGION"),
        "predefined_queues": {
            QUEUE_NAME: {
                "url": env.str("AWS_CELERY_SQS_URL"),
                "access_key_id": env.str("AWS_ACCESS_KEY_ID"),
                "secret_access_key": env.str("AWS_SECRET_ACCESS_KEY"),
            },
        },
        # long-poll
        "wait_time_seconds": int(env("SQS_WAIT_TIME_SECONDS", default=10)),
        "polling_interval": float(env("SQS_POLLING_INTERVAL", default=0)),
        "visibility_timeout": int(env("SQS_VISIBILITY_TIMEOUT", default=900)),
        "create_missing_queues": False,  # do not create queue automatically
    }

    # ---------- Worker behavior ----------
    app.conf.worker_prefetch_multiplier = 1   # process one job at a time
    app.conf.task_acks_late = True            # ack after task completion
    app.conf.task_time_limit = int(env("CELERY_HARD_TIME_LIMIT", default=900))
    app.conf.task_soft_time_limit = int(env("CELERY_SOFT_TIME_LIMIT", default=600))
    app.conf.worker_send_task_events = False
    app.conf.task_send_sent_event = False
    app.autodiscover_tasks()


Problem: After each crawling task completes, the worker memory does not drop back — it only increases gradually. Restarting the Celery worker releases memory, so I believe it’s a leak or a cleanup issue.

What I’ve tried:

  • Set task_ignore_result=True
  • Added the option --max-tasks-per-child=200

For one, --max-tasks-per-child=200 is very high for crawling tasks that hold large responses, parsed DOMs, or ORM rows in memory. Somewhere in the range of 10–30 tasks per child is a saner starting point for this kind of workload, and recycling the child processes that often should, on its own, fix 90% of these apparent “leaks.”
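
If you would rather keep the limit next to the rest of your configuration instead of passing it on the command line, the equivalent setting is worker_max_tasks_per_child. A minimal sketch, with 20 as an illustrative value to tune for your tasks:

    # Recycle each prefork child process after it has run 20 tasks
    # (illustrative value; tune for your workload). Memory held by large
    # crawl payloads is returned to the OS when the child exits.
    # This mirrors the --max-tasks-per-child CLI flag.
    app.conf.worker_max_tasks_per_child = 20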

Then, if you haven't already, you can encourage the allocator to give memory back by adding these settings to the environment of your worker process:

    MALLOC_ARENA_MAX=2
    MALLOC_TRIM_THRESHOLD_=131072

MALLOC_ARENA_MAX=2 caps the number of glibc malloc arenas, which reduces heap fragmentation in multi-threaded processes, and MALLOC_TRIM_THRESHOLD_=131072 tells glibc to trim the heap and hand memory back to the OS once the free space at its top exceeds 128 KiB. A long-running CPython process rarely returns memory to the OS on its own, so this keeps the worker's resident size closer to what is actually in use.
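
How you apply them depends on how the worker is launched. As a sketch, assuming it is started from a shell script or container entrypoint (the project name and task limit below are placeholders):

    # glibc reads these variables at process startup, so export them
    # before the celery command runs.
    export MALLOC_ARENA_MAX=2
    export MALLOC_TRIM_THRESHOLD_=131072
    exec celery -A myproject worker --loglevel=INFO --max-tasks-per-child=20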

Between the lower tasks-per-child limit and the malloc tuning, the steady memory growth you are seeing should, for the most part, go away.
