Celery crashes when PgBouncer closes idle connections (idle timeouts enabled)

I’m encountering an issue when running Celery with PgBouncer and PostgreSQL after enabling idle connection timeouts.

My stack includes:

  • Django (served via Tornado)

  • Celery (workers + beat), using the Django database transport (djkombu) as the broker, as the traceback below shows

  • PostgreSQL

  • PgBouncer (in front of PostgreSQL)

Due to a large number of idle database connections caused by Tornado + Django, I introduced idle timeout settings to protect PostgreSQL from running out of connections.

PgBouncer

idle_transaction_timeout=240 (seconds, i.e. 4 min)
client_idle_timeout=240 (seconds, i.e. 4 min)

PostgreSQL

idle_in_transaction_session_timeout=300000 (milliseconds, i.e. 5 min)
idle_session_timeout=300000 (milliseconds, i.e. 5 min)
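
PgBouncer takes these timeouts in seconds while PostgreSQL takes them in milliseconds, so it helps to normalize them before comparing which limit is reached first. The snippet below is only a rough sketch using the values listed above; the dictionary layout and the `first_to_fire` helper are names I made up for illustration.

```python
# Timeout values copied from the settings above.
# PgBouncer timeouts are in seconds; PostgreSQL timeouts are in milliseconds.
PGBOUNCER_SECONDS = {
    "idle_transaction_timeout": 240,
    "client_idle_timeout": 240,
}
POSTGRES_MILLISECONDS = {
    "idle_in_transaction_session_timeout": 300000,
    "idle_session_timeout": 300000,
}


def normalized_timeouts():
    """Return every timeout as a (layer, name, seconds) tuple."""
    rows = [("pgbouncer", name, float(value))
            for name, value in PGBOUNCER_SECONDS.items()]
    rows += [("postgresql", name, value / 1000.0)
             for name, value in POSTGRES_MILLISECONDS.items()]
    return rows


def first_to_fire(names):
    """Hypothetical helper: among the named timeouts, return the shortest."""
    candidates = [row for row in normalized_timeouts() if row[1] in names]
    return min(candidates, key=lambda row: row[2])


if __name__ == "__main__":
    # For a client connection that sits idle outside a transaction, compare
    # the PgBouncer client-side limit with the PostgreSQL session limit.
    layer, name, seconds = first_to_fire(
        {"client_idle_timeout", "idle_session_timeout"})
    print("%s %s fires first after %.0f s" % (layer, name, seconds))
```

With the values above, the PgBouncer limit is the shorter of the two, which matches `client_idle_timeout` being the reason reported in the error.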

Problem:

After applying these settings, Celery occasionally crashes with the following error:

[2025-12-16 06:12:01,578: ERROR/MainProcess] Unrecoverable error: DatabaseError('client_idle_timeout\nserver closed the connection unexpectedly\n\tThis probably means the server terminated abnormally\n\tbefore or while processing the request.\n',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 351, in start
    component.start()
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 393, in start
    self.consume_messages()
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 885, in consume_messages
    self.connection.drain_events(timeout=10.0)
  File "/usr/local/lib/python2.7/site-packages/kombu/connection.py", line 276, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 760, in drain_events
    item, channel = get(timeout=timeout)
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/scheduling.py", line 39, in get
    return self.fun(resource, **kwargs), resource
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 780, in _drain_channel
    return channel.drain_events(timeout=timeout)
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 578, in drain_events
    return self._poll(self.cycle, timeout=timeout)
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 287, in _poll
    return cycle.get()
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/scheduling.py", line 39, in get
    return self.fun(resource, **kwargs), resource
  File "/usr/local/lib/python2.7/site-packages/djkombu/transport.py", line 31, in _get
    m = Queue.objects.fetch(queue)
  File "/usr/local/lib/python2.7/site-packages/djkombu/managers.py", line 18, in fetch
    queue = self.get(name=queue_name)
  File "/usr/local/lib/python2.7/site-packages/django/db/models/manager.py", line 132, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 344, in get
    num = len(clone)
  File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 82, in __len__
    self._result_cache = list(self.iterator())
  File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 273, in iterator
    for row in compiler.results_iter():
  File "/usr/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 680, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 735, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 44, in execute
    return self.cursor.execute(query, args)
DatabaseError: client_idle_timeout
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

[2025-12-16 06:12:02,291: INFO/MainProcess] Celerybeat: Shutting down...
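
The traceback shows the error surfacing from a polling query (`Queue.objects.fetch`) issued on a connection that had already been closed with `client_idle_timeout`. One mitigation I am considering is to drop the dead connection and retry the query once. The following is a minimal, framework-free sketch of that idea under stated assumptions, not what Celery or djkombu actually do: `StaleConnectionError`, `FakeConnection`, and `run_with_reconnect` are hypothetical names. In a Django process the rough equivalent would be catching `DatabaseError` and calling `close()` on `django.db.connection` so that the next query opens a fresh connection.

```python
class StaleConnectionError(Exception):
    """Stand-in for the DatabaseError raised on a closed connection."""


class FakeConnection(object):
    """Hypothetical connection that a pooler may close after it goes idle."""

    def __init__(self):
        self.closed_by_pooler = False
        self.reconnects = 0

    def execute(self, sql):
        # A closed connection fails on the next statement, not at close time.
        if self.closed_by_pooler:
            raise StaleConnectionError("client_idle_timeout")
        return "ok: " + sql

    def reconnect(self):
        self.closed_by_pooler = False
        self.reconnects += 1


def run_with_reconnect(conn, sql, retries=1):
    """Run sql; on a stale connection, reconnect and retry up to `retries` times."""
    attempt = 0
    while True:
        try:
            return conn.execute(sql)
        except StaleConnectionError:
            if attempt >= retries:
                raise  # give up and let the caller see the original error
            attempt += 1
            conn.reconnect()


if __name__ == "__main__":
    conn = FakeConnection()
    conn.closed_by_pooler = True  # simulate PgBouncer's client_idle_timeout
    print(run_with_reconnect(conn, "SELECT 1"))
    print(conn.reconnects)
```

A single retry is only safe here because the failing statement is a read issued outside a transaction; retrying writes blindly could repeat side effects.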

Questions:

  • Is this a known issue when using Celery with PgBouncer idle timeouts?

  • Are these timeout values incompatible with long-running Celery workers?

  • What is the recommended way to configure PgBouncer/PostgreSQL idle timeouts when Celery is involved?

Any guidance or best practices would be greatly appreciated. Thanks in advance!
