Connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already
I'm using Celery with Django to run a task that spawns multiple threads, each querying the database. The example below is a simplified version of the original code in a larger project, but the logic and the result are the same: the error occurs either way.
models.py

```python
from django.db import models


class Example(models.Model):
    name = models.CharField(max_length=255)
```
urls.py

```python
from django.urls import path

from core.views import ExampleView

urlpatterns = [path('example/', ExampleView.as_view())]
```
views.py

```python
from core.tasks import run_task
from rest_framework.response import Response
from rest_framework.views import APIView


class ExampleView(APIView):
    def get(self, request, *args, **kwargs):
        run_task.delay()
        return Response(status=200)
```
tasks.py

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

from celery import shared_task

from core.models import Example


@shared_task
def run_task():
    with ThreadPoolExecutor(max_workers=100) as executor:
        futures = [
            executor.submit(Example.objects.get_or_create, name='example')
            for _ in range(200)
        ]
        for future in as_completed(futures):
            future.result()
```
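For scale, some back-of-envelope arithmetic (the worker count below is an illustrative assumption, not from the post): each Celery prefork child runs its own copy of the task, each task spawns up to 100 threads, and Django opens one DB connection per thread, so peak demand can dwarf Postgres's default `max_connections` of 100.

```python
# Back-of-envelope connection math; the worker count is an assumption.
celery_workers = 8           # e.g. prefork pool sized to the CPU count
threads_per_task = 100       # ThreadPoolExecutor(max_workers=100)
connections_per_thread = 1   # Django opens one DB connection per thread
peak_connections = celery_workers * threads_per_task * connections_per_thread
print(peak_connections)      # 800, versus Postgres's default max_connections = 100
```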
To reproduce, just run Celery:

```shell
celery -A example worker -l INFO
```

and Django, then run:

```shell
curl 'http://localhost:8000/api/example/'
```

You should see the error below:
```
[2026-04-12 15:08:49,151: ERROR/ForkPoolWorker-15] Task core.tasks.run_task[1b1682e1-3b9f-43e2-b02f-ba878449d2a5] raised unexpected: OperationalError('connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already\n')
Traceback (most recent call last):
  File "/Users/users/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 279, in ensure_connection
    self.connect()
    ~~~~~~~~~~~~^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 256, in connect
    self.connection = self.get_new_connection(conn_params)
                      ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/postgresql/base.py", line 333, in get_new_connection
    connection = self.Database.connect(**conn_params)
  File "/Users/user/.local/lib/python3.14t/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/user/.local/lib/python3.14t/site-packages/celery/app/trace.py", line 585, in trace_task
    R = retval = fun(*args, **kwargs)
        ~~~^^^^^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/celery/app/trace.py", line 858, in __protected_call__
    return self.run(*args, **kwargs)
           ~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/user/Desktop/postgres-issue/backend/core/tasks.py", line 15, in run_task
    future.result()
    ~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.14t/concurrent/futures/_base.py", line 443, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.14t/concurrent/futures/_base.py", line 395, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.14t/concurrent/futures/thread.py", line 86, in run
    result = ctx.run(self.task)
  File "/usr/local/lib/python3.14t/concurrent/futures/thread.py", line 73, in run
    return fn(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/query.py", line 987, in get_or_create
    return self.get(**kwargs), False
           ~~~~~~~~^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/query.py", line 635, in get
    num = len(clone)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/query.py", line 372, in __len__
    self._fetch_all()
    ~~~~~~~~~~~~~~~^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/query.py", line 2000, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/query.py", line 95, in __iter__
    results = compiler.execute_sql(
        chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size
    )
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/models/sql/compiler.py", line 1622, in execute_sql
    cursor = self.connection.cursor()
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 320, in cursor
    return self._cursor()
           ~~~~~~~~~~~~^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 296, in _cursor
    self.ensure_connection()
    ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 278, in ensure_connection
    with self.wrap_database_errors:
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/utils.py", line 94, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 279, in ensure_connection
    self.connect()
    ~~~~~~~~~~~~^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/base/base.py", line 256, in connect
    self.connection = self.get_new_connection(conn_params)
                      ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/Users/user/.local/lib/python3.14t/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/Users/user/.local/lib/python3.14t/site-packages/django/db/backends/postgresql/base.py", line 333, in get_new_connection
    connection = self.Database.connect(**conn_params)
  File "/Users/user/.local/lib/python3.14t/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already
```
I tried different Python versions (3.14t, 3.14), Postgres versions (17, 18), and Django 5.x and 6.x, and the results are the same. Also, lowering the amount of concurrency only prevents the error for the first run or a few runs; on the nth run the error is hit again, so even if the version below works, that doesn't mean the problem is solved.
```python
import os


@shared_task
def run_task():
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as executor:
        futures = [
            executor.submit(Example.objects.get_or_create, name='example')
            for _ in range(200)
        ]
        for future in as_completed(futures):
            future.result()
```
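One plausible mechanism for the error recurring on later runs (my sketch, not confirmed in the post): Django stores one DB connection per thread, so every pool thread that runs a query opens its own connection, and nothing closes it when the thread exits, so connections pile up inside a long-lived worker process until Postgres's `max_connections` is hit. A pure-Python mimic of that accumulation, with `FakeConnection` standing in for a real DB connection:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for Django's per-thread connections: django.db.connections is
# thread-local, so each pool thread "opens" its own connection on first use.
local = threading.local()
opened = []   # every connection ever opened, across all runs
lock = threading.Lock()

class FakeConnection:
    def __init__(self):
        self.closed = False

def query():
    # First query on this thread opens a connection, as Django does.
    if not hasattr(local, "conn"):
        local.conn = FakeConnection()
        with lock:
            opened.append(local.conn)

for _ in range(3):  # repeated task runs inside one long-lived worker process
    with ThreadPoolExecutor(max_workers=8) as executor:
        for _ in range(50):
            executor.submit(query)

# Connections accumulate across runs; not one of them was ever closed.
print(len(opened), sum(c.closed for c in opened))
```

Each run's pool creates fresh threads, so each run opens up to 8 new "connections" and closes none of them.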
I'm not sure why these connections aren't being closed / cleaned up automatically, which is probably the main issue here. I even tried cleaning up connections manually, to no avail.
```python
import os

from django.db import close_old_connections, connections


@shared_task
def run_task():
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as executor:
        futures = [
            executor.submit(Example.objects.get_or_create, name='example')
            for _ in range(200)
        ]
        for future in as_completed(futures):
            future.result()
    close_old_connections()
    for connection in connections.all():
        connection.close()
```
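A plausible reason that kind of after-the-fact cleanup has no effect (my assumption, not verified against the post): `connections.all()` only sees the *current* thread's connections, so calling it from the task's main thread cannot reach the pool threads' connections. The cleanup would have to run inside each worker thread. A pure-Python stand-in for that pattern, where `query_then_close` plays the role a Django wrapper would (calling `connections.all()` + `close()` in its `finally` block):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

local = threading.local()  # mimics Django's per-thread connection storage
open_count = 0
lock = threading.Lock()

def query():
    global open_count
    if not hasattr(local, "conn"):
        local.conn = object()   # "open" a connection for this thread
        with lock:
            open_count += 1
    return "row"

def query_then_close():
    # Cleanup runs in the same thread that owns the connection.
    global open_count
    try:
        return query()
    finally:
        if hasattr(local, "conn"):
            del local.conn      # "close" this thread's connection
            with lock:
                open_count -= 1

with ThreadPoolExecutor(max_workers=8) as executor:
    futures = [executor.submit(query_then_close) for _ in range(200)]
    for future in as_completed(futures):
        future.result()

print(open_count)  # 0: every thread closed its own connection
```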
I'm running the above on my M4 Max MacBook Pro (macOS Tahoe 26.4.1) with Postgres 18, Python 3.14t, and the package versions below:

```
celery 5.6.3
channels 4.3.2
channels_redis 4.3.0
daphne 4.2.1
Django 6.0.4
django-cors-headers 4.9.0
djangorestframework 3.17.1
djangorestframework_simplejwt 5.5.1
pip 26.0.1
psycopg2 2.9.11
redis 7.4.0
```
Sorry, @user31749517. Have not yet managed to repro.
```
% echo $(jot 3)
1 2 3
%
%
% time bash -c 'for i in $(jot 1000); do curl -s "http://localhost:8000/api/example/"; done'
bash -c  1.94s user 2.41s system 33% cpu 12.966 total
```
In two Terminal tabs I have that "pound Django a thousand times" loop running concurrently, several times in sequence, and each run takes about thirteen seconds. With no concurrency, from just a single tab's shell, fetching a zero-byte document a thousand times takes roughly nine seconds.
See, no env vars, nothing up my sleeve:

```
% env | grep DJANGO | wc -l
0
```
The Daphne log is very happy, strictly 200 Success documents being served. Here is a tiny excerpt:

```
::1:50906 - - [12/Apr/2026:17:26:42] "GET /api/example/" 200 -
::1:50908 - - [12/Apr/2026:17:26:42] "GET /api/example/" 200 -
::1:50910 - - [12/Apr/2026:17:26:43] "GET /api/example/" 200 -
```
In addition to interpreter 3.14.2, here's a pair of components which might differ between us:

```
% brew info postgres
==> postgresql@18 ✔: stable 18.3 (bottled) [keg-only]
Object-relational database system
% brew info valkey
==> valkey ✔: stable 9.0.3 (bottled), HEAD
High-performance data structure server that primarily serves key/value workloads
https://valkey.io
```
Aha! Reproduced it.

I restart Celery, with no Daphne / Django running, and I get exactly fifty success lines like this:

```
[2026-04-12 17:42:04,698: INFO/MainProcess] Task core.tasks.run_task[fe69d7fa-6684-48e4-8b3a-6aeb71fdb80d] received
```

and then it falls apart with the "no postgres" symptom you reported:

```
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: connection to server on socket "/tmp/.s.PGSQL.5432" failed: Connection refused
	Is the server running locally and accepting connections on that socket?
```
Furthermore, it doesn't shut down gracefully, so I press Ctrl-C a few times:

```
	Is the server running locally and accepting connections on that socket?
^C
worker: Hitting Ctrl+C again will terminate all running tasks!
^C
Waiting gracefully for cold shutdown to complete...
worker: Cold shutdown (MainProcess)
^C[2026-04-12 17:43:59,655: WARNING/MainProcess] Restoring 40 unacknowledged message(s)
%
%
% jobs
%
```
Sometimes it reports having killed a worker process:

```
	Is the server running locally and accepting connections on that socket?
^C
worker: Hitting Ctrl+C again will terminate all running tasks!
^C
Waiting gracefully for cold shutdown to complete...
worker: Cold shutdown (MainProcess)
^C[2026-04-12 18:04:20,666: ERROR/MainProcess] Process 'ForkPoolWorker-7' pid:14143 exited with 'signal 15 (SIGTERM)'
[2026-04-12 18:04:21,710: WARNING/MainProcess] Restoring 38 unacknowledged message(s)
%
%
```
I am debugging with this task code:

```python
@shared_task
def run_task():
    with ThreadPoolExecutor(max_workers=100) as executor:
        futures = [
            executor.submit(Example.objects.get_or_create, name="example")
            for _ in range(200)
        ]
        print(f"{len(futures)=}")
        for future in as_completed(futures):
            print(f"Asking for .result() from {future}")
            print(f"{future.result()=}")
            ...
```
We never see that final `.result()` print produce any output:

```
[2026-04-12 18:09:02,302: INFO/MainProcess] Task core.tasks.run_task[4f6d3268-0dc9-4b08-b66a-a8d15bbf4e83] received
[2026-04-12 18:09:02,302: INFO/MainProcess] Task core.tasks.run_task[191fed8e-89dc-43f4-aec5-53a68b6ec4df] received
[2026-04-12 18:09:02,303: INFO/MainProcess] Task core.tasks.run_task[3840e8ea-4ded-4d78-b820-069c4e3ce9cc] received
[2026-04-12 18:09:02,304: INFO/MainProcess] Task core.tasks.run_task[364206f7-710b-45ba-995f-2fa15bffd7f0] received
[2026-04-12 18:09:02,304: INFO/MainProcess] Task core.tasks.run_task[06866ed5-defb-4c1a-965d-5ddaeb184fa3] received
[2026-04-12 18:09:02,304: WARNING/ForkPoolWorker-8] len(futures)=200
[2026-04-12 18:09:02,305: WARNING/ForkPoolWorker-9] len(futures)=200
[2026-04-12 18:09:02,305: WARNING/ForkPoolWorker-8] Asking for .result() from <Future at 0x10c9a56d0 state=finished raised OperationalError>
[2026-04-12 18:09:02,305: INFO/MainProcess] Task core.tasks.run_task[5c2ef3ce-b47e-46bb-a35f-1fbad0135ead] received
[2026-04-12 18:09:02,306: INFO/MainProcess] Task core.tasks.run_task[a2ce76ef-5f75-40cf-8f27-c69c88111c20] received
[2026-04-12 18:09:02,306: WARNING/ForkPoolWorker-2] len(futures)=200
[2026-04-12 18:09:02,306: WARNING/ForkPoolWorker-7] len(futures)=200
[2026-04-12 18:09:02,306: WARNING/ForkPoolWorker-2] Asking for .result() from <Future at 0x10ca0cb50 state=finished raised OperationalError>
[2026-04-12 18:09:02,306: WARNING/ForkPoolWorker-7] Asking for .result() from <Future at 0x10c9a6ad0 state=finished raised OperationalError>
[2026-04-12 18:09:02,307: INFO/MainProcess] Task core.tasks.run_task[4c0adcdc-2844-4c5d-aaf4-9978242a5330] received
```
If, despite Celery being a mess, I forge ahead and start Daphne, then the Django webserver is happy and a web client obtains a zero-byte 200 Success document.
```
% curl -v http://localhost:8000/api/example/
* Host localhost:8000 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8000...
* Connected to localhost (::1) port 8000
> GET /api/example/ HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Vary: Accept, Cookie
< Allow: GET, HEAD, OPTIONS
< X-Frame-Options: DENY
< Content-Length: 0
< X-Content-Type-Options: nosniff
< Referrer-Policy: same-origin
< Cross-Origin-Opener-Policy: same-origin
< Server: daphne
<
* Connection #0 to host localhost left intact
```
Still looking into it.
(I will soon delete this post, as it is not an Answer to the OP question, merely an interim progress report which Comments would not accommodate.)
This code doesn't leak. It also doesn't exercise DB concurrency, as there is no ThreadPoolExecutor.

```python
from celery import shared_task
from django.db import connections

from core.models import Example


@shared_task
def run_task(reps: int = 200):
    results = [
        Example.objects.get_or_create(name='example', id=1)[1] for _ in range(reps)
    ]
    assert len(results) == reps
    assert not any(results)
    for connection in connections.all():
        connection.close()
```
Verify that we don't add 100 "stuck" SELECT statements on each call with this psql query:

```sql
SELECT COUNT(*) FROM pg_stat_activity WHERE wait_event = 'ClientRead' AND state = 'idle';
```
BTW, `close_old_connections()` only closes connections that are unusable or have outlived `CONN_MAX_AGE`; a healthy, recent connection is left open.
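A simplified sketch of that bookkeeping (modeled loosely on Django's `close_if_unusable_or_obsolete`; `FakeConnection` and its method names are illustrative, not Django's API):

```python
import time

CONN_MAX_AGE = 0.1  # seconds; Django's default is 0 (close after each request)

class FakeConnection:
    def __init__(self):
        self.closed = False
        # Django records an expiry when the connection is opened.
        self.close_at = time.monotonic() + CONN_MAX_AGE

    def close_if_obsolete(self):
        # close_old_connections() only closes connections past their expiry.
        if time.monotonic() >= self.close_at:
            self.closed = True

conn = FakeConnection()
conn.close_if_obsolete()
fresh = conn.closed        # still younger than CONN_MAX_AGE, so left open

time.sleep(0.2)            # let the connection outlive CONN_MAX_AGE
conn.close_if_obsolete()
print(fresh, conn.closed)  # False True
```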