Gemini API streaming responses are not received when the backend is deployed to the production server
It works fine locally. In the production environment, non-streaming responses work well. When I send a Gemini API request directly from the AWS server using curl, the streaming works perfectly. However, when I run the backend server with Gunicorn, the Gemini API streaming does not work.
I tried using both gevent and eventlet for the worker class, but neither worked.
It doesn't seem to be an issue with the nginx configuration, since streaming responses to the client are sent correctly. The problem only occurs when the server calls the external API.
Everything actually worked fine when I first set up the server. But ever since I set up the CI/CD pipeline (I only added GitHub Actions and had it install the requirements remotely), external API streaming hasn't worked.
It's a Django codebase. Even deleting the virtual environment and reinstalling, or installing from the requirements.txt that predates the CI/CD build (when it was working fine), doesn't resolve the issue.
What should I do?
What I got from the Gunicorn logs:
[2025-02-14 14:29:50 +0000] [5235] [CRITICAL] WORKER TIMEOUT (pid:5283)
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: response:
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: GenerateContentResponse(
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: done=False,
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: iterator=<_StreamingResponseIterator>,
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: result=protos.GenerateContentResponse({
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "candidates": [
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: {
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "content": {
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "parts": [
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: {
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "text": "\""
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: }
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: ],
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "role": "model"
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: }
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: }
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: ],
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "usage_metadata": {
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "prompt_token_count": 1107,
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "total_token_count": 1107
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: },
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: "model_version": "gemini-2.0-flash"
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: }),
Feb 14 14:29:50 ip-172-31-13-205 gunicorn[5283]: )
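The log above shows a partially received response (`done=False`, a single text part) alongside the worker timeout, which suggests the stall may happen while iterating over the rest of the stream. A small timing wrapper could help confirm where the time goes. The sketch below is a hypothetical helper (`timed_chunks` is not part of any SDK) that works on any iterable, so it could wrap whatever streaming response object the view iterates over. It makes no assumption about the cause.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def timed_chunks(
    stream: Iterable[T],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[float, T]]:
    """Yield (seconds_since_start, chunk) for every chunk of a stream.

    The clock is read once before iteration starts and once after each
    chunk arrives, so a long gap between two chunks shows up as a jump
    in the elapsed value.
    """
    start = clock()
    for chunk in stream:
        yield clock() - start, chunk
```

Wrapping the streaming loop as `for elapsed, chunk in timed_chunks(response): ...` and logging `elapsed` under both the local server and Gunicorn would show whether the second chunk ever arrives in production, or whether the worker is killed while waiting for it.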
Part of my nginx config:
upstream django_app {
    server unix:/run/gunicorn.sock;
}
server {
    server_name {MY_SERVER_NAME};
    location / {
        include proxy_params;
        proxy_pass http://django_app/;
        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_set_header Connection "";
        chunked_transfer_encoding on;
    }
My Gunicorn config (systemd service unit):
[Unit]
Description=gunicorn daemon of backend django
Requires=gunicorn.socket
After=network.target
[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/ubuntu/my_server_name
Environment="PATH=/home/ubuntu/venv/bin" "DJANGO_ENV=prod"
ExecStart=/home/ubuntu/venv/bin/gunicorn -k eventlet --access-logfile - --workers 3 --log-level debug --bind unix:/run/gunicorn.sock my_django_app.wsgi:application
[Install]
WantedBy=multi-user.target
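The `WORKER TIMEOUT` line in the log is governed by Gunicorn's `timeout` setting, which defaults to 30 seconds. With async workers such as eventlet or gevent, that timeout tracks whether the worker is still checking in with the master process rather than how long a single request takes, so it may point to a call that blocks the worker's event loop. As a sketch, the command-line flags in `ExecStart` could be moved into a Python config file where the timeout is explicit. The file name and the `timeout` value of 120 are assumptions for illustration; the other values mirror the unit above.

```python
# gunicorn.conf.py (hypothetical file name and location)
# Mirrors the flags passed in ExecStart, plus an explicit timeout.

bind = "unix:/run/gunicorn.sock"
workers = 3
worker_class = "eventlet"   # same as `-k eventlet`
accesslog = "-"             # same as `--access-logfile -`
loglevel = "debug"          # same as `--log-level debug`

# Gunicorn's default is 30 seconds. 120 is an arbitrary value chosen
# for experimentation, not a recommendation.
timeout = 120
```

Such a file would be loaded with `gunicorn -c gunicorn.conf.py my_django_app.wsgi:application`. Raising the timeout only buys time for diagnosis; it would not by itself explain why the stream stalls.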
What works fine:
- All other requests (those that don't need Gemini API streaming)
- Non-streaming Gemini API calls
- Calling Gemini API streaming directly with curl on the AWS server
What I used:
- Gunicorn (I tried both the eventlet and gevent workers)
- AWS
- Django
- nginx
What I've tried:
- Both the gevent and eventlet worker classes
- Downgrading python to 3.11
- Downgrading grpcio, grpcio-status to 1.67.1
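Since the gRPC packages were already among the things tried, one further experiment that is not in the list above would be to take gRPC out of the path entirely. The sketch below assumes the `google-generativeai` SDK (the `GenerateContentResponse` object in the log suggests it), asks it to use its REST transport, and yields text chunks. The function name, the `GEMINI_API_KEY` environment variable, and the default model name are assumptions for illustration, and this is only a way to narrow things down, not a confirmed fix.

```python
import os
from typing import Iterator


def stream_gemini_over_rest(
    prompt: str,
    model_name: str = "gemini-2.0-flash",  # taken from the log; adjust as needed
) -> Iterator[str]:
    """Yield text chunks from a streaming Gemini call made over REST."""
    # Imported lazily so this module can be loaded without the SDK installed.
    import google.generativeai as genai

    # transport="rest" asks the SDK to use HTTP/REST instead of gRPC.
    # GEMINI_API_KEY is a hypothetical environment variable name.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"], transport="rest")

    model = genai.GenerativeModel(model_name)
    for chunk in model.generate_content(prompt, stream=True):
        # Note: chunk.text can raise if a chunk carries no text part.
        yield chunk.text
```

If streaming works under Gunicorn with the REST transport but not with the default one, that would point toward an interaction between gRPC and the eventlet or gevent worker rather than toward nginx or the network.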