Celery, RabbitMQ removes worker from consumers list while it is performing tasks
I start my Celery worker, which uses RabbitMQ as the broker, like this:
celery -A my_app worker -l info -P gevent -c 100 --prefetch-multiplier=1 -Q my_app
Then I have a task that looks roughly like this:
from celery import shared_task

@shared_task(queue='my_app', default_retry_delay=10, max_retries=1, time_limit=8 * 60)
def example_task():
    # getting queryset with some filtering
    my_models = MyModel.objects.filter(...)
    # iterator() streams rows instead of caching the whole queryset
    for my_model in my_models.iterator():
        my_model.execute_something()
Sometimes this task finishes in less than a minute, and sometimes, under high load, it takes more than 5 minutes.
The main problem is that RabbitMQ keeps removing my worker from the queue's consumers list, seemingly at random. When that happens, I have to restart the worker.
The workers also start throwing errors like this:
SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
And sometimes these:
consumer: Cannot connect to amqps://my_app:**@example.com:5671/prod: SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)').
Couldn't ack 2057, reason:"RecoverableConnectionError(None, 'connection already closed', None, '')"
I have tried adding --without-heartbeat, but it makes no difference.
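As far as I understand, --without-heartbeat only turns off the worker's event heartbeats, while the AMQP connection heartbeat is controlled by the separate broker_heartbeat setting. Below is a minimal sketch of the settings I could try next. The module name celeryconfig.py and the chosen values are assumptions, and I have not verified that they fix the issue.

```python
# celeryconfig.py (hypothetical module name), loaded with
# app.config_from_object("celeryconfig")

# AMQP-level heartbeat interval in seconds. As far as I understand,
# 0 disables it, so a worker that is too busy to answer heartbeats
# is not disconnected for that reason. The value is an assumption.
broker_heartbeat = 0

# Keep retrying the broker connection instead of giving up
# (None means "retry forever").
broker_connection_max_retries = None

# Same effect as --prefetch-multiplier=1 on the command line.
worker_prefetch_multiplier = 1
```

With the heartbeat disabled, a dead connection would presumably only be noticed through TCP itself, so this is a trade-off rather than a clear fix.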
How can I solve this problem? Sometimes my tasks take more than 30 minutes to finish, and I can't constantly monitor whether workers have been dropped by RabbitMQ.
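As a stopgap for the monitoring part, something like the following could be run periodically. It is only a sketch, assuming the RabbitMQ management plugin is enabled and reachable over HTTP(S). The helper names (queue_url, fetch_queue_info, has_consumers) and the management port in the usage note are hypothetical and not part of my setup.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(base_url: str, vhost: str, queue: str) -> str:
    """Build the management API URL for one queue (names are URL-encoded)."""
    return "{}/api/queues/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )


def fetch_queue_info(base_url, vhost, queue, user, password, timeout=10):
    """Fetch the queue description as a dict, using HTTP basic auth."""
    request = urllib.request.Request(queue_url(base_url, vhost, queue))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def has_consumers(queue_info: dict, minimum: int = 1) -> bool:
    """Return True if the queue reports at least `minimum` consumers."""
    return int(queue_info.get("consumers", 0)) >= minimum
```

A cron job could call fetch_queue_info("https://example.com:15671", "prod", "my_app", user, password) and alert or restart the worker whenever has_consumers(...) returns False. That would only detect the drop, not explain it.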