Sending large base64 files through RabbitMQ to be consumed by workers
I'm using RabbitMQ and Celery to process email attachments pulled through the Gmail API. My first Celery task fetches batches of emails whose attachments arrive as base64 strings, sometimes larger than 25 MB per file. RabbitMQ's default message size limit is 16 MB, and I don't want to raise it because I've read a few articles saying that keeping messages small is better practice.
What is the best practice here? While the first task is still pulling emails, I want multiple other Celery workers to process those files concurrently (running OCR and storing the results in a database) to speed up the pipeline.
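Roughly the structure I have in mind (a simplified sketch, not my real code; fetch_email_batch, process_attachment, gmail_client, the attachment objects it returns, and ocr_and_store are all placeholder names):

```python
from celery import Celery

app = Celery("attachments", broker="amqp://localhost")

@app.task
def fetch_email_batch(query):
    # First task: pull a batch of emails and their attachments via the Gmail API.
    for message in gmail_client.search(query):         # placeholder Gmail client
        for attachment in message.attachments:
            # Problem: attachment.data_b64 is the raw base64 string (can be >25 MB),
            # and calling .delay() on it puts that whole payload onto RabbitMQ.
            process_attachment.delay(attachment.data_b64)

@app.task
def process_attachment(data_b64):
    # Second task: runs concurrently on other workers, does OCR and stores the result.
    ocr_and_store(data_b64)                             # placeholder OCR + DB helper
```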
A few solutions I came up with (I'm not sure whether they're good practice, since I'm a newbie):
1. Raising the RabbitMQ message size limit.
2. Storing the file in memory and passing a reference to it to the second Celery task. (Not sure this is a good idea, because the server I'm running on only has 32 GB of RAM.)
3. Having the first Celery task upload each attachment directly to a cloud storage service and then passing the file's URL to the second Celery task (sketched below). The downside is that I'd have to upload the file and then re-download it just to run OCR on it, which doesn't seem efficient, and the extra bandwidth adds cost.
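To make option 3 concrete, this is roughly how I imagine it, assuming S3 via boto3 purely as an example; the bucket name, key scheme, gmail_client, and ocr_and_store are placeholders:

```python
import base64

import boto3
from celery import Celery

app = Celery("attachments", broker="amqp://localhost")
s3 = boto3.client("s3")
BUCKET = "email-attachments"                            # placeholder bucket name

@app.task
def fetch_email_batch(query):
    for message in gmail_client.search(query):          # placeholder Gmail client
        for attachment in message.attachments:
            key = f"{message.id}/{attachment.filename}" # placeholder key scheme
            # Upload the decoded file once, then put only the small key on RabbitMQ
            # instead of the >25 MB base64 string.
            s3.put_object(Bucket=BUCKET, Key=key,
                          Body=base64.urlsafe_b64decode(attachment.data_b64))
            process_attachment.delay(key)

@app.task
def process_attachment(key):
    # Each worker re-downloads the file, runs OCR, and stores the result.
    body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    ocr_and_store(body)                                  # placeholder OCR + DB helper
```

This keeps each queue message down to a short key, but at the cost of the upload-then-redownload round trip I mentioned.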
Is there another solution for my design problem here?