Django.fun

What is the correct way to use multiprocessing and Django transaction blocks?

How can I use multiprocessing in Python/Django management commands or views together with transaction blocks? In a view, for example, a lot of data is created, and if any exception occurs I would like to roll back the transaction so that nothing gets created. We also sometimes write management-command-like scripts that run in migrations to fix data. We would like to use multiprocessing to speed these up as well, but with the option of a dry run in which the transaction is rolled back.

I can't use bulk_create in these situations because the models I want to create or modify are inherited models, and bulk_create does not work with this type of model.

When I wrap handle(), some_func, or the with Pool... block in a transaction, I get this error:

django.db.utils.InterfaceError: connection already closed

from multiprocessing import cpu_count, Pool

from django.core.management.base import BaseCommand
from django.db import connection, transaction

def worker_init():
    connection.close()

class Command(BaseCommand):
    # arguments here...
    
    def handle(self, *args, **options):
        self.commit = options['commit']

        try:
            # Wrapping the core of the script in a transaction block does not work:
            # with transaction.atomic():
            items = [...]
            results = []
            with Pool(processes=cpu_count(), initializer=worker_init) as pool:
                for result in pool.imap_unordered(some_func, items):
                    results.extend(result)
            if not self.commit:
                raise Exception()

        except Exception:
            self.stdout.write(
                'DRY RUN: Command ran successfully but no changes were committed to the database.')

Disclaimer: I am no expert in multiprocessing. I have only just started exploring it to speed up our scripts.
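
For illustration, here is a minimal sketch of the dry-run behaviour described above. It rests on one fact: a database transaction belongs to a single connection, and each worker process has its own connection, so a transaction opened in the parent cannot cover writes made by the workers. The sketch therefore keeps the workers away from the database. They only do the computation, and the parent writes every row inside one transaction that it either commits or rolls back. It uses the standard-library sqlite3 module as a stand-in for the Django ORM so that it runs on its own. The names prepare_row, run, and the squares table are hypothetical and are not taken from the code above.

```python
import sqlite3
from multiprocessing import Pool


def prepare_row(item):
    """Work done inside a worker process. It never touches the database."""
    return (item, item * item)


def run(db_path, items, commit, processes=2):
    """Compute rows in parallel, then write them in one parent-side transaction.

    Returns the number of rows left in the table after the commit or rollback.
    """
    if processes > 1:
        with Pool(processes=processes) as pool:
            rows = list(pool.imap_unordered(prepare_row, items))
    else:
        # Serial fallback, useful for debugging without worker processes.
        rows = [prepare_row(item) for item in items]

    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS squares (n INTEGER, sq INTEGER)")
        conn.commit()
        # The first INSERT implicitly opens a transaction on this connection.
        conn.executemany("INSERT INTO squares VALUES (?, ?)", rows)
        if commit:
            conn.commit()
        else:
            conn.rollback()  # dry run: discard every row inserted above
        return conn.execute("SELECT COUNT(*) FROM squares").fetchone()[0]
    finally:
        conn.close()
```

Calling run(path, items, commit=False) on a fresh database leaves the table empty, while commit=True keeps the rows. On platforms where multiprocessing starts workers with spawn, the call has to sit under an if __name__ == "__main__": guard. This layout only helps when the expensive part is the computation and not the writes themselves. If the workers must write, each one needs its own connection and its own transaction, and a single all-or-nothing rollback across them is no longer available.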

Answers: 0