Cap the number of model rows without a race condition

I'm writing a Django app to accept webmentions. New webmentions get placed into a model called "Pending" until the admin runs a verification action on them.

Successfully verified rows are then deleted from "Pending" and moved over to a different model called "Webmention" that has more columns for info gathered during verification.

I would like the first model, "Pending", to be capped at a fixed number of rows, ideally about 1000. Once 1000 rows are reached, the submission endpoint should return a "503 Service Unavailable" error until the human webmaster can jump in and figure out why they suddenly woke up to 1000 webmentions.

What I've done so far

# Reject new submissions once the cap is reached
if Pending.objects.count() >= 1000:
    return HttpResponse("Too many webmentions enqueued", status=503)

# Otherwise store the new webmention as a Pending row
Pending.objects.create(
    source=source_url,
    target=target_url
)

I'm worried this contains a race condition risk

What I'm worried about is the part between the count check and the create() call. If two clients submit at the same time, isn't there a chance both count checks return 999, then both do create() and breach the 1000 row limit?
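To make the scenario concrete, here is a toy simulation of that interleaving. It is only a sketch: a plain Python list stands in for the "Pending" table, two threads stand in for the two clients, and a threading.Barrier (my addition, purely to force the unlucky timing) makes both clients finish the count check before either one inserts.

```python
import threading

CAP = 1000
rows = list(range(999))          # stand-in for the Pending table, one row below the cap
barrier = threading.Barrier(2)   # forces both checks to finish before any insert happens
seen_counts = [None, None]       # what each client saw during its count check


def submit(client_index):
    # Step 1: the count check. Both clients run this before either inserts.
    seen = len(rows)
    seen_counts[client_index] = seen
    allowed = seen < CAP
    barrier.wait()
    # Step 2: the create() call, based on a count that is now stale.
    if allowed:
        rows.append(("source_url", "target_url"))


threads = [threading.Thread(target=submit, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen_counts, len(rows))  # [999, 999] 1001
```

Both clients pass the check with a count of 999, so the table ends up with 1001 rows.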

Is there a way to somehow prevent race conditions here?

I looked at the question "Limit the number of rows in a Django table" and it seemed related, but they're using the model as a queue that deletes the oldest row. That's not really what I'm going for. Ideally, I'd like to keep the oldest row and just deny service if it's full.

You need to lock the Pending table until the operation is complete.

You can use a library such as django-pglock: https://django-pglock.readthedocs.io/en/1.8.0/model/

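from django.db import transaction
from django.http import HttpResponse
import pglock

from your_app.models import Pending  # adjust to your app's module path

# Inside the submission view:
with transaction.atomic():
    # Lock the whole table; the lock is held until the transaction ends
    pglock.model("your_app.Pending")
    if Pending.objects.count() >= 1000:
        return HttpResponse("Too many webmentions enqueued", status=503)

    Pending.objects.create(
        source=source_url,
        target=target_url
    )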

You can also run a raw query inside the transaction. See: https://stackoverflow.com/a/54403001/17185927

cursor.execute(f'LOCK TABLE {model._meta.db_table}')
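Why a lock fixes this can be shown without a database. The sketch below is a toy, not Django code: CappedTable, submit and run_clients are hypothetical names, and a threading.Lock plays the role of the table lock. Because the count check and the insert happen while the lock is held, no client can act on a stale count.

```python
import threading


class CappedTable:
    """Toy in-memory stand-in for the Pending table (hypothetical, not Django)."""

    def __init__(self, cap):
        self.cap = cap
        self.rows = []
        self._lock = threading.Lock()  # plays the role of the table lock

    def submit(self, source, target):
        # Check and insert as one unit while holding the lock.
        with self._lock:
            if len(self.rows) >= self.cap:
                return False  # the view would answer 503 here
            self.rows.append((source, target))
            return True


def run_clients(table, n_clients):
    """Fire n_clients concurrent submissions and collect their outcomes."""
    accepted = [None] * n_clients

    def client(i):
        accepted[i] = table.submit(f"https://example.com/{i}", "https://example.org/post")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return accepted


table = CappedTable(cap=10)
accepted = run_clients(table, 50)
print(len(table.rows), accepted.count(True), accepted.count(False))  # 10 10 40
```

The trade-off is the same as with a real table lock: submissions are serialized, so each one waits for the previous check-and-insert to finish.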

But I have a different question first:

Is it really critical that there are never more than 1000 pending objects? Why is 999 fine but 1001 unacceptable?

A race condition is possible here, but once the limit is passed every later request gets the error, so in the worst case you end up with a few more objects than expected.
