Cap the number of model rows without a race condition
I'm writing a Django app to accept webmentions. New webmentions get placed into a model called "Pending" until the admin runs a verification action on them.
Successfully verified rows are then deleted from "Pending" and moved over to a different model called "Webmention" that has more columns for info gathered during verification.
I would like the first model, "Pending", to be capped at a number of rows, ideally about 1000. Once 1000 rows are reached, the submission endpoint should return a "503 Service Unavailable" error until the human webmaster can jump in and figure out why they suddenly woke up to 1000 webmentions.
What I've done so far
# Gimme that Low Number
# (enforced)
if Pending.objects.all().count() >= 1000:
    return HttpResponse("Too many webmentions enqueued", status=503)
# Sweet creator Call and thats why I have hard drive storage
Pending.objects.create(
    source=source_url,
    target=target_url
)
I'm worried this contains a race condition risk
What I'm worried about is the part between the count check and the create() call. If two clients submit at the same time, isn't there a chance both count checks return 999, then both do create() and breach the 1000 row limit?
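To make the worry concrete, here is a minimal pure-Python sketch of that interleaving. It is an analogy, not Django code: a plain list stands in for the Pending table, two threads stand in for the two clients, and a `threading.Barrier` forces the unlucky timing where both count checks finish before either insert. The names `submit`, `pending`, and `outcomes` are made up for the illustration.

```python
import threading

CAP = 1000
# Stand-in for the Pending table: 999 rows are already queued.
pending = [("https://old.example/", "https://me.example/post")] * (CAP - 1)
outcomes = []

# Forces both "requests" to finish their count check before either inserts.
barrier = threading.Barrier(2)


def submit(source, target):
    # Step 1: the count check. Both threads see 999 here.
    below_cap = len(pending) < CAP
    # Wait until the other thread has also done its check.
    barrier.wait()
    if not below_cap:
        outcomes.append("rejected")  # would be the 503 response
        return
    # Step 2: the insert, based on a count that is now stale.
    pending.append((source, target))
    outcomes.append("accepted")


threads = [
    threading.Thread(
        target=submit,
        args=(f"https://client{i}.example/", "https://me.example/post"),
    )
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(pending))  # 1001: both inserts went through
```

Both checks see 999, both inserts go through, and the stand-in table ends up one row over the cap.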
Is there a way to somehow prevent race conditions here?
I looked at the question "Limit the number of rows in a Django table" and it seemed related, but they're using the model as a queue that deletes the oldest row. That's not really what I'm going for. Ideally, I'd like to keep the oldest row and just deny service if it's full.
You need to lock the Pending table until the operation is complete.
You can use a library such as django-pglock (PostgreSQL only): https://django-pglock.readthedocs.io/en/1.8.0/model/
from django.db import transaction
import pglock

with transaction.atomic():
    pglock.model("your_app.Pending")
    if Pending.objects.count() >= 1000:
        return HttpResponse("Too many webmentions enqueued", status=503)
    Pending.objects.create(
        source=source_url,
        target=target_url
    )
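Why holding a lock across both steps closes the gap can be shown with a small pure-Python analogy. This is only a sketch: `threading.Lock` stands in for the database table lock, a list stands in for the Pending table, and the cap is lowered to 10 so the run stays short. The names `submit`, `pending`, and `outcomes` are made up for the illustration.

```python
import threading

CAP = 10
pending = []      # stand-in for the Pending table
outcomes = []
table_lock = threading.Lock()  # stand-in for the table lock


def submit(source, target):
    # The check and the insert happen under the same lock, so no other
    # "request" can change the count between the two steps.
    with table_lock:
        if len(pending) >= CAP:
            outcomes.append(503)
            return
        pending.append((source, target))
        outcomes.append(200)


threads = [
    threading.Thread(
        target=submit,
        args=(f"https://client{i}.example/", "https://me.example/post"),
    )
    for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(pending))         # 10
print(outcomes.count(503))  # 40
```

However the 50 threads are scheduled, exactly 10 inserts succeed and the other 40 are turned away. The cost is the same as with the real table lock: submissions are handled one at a time while the lock is held.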
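You can also run a raw query inside the same kind of transaction.atomic() block; on PostgreSQL, LOCK TABLE only works inside a transaction and the lock is released when the transaction ends. See: https://stackoverflow.com/a/54403001/17185927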
cursor.execute(f'LOCK TABLE {model._meta.db_table}')
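A slightly fuller sketch of the raw-query route, assuming PostgreSQL. The helper name `lock_table` and the identifier check are my own additions for illustration, not part of Django or the linked answer; the helper only needs an object with an `execute()` method, such as the cursor Django returns from `connection.cursor()`.

```python
import re

# Conservative check so that only plain identifiers reach the SQL string.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def lock_table(cursor, table_name):
    """Take PostgreSQL's ACCESS EXCLUSIVE lock on table_name.

    Call this inside an open transaction; PostgreSQL holds the lock
    until that transaction commits or rolls back.
    """
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    cursor.execute(f'LOCK TABLE "{table_name}" IN ACCESS EXCLUSIVE MODE')


# Hypothetical usage inside the Django view (not executed here):
#
#     from django.db import connection, transaction
#
#     with transaction.atomic():
#         with connection.cursor() as cursor:
#             lock_table(cursor, Pending._meta.db_table)
#         # ...then the count check and create() as in the question
```

ACCESS EXCLUSIVE is the mode PostgreSQL uses when none is given, so this behaves like the one-line version above; it is spelled out here only to make the choice visible.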
But I have a different question first:
Is it really critical that there are never more than 1000 pending objects? Why is 999 fine but 1001 unacceptable?
A race condition is possible here, but once the count passes the limit every later request gets the error, so in the worst case you end up with a few more objects than expected.