Performance issues when using Django's Min('id') in big table

I have a model Table which gets a new entry every few seconds, so it's a quite big table.

class Table(models.Model):
    id: int
    pk: int

    timestamp = models.PositiveBigIntegerField(db_index=True)

Now what i am trying to do is get the first entry for the last 24h. And for some reason it takes way to much time. Like we are talking about seconds. But if i try to get last i get it instantly. I've tried following queries:

from_ts = int(time.time()) - 24 * 60 * 60
first_entry = Table.objects.filter(timestamp__gte=from_ts).first()
first_entry = Table.objects.filter(timestamp__gte=from_ts).aggregate(Min('id'))

They both take a few seconds to complete but if I try .aggregate(Min('timestamp')) it returns almost instantly. I tried .filter(timestamp__gte=from_ts)[0] which is quick and does work but i know its undefined behaviour because the entries aren't sorted and in some edge cases it might not work. Then i tried to just get the whole query in python and find min in python and it was almost instant.

first_entry = min(list(Table.objects.filter(timestamp__gte=from_ts)))

Can anyone explain what exactly is happening here and what is the cleanest/best solution?

Your querysets are filtering on the timestamp and also ordering on the ID. Although the first query might not seems like it does that but .first() implicitly performs and order_by on the ID if there is no order already specified.

Although you have individual indexes on the ID and the timestamp, most likely the problem is that the database doesn't consider any suitable for your query resulting in it being inefficient (I'm not much of a DB person so not the best person to advice on indexes).

To make the query efficient using the index that you do have currently you can order on the timestamp and get the first object according to that order (which should ideally also be the first ID if your table stores data that way):

first_entry = Table.objects.filter(timestamp__gte=from_ts).order_by("timestamp").first()

Just Add an index to timestamp and id:

class Table(models.Model):
    timestamp = models.PositiveBigIntegerField(db_index=True)

    class Meta:
        indexes = [
            models.Index(fields=['timestamp', 'id']),
        ]

Then use this query to get the first entry from the last 24 hours.

first_entry = Table.objects.filter(timestamp__gte=from_ts).order_by('id').first()

This will be much faster because the database can use the combined index to both filter and sort without scanning the entire table.

@Abdul Aziz Barkat answered the question. I just want to add a bit of extra context.

Databases use indexes to filter and sort, but these can also be used to retrieve data if no other data is required. Indeed, you have put a database index on the timestamp field. This means that if you make queries on the timestamp, the database can use this to determine efficiently what records are within the given range, without having to go through all of these records.

If you additionally want to sort on these, it can sort the records "in advance" without retrieving these, since it knows that the smallest timestamp is for the timestamps matching the filter: it can use the index for this. Both queries don't require to fetch data from the records.

If you now want to retrieve the smallest id, the database has no choice but to start fetching all records that satisfy the filter condition, and look at the ids. Fetching records is usually a lot more expensive: it requires accessing the database storage, often these records are not even "packed" together at the same location, so it can result in reading a lot of data into memory.

If you want to retrieve the first record by timestamp however, it does not have to fetch all the data. Indeed, if you make a query:

SELECT id
FROM my_table
WHERE timestamp >= 2025-06-01
ORDER BY timestamp
LIMIT 1

it can use the index to determine the single record it has to retrieve, and retrieve that with one small fetch.

Вернуться на верх