
Understanding Evaluation and Caching of Django QuerySets

A QuerySet can be constructed, filtered, sliced, and generally passed around without actually hitting the database. No database activity actually occurs until you do something to evaluate the QuerySet.

Throughout the article we’ll refer to the following models:

from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()
    
    def __str__(self):
        return self.name

class Entry(models.Model):
    blog = models.ForeignKey(Blog, on_delete=models.CASCADE)
    headline = models.CharField(max_length=255)
    body_text = models.TextField(blank=True)

    class Meta:
        default_related_name = 'entries'

    def __str__(self):
        return self.headline

Take a look at the following example to understand laziness:

q1 = Entry.objects.filter(blog=2)
q2 = q1.filter(headline__contains='food')
entry_list = list(q2)

Though this looks like two database hits, in fact it hits the database only once, at the last line (entry_list = list(q2)).

Each time you refine a QuerySet, you get a separate and distinct QuerySet that is not bound to the previous one, which can be stored, used and reused.

In the following code, q and q2 describe the same database operation; q3 additionally limits the result to the first 10 rows. Note that q1, q2 and q3 are three distinct QuerySets.

q = Entry.objects.filter(blog=2).exclude(body_text__icontains="food")

q1 = Entry.objects.filter(blog=2)
q2 = q1.exclude(body_text__icontains="food")
q3 = q2[:10]
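
The laziness described above can be pictured with a small stand-in written in plain Python. This is only a sketch: LazyQuery, its db_hits counter and the sample rows are hypothetical names invented for illustration, not part of Django. It shows that each refinement returns a new object and that the simulated database is touched only when the result is iterated.

```python
class LazyQuery:
    """Toy stand-in for a lazy query object (hypothetical, not Django code)."""

    db_hits = 0  # class-wide counter of simulated database round trips

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Refining returns a new, independent object; nothing is fetched yet.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def __iter__(self):
        # Only iteration "hits the database".
        LazyQuery.db_hits += 1
        matches = [r for r in self._rows if all(p(r) for p in self._predicates)]
        return iter(matches)


rows = [
    {"blog": 2, "headline": "street food guide"},
    {"blog": 2, "headline": "city parks"},
    {"blog": 3, "headline": "food trucks"},
]

q1 = LazyQuery(rows).filter(lambda r: r["blog"] == 2)
q2 = q1.filter(lambda r: "food" in r["headline"])
hits_before = LazyQuery.db_hits   # nothing has been evaluated so far
entry_list = list(q2)             # the single simulated hit happens here
hits_after = LazyQuery.db_hits
```

Because q1 and q2 are separate objects, q1 can still be evaluated later on its own without being affected by the extra filter on q2.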

Now let's talk about evaluation and caching.

Evaluation means actually hitting the database. The moment you start iterating over a QuerySet, all the rows matched by the QuerySet are fetched from the database and converted into Django model instances. These instances are then stored in the QuerySet's built-in cache, so that if you iterate over the QuerySet again, you don't need to hit the database a second time.


Enabling Cache

To use the cache, simply save the QuerySet in a variable and reuse it. The QuerySet class has a _result_cache attribute where it saves the query results (model instances) in a list. _result_cache is None if the QuerySet has not been evaluated yet; otherwise it is a list of model objects. When you iterate over a cached QuerySet, you are simply iterating over _result_cache, which is a list.

# The following creates two QuerySets, evaluates them, and throws them away,
# because neither QuerySet is saved anywhere for later reuse.
print([e.headline for e in Entry.objects.all()])
print([e.body_text for e in Entry.objects.all()])

# The following code saves the QuerySet in a variable. When it is evaluated,
# it saves the results to its cache (_result_cache).
queryset = Entry.objects.all()
# Evaluation by iteration.
for each in queryset:
    print(each.headline)
    
# Using cache from previous evaluation.
for each in queryset:
    print(each.id)
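
To make the role of _result_cache concrete, here is a minimal plain-Python sketch of the idea. CachingQuery and fetch_headlines are hypothetical names used only for this illustration; the real QuerySet is considerably more involved. The sketch contrasts reusing one object with building a fresh one for every loop.

```python
class CachingQuery:
    """Toy model of a result cache (hypothetical; not Django's actual code)."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable that simulates a database query
        self._result_cache = None    # None means "not evaluated yet"

    def __iter__(self):
        if self._result_cache is None:
            # First evaluation: run the query once and keep every row.
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)


hits = []

def fetch_headlines():
    hits.append("SELECT")            # record each simulated round trip
    return ["first post", "second post"]

# Reusing one object: the second loop reads the cached list.
queryset = CachingQuery(fetch_headlines)
first_pass = [h for h in queryset]
second_pass = [h for h in queryset]
hits_when_reused = len(hits)

# Building a fresh object each time: every loop triggers its own query.
for _ in range(2):
    list(CachingQuery(fetch_headlines))
hits_when_rebuilt = len(hits) - hits_when_reused
```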

Iteration is not the only way to evaluate a QuerySet. Several other operations trigger evaluation as well; let's discuss them with examples.


Iteration

A QuerySet is iterable, and the database is hit the first time you iterate over it: the query runs and the results are saved in the cache before the first row is handed to your loop. In the following example, the database hit and the caching happen before the first headline is printed:

queryset = Entry.objects.all()    
# Evaluated and cached
for each in queryset:
    print(each.headline)

# Using cache from previous evaluation.
for each in queryset:
    print(each.headline)

Slicing

Slicing an unevaluated QuerySet returns a new QuerySet. The returned QuerySet does not allow further modifications (e.g., adding more filters or changing the ordering), but it does allow more slicing. A QuerySet (sliced or not) saves results to its cache when you iterate over it:

# queryset is sliced, so it can no longer be filtered.
queryset = Entry.objects.all()[10:100]
# q1 can still be filtered, but q2 and q3 cannot.
q1 = Entry.objects.all()
q2 = q1[1:10]
q3 = q2[1:5]

# Saves results to the cache of q1.
lst1 = [each.blog.id for each in q1]
# Saves results to the cache of q2.
lst2 = [each.blog.id for each in q2]

If you slice an already evaluated QuerySet, it returns a list of objects, not a QuerySet, because after evaluation the QuerySet serves the slice from its cache (_result_cache), which is a list.

queryset = Entry.objects.all()
lst = list(queryset)
# returns a list of entry objects
first_ten = queryset[:10]
# list slicing not queryset slicing because first_ten is a list.
first_five = first_ten[:5]

If you use an index to pick one element from an unevaluated QuerySet, it hits the database each time, but if you pick from an already evaluated QuerySet, it uses the cache.

queryset = Entry.objects.all()
# Queries the database because queryset hasn't been evaluated yet.
print(queryset[5])
# Queries the database again: indexing does not populate the cache.
print(queryset[5])
lst = list(queryset)
# Uses the cache because evaluation happened in the previous list() call.
print(queryset[5])
print(queryset[10])
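
The indexing behaviour can be sketched the same way. SliceableQuery and its log are hypothetical, and lazy slicing is deliberately left out; the point is only that indexing an unevaluated object issues a query per access, while an evaluated one answers from its cached list.

```python
class SliceableQuery:
    """Toy model of cache-aware indexing (hypothetical; not Django's code)."""

    def __init__(self, rows, log):
        self._rows = rows
        self._log = log              # list that records simulated queries
        self._result_cache = None

    def __iter__(self):
        if self._result_cache is None:
            self._log.append("SELECT all")
            self._result_cache = list(self._rows)
        return iter(self._result_cache)

    def __getitem__(self, k):
        if self._result_cache is not None:
            # Already evaluated: plain list indexing or slicing, no query.
            return self._result_cache[k]
        if isinstance(k, slice):
            raise NotImplementedError("lazy slicing is left out of this sketch")
        # Not evaluated: fetch just the one row, without filling the cache.
        self._log.append(f"SELECT ... LIMIT 1 OFFSET {k}")
        return self._rows[k]


log = []
queryset = SliceableQuery([f"entry {i}" for i in range(20)], log)
a = queryset[5]             # simulated query
b = queryset[5]             # simulated query again, nothing was cached
list(queryset)              # full evaluation fills the cache
c = queryset[5]             # served from the cache
first_ten = queryset[:10]   # a plain list, because the cache is a list
```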

An exception is when you use the step parameter of Python slice syntax on an unevaluated QuerySet. In that case the query is executed immediately and a list of model objects is returned, not a QuerySet.

entry_list = Entry.objects.all()[1:100:2]

Pickling/Caching

If you pickle a QuerySet, it will be evaluated: all results are loaded into memory before pickling.


repr()

The repr() function returns a printable string representation of the given object. A QuerySet is evaluated when you call repr() on it, but the results are not saved to the cache.

# repr() evaluates but does not save results to the cache.
queryset = Entry.objects.all()
str_repr = repr(queryset)
# Not using the cache; hits the database again.
for each in queryset:
    print(each.headline)

Note: the print() function also evaluates the QuerySet but does not save results to the cache.
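
One way to picture why repr() leaves the cache empty is that it evaluates a limited copy of the query rather than the object itself. The sketch below models that idea in plain Python; ReprQuery, REPR_LIMIT and _limited_clone are hypothetical names chosen for this illustration, and the limit of 3 is just a demo value.

```python
class ReprQuery:
    """Toy model of repr() on a lazy query (hypothetical; not Django's code)."""

    REPR_LIMIT = 3  # small demo value; an assumption of this sketch

    def __init__(self, rows, log):
        self._rows = rows
        self._log = log               # records each simulated query
        self._result_cache = None

    def _limited_clone(self, n):
        # A separate object with its own empty cache, like a sliced query.
        return ReprQuery(self._rows[:n], self._log)

    def __iter__(self):
        if self._result_cache is None:
            self._log.append("SELECT")
            self._result_cache = list(self._rows)
        return iter(self._result_cache)

    def __repr__(self):
        # Evaluate the clone, so this object's own cache stays untouched.
        data = list(self._limited_clone(self.REPR_LIMIT + 1))
        if len(data) > self.REPR_LIMIT:
            data[-1] = "...(remaining elements truncated)..."
        return f"<ReprQuery {data}>"


log = []
queryset = ReprQuery(["a", "b", "c", "d", "e"], log)
text = repr(queryset)                     # one simulated query, on the clone
cache_after_repr = queryset._result_cache
rows = list(queryset)                     # a second simulated query
```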


len()

A QuerySet is evaluated when you call len() and it saves evaluated results to cache.

# len() evaluates and saves results to cache.
queryset = Entry.objects.all()
ln = len(queryset)
# Using cache from previous evaluation.
for each in queryset:
    print(each.headline)

Note: Don't use len() if you only need to know the number of items in the QuerySet. Django provides count() for this purpose.
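
The difference between the two approaches is easiest to see at the SQL level. The sketch below uses Python's built-in sqlite3 module with a hypothetical entry table rather than Django itself: the len() style pulls every row into Python just to count it, while the count() style lets the database return a single number via SELECT COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, headline TEXT)")
conn.executemany(
    "INSERT INTO entry (headline) VALUES (?)",
    [(f"headline {i}",) for i in range(1000)],
)

# len()-style: every row crosses the database boundary just to be counted.
rows = conn.execute("SELECT id, headline FROM entry").fetchall()
count_via_len = len(rows)

# count()-style: the database does the counting and returns one row.
count_row = conn.execute("SELECT COUNT(*) FROM entry").fetchone()
count_via_sql = count_row[0]
```

If the rows are needed afterwards anyway, loading them once and calling len() on the cached result avoids a second query.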


list()

Calling list() on a QuerySet forces evaluation; it returns a list of model objects and saves the results in the cache.

# Evaluates the queryset and saves results in cache.
queryset = Entry.objects.all()
lst = list(queryset)
# Using cache from previous list() evaluation.
for each in queryset:
    print(each.headline)

If Statement

Testing a QuerySet in an if statement causes the query to be executed and the results to be saved in the cache. If there is at least one result, the QuerySet is True, otherwise False. For example:

# The `if` statement evaluates the queryset and saves results in cache.
queryset = Entry.objects.all()
if queryset:
    # Using cache from previous if statement evaluation.
    for each in queryset:
        print(each.headline)

Related Model Attributes Are Not Cached

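When Django evaluates a QuerySet, related objects reached through forward or backward relationships are not included in the query, and thus not in the QuerySet's cache, unless you use select_related or prefetch_related. Without them, the first access to a related object on each instance triggers its own query.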

queryset = Entry.objects.all()
for each in queryset:
    print(each.headline)
    # Not fetched with the entries: one extra query per entry for its blog.
    print(each.blog.name)

for each in queryset:
    # Uses the QuerySet cache.
    print(each.headline)
    # No further query: the blog loaded above is cached on each Entry
    # instance. A freshly evaluated QuerySet would query per entry again.
    print(each.blog.name)

# Use select_related or prefetch_related to load related objects up front.
queryset = Entry.objects.select_related('blog')
for each in queryset:
    print(each.headline)
    # No extra query: the blog came back with the entry in the same query.
    print(each.blog.name)

for each in queryset:
    # Uses the cache.
    print(each.headline)
    # Uses the cache.
    print(each.blog.name)
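
The cost of leaving related objects out of the query can be sketched at the SQL level. The example below uses Python's built-in sqlite3 module with hypothetical blog and entry tables, and a small run() helper invented here to record each statement. It compares one lookup per entry with a single JOIN, which is the idea behind select_related.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT);
    INSERT INTO blog (id, name) VALUES (1, 'Food'), (2, 'Travel');
    INSERT INTO entry (blog_id, headline) VALUES
        (1, 'Street food'), (1, 'Home cooking'), (2, 'Night trains');
""")

queries = []

def run(sql, params=()):
    queries.append(sql)               # record every statement that is sent
    return conn.execute(sql, params).fetchall()

# Without a join: one query for the entries, then one more per entry.
entries = run("SELECT id, blog_id, headline FROM entry ORDER BY id")
lazy_pairs = [
    (headline, run("SELECT name FROM blog WHERE id = ?", (blog_id,))[0][0])
    for _id, blog_id, headline in entries
]
lazy_query_count = len(queries)       # 1 + number of entries

# With a join: entries and their blogs arrive in a single query.
queries.clear()
joined_pairs = run(
    "SELECT entry.headline, blog.name FROM entry "
    "JOIN blog ON blog.id = entry.blog_id ORDER BY entry.id"
)
joined_query_count = len(queries)
```

Both approaches produce the same pairs; only the number of round trips differs, and that difference grows with the number of entries.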
