Статьи о Джанго

select_related и prefetch_related в Django

In Djangoselect_related and prefetch_related are designed to stop the deluge of database queries that are caused by accessing related objects.

I basically tried to figure out how and how many queries it reduces and, in this article, I will describe my findings.

You can find the source code on GitHub.

Throughout the article, we will be using the following models:

To test our function, we need to insert data into our models.

I wrote a management command for this reason which inserts five publishers, 100 books (20 books for each publisher) and 10 stores (10 books in each store). Just run python manage.py load_items.

I wrote a decorator to measure the execution time and the number of queries executed in a function.


select_related

We use select_related when the object that you're going to select is a single object, which means forward ForeignKey, OneToOne and backward OneToOne.

select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query.

Let’s dig into it with an example.

After running this function, the output shows:

Function :  book_list
Number of Queries : 101
Finished in : 0.08s

One query for populating all the books and, while iterating each time, we access the foreign key publisher that another separate query executes.

Let’s modify the query with select_related as follows and see what happens.

After running this function, the output shows:

Function :  book_list_select_related
Number of Queries : 1
Finished in : 0.02s

Isn’t it amazing? This query reduced 101 to 1. This is what select_related does.


prefetch_related

We use prefetch_related when we’re going to get a set of things.

That means forward ManyToMany and backward ManyToMany, ForeignKeyprefetch_related does a separate lookup for each relationship, and performs the “joining” in Python.

It is different from select_related, the prefetch_related made the JOIN using Python rather than in the database.

Let’s dig into it with an example.

After running this function, the output shows:

Function :  store_list
Number of Queries : 11
Finished in : 0.02s

We have 10 stores in the database and each store has 10 books. What’s happening here is one query for fetching all the stores and, while iterating through each store, another query is executing when we access the ManyToMany field books.

Let’s reduce the number of queries using prefetch_related.

After running this function, the output shows:

Function : store_list_prefetch_related
Number of Queries : 2
Finished in : 0.01s

Query performance improved here, 11 to 2 queries. I want you to understand what prefetch_related is doing here.

Let’s take another example for prefetch_related.

In the management command code, I set the book price randomly from 50 to 300. Now, we will find expensive books (price 250 to 300) in every store.

After running this function, the output shows:

Function :  store_list_expensive_books_prefetch_related
Number of Queries : 12
Finished in : 0.05s

Despite the fact that we are using prefetch_related, our queries increased rather than decreased. But why?

Using prefetch related, we are telling Django to give all the results to be JOINED, but when we use the filter(price__range=(250, 300)), we are changing the primary query and then Django doesn’t JOIN the right results for us.

This is the reason why we have 12 queries, 11 queries iterating over the stores and one query to get all the results in prefetch.

Let’s solve the problem with Prefetch.

After running this function, the output shows:

Function :  store_list_expensive_books_prefetch_related_efficient
Number of Queries : 2
Finished in : 0.03s

Mission successful! Two queries, not 12.

Перевод https://medium.com/better-programming/django-select-related-and-prefetch-related-f23043fd635d

Поделитесь с другими: