Finding performance bottlenecks in a Django project

Intro

When optimizing the performance of a web application, a common mistake is to start with the slowest page (or API). Response time alone does not determine the optimization order; we also need to consider how much traffic each page receives.

In this article, we'll walk through profiling a Django web application, find its performance bottlenecks, and decide which of them to optimize first. See also a tip on how to find out what SQL queries your Django project is running and how long they take.

Profiling

django-silk is an open source profiling tool that captures and stores HTTP request data. Install it with pip:

pip install django-silk

Add silk to the installed apps and its middleware to the middleware list in the project settings:

MIDDLEWARE = [
    ...
    'silk.middleware.SilkyMiddleware',
    ...
]

INSTALLED_APPS = (
    ...
    'silk',
)

Run the migrations so that Silk can create the database tables it needs to store the profiling data, then collect its static files:

$ python manage.py makemigrations
$ python manage.py migrate
$ python manage.py collectstatic

On the Silk requests page (http://host/silk/requests/) we can see all recorded requests and sort them by total time or by time spent in the database.
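The requests page is only reachable once Silk's URLs are included in the project's URL configuration, a step the settings above do not cover. A minimal sketch, assuming a standard project-level urls.py and the silk/ prefix used in the address above:

```python
# urls.py (project-level URL configuration; the file location is an assumption)
from django.urls import include, path

urlpatterns = [
    # ... the project's existing routes go here ...
    # Mount Silk's views under /silk/ so that /silk/requests/ resolves.
    path("silk/", include("silk.urls", namespace="silk")),
]
```

Since the profiling pages expose request data, it is probably wise to enable this route only in environments where that is acceptable.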

Bottlenecks

Silk creates a silk_request table that contains information about requests processed by Django.

$ pgcli

library> \d silk_request;

+--------------------+--------------------------------+-------------+
| column             | type                           | Modifiers   |
|--------------------+--------------------------------+-------------|
| id                 | character varying(36)          | not null    |
| path               | character varying(190)         | not null    |
| time_taken         | double precision               | not null    |
...

We can group this request data by path and calculate, for each path, the number of requests, the average response time, and an impact score. Since we are considering both response time and traffic, the impact of a path is the product of its average response time and its number of requests. The query below divides that product by the largest product across all paths, so the path with the most impact scores 1.00.

library> SELECT
      s.*,
      round((s.avg_time * s.count) / (max(s.avg_time * s.count) OVER ())::numeric, 2) AS impact
  FROM
      (SELECT path, round(avg(time_taken)::numeric, 2) AS avg_time, count(path) AS count
       FROM silk_request
       GROUP BY path) s
  ORDER BY impact DESC;

+-------------------------+------------+---------+----------+
| path                    | avg_time   | count   | impact   |
|-------------------------+------------+---------+----------|
| /point/book/book/       | 239.90     | 1400    | 1.00     |
| /point/book/data/       | 94.81      | 1900    | 0.54     |
| /point/                 | 152.49     | 900     | 0.41     |
| /point/login/           | 307.03     | 400     | 0.37     |
| /                       | 106.51     | 1000    | 0.32     |
| /point/auth/user/       | 494.11     | 200     | 0.29     |
...

We can see that /point/book/book/ has the most impact, although it is neither the most visited path nor the slowest view. Optimizing this view first will therefore do the most to improve the overall performance of the web application.
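The same ranking can be sketched in plain Python, which may help when the request data is already in memory or the arithmetic needs checking by hand. This is an illustration only: the function name impact_by_path and the (path, time_taken) input shape are assumptions, not part of Silk.

```python
from collections import defaultdict


def impact_by_path(requests):
    """Rank paths by impact, given an iterable of (path, time_taken) pairs.

    Hypothetical helper: returns (path, avg_time, count, impact) tuples,
    where impact is avg_time * count scaled so the largest value is 1.0.
    """
    totals = defaultdict(lambda: [0.0, 0])  # path -> [summed time, request count]
    for path, time_taken in requests:
        totals[path][0] += time_taken
        totals[path][1] += 1

    rows = []
    for path, (total, count) in totals.items():
        avg_time = round(total / count, 2)
        rows.append((path, avg_time, count, avg_time * count))

    # Largest raw product; falls back to 0.0 when there are no requests.
    top = max((raw for _, _, _, raw in rows), default=0.0)
    if top == 0:
        # Nothing measurable to rank, so every path gets an impact of 0.0.
        return [(path, avg, count, 0.0) for path, avg, count, _ in rows]

    ranked = [(path, avg, count, round(raw / top, 2)) for path, avg, count, raw in rows]
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked
```

As in the query, a rarely hit but slow path and a fast but heavily hit path can end up with similar scores, which is the point of weighting response time by traffic.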

Conclusion

In this article, we learned how easy it is to start profiling a Django web application and identify performance bottlenecks. In the next article, we will learn how to optimize these bottlenecks by looking at them in detail.
