Django: avoid multiple DB queries for recursive model
I have following models:
class Topic(models.Model): ... class Article(models.Model): ... class ArticleInTopic(models.Model): topic = models.ForeignKey(Topic, on_delete=models.PROTECT) article = models.ForeignKey(Article, on_delete=models.PROTECT) depends_on = models.ForeignKey(ArticleInTopic, on_delete=models.PROTECT) class Meta: unique_together = ('topic', 'article', 'depends_on')
With this models set I'm trying to express the following situation: there are some studying topics, each consists of multiple articles. However, in order to learn a topic one should read the articles related to the topic but not in any order, rather in order defined by article dependencies. This means that in context of a topic an article may have another article which is required to be read before reading this article. It is guaranteed that the article on which the current article depends on comes from the same topic.
So, basically, this whole structure looks like an acyclic (it is guaranteed) graph with parent-child nodes relation. In the business logic of my app I'm going to sort the graph topologically so I can tell the user which Article to read 1st, 2nd, etc.
However, the problem is the way Django fetches ForeignKey's data from the database. AFAIK it uses N+1 request to fetch all foreign keys. There is a solution to it - using
select_related but this only works when you can specify exact fields which need to be queried in the same request as the main info. In my case this is not possible, because I do not know in advance how many children each node has, so I cannot list them all in
Instead I was thinking about fetching all
ArticleInTopic objects which have the same
topic foreign key (topic name comes from user, so I know it by the time I need to show him Articles) and then topologically sort them in memory. But I am not sure whether Django will understand that it has fetched all the required object already when I will access one of objects
For example, I fetch 2
ArticleInTopic objects for topic 'Cars', lets say those objects are A and B. B depends on A. Django has queried them already and they are in memory. Now, what happens if I do
B.depends_on? Will Django make another request to the DB in order to select B? Or is it smart enough to understand that B has already been fetched by the previous request? If it is not, is there any way to prevent extra DB queries?