Django imported database -- am I doing it wrong? Is Django not fit to purpose here?
So perhaps I'm missing something conceptually, or simply doing it wrong. Or, worst case scenario, "that's not what Django is for" :-/
I have set up my Django instance to fully take advantage of an external database which contains, amongst other things, all of my iTunes (no, not Music) data, with an extensive changelog. That's tangential to my question but I wanted to throw it out to show that I'm not a complete beginner (I even have it set so Django doesn't mess w/the external read-only database, which is a bit of a feat).
What I'm trying to do is import this into Django models in my actual Django database, to take advantage of the Django admin, preparatory to creating a web app around all of this.
I've set up the Django models, and created some (SQLAlchemy) routines to ETL my data into three tables (Songs, Artists, Albums). I have a lot of data (783122 songs from 59445 albums by 97562 artists), but that shouldn't overload a robust framework like Django, I wouldn't think.
So the data's imported & I've set up a basic admin which should simply show the songs inline when I go to the admin page for an artist or album. Attempting to do so throws no errors, but results in endless spinning & no access to the album or artist page.
Initially I thought, well, all those natural keys I'm using are finally biting me on the ass, so I jumped through a bunch of hoops to translate every natural key to a simple bigint.
This hasn't helped. I'm watching the wheel spin 30 minutes later as I type.
One thought I had was, well, since a huge amount of data just got imported, it may take a while to index all those PKs. I suppose this is possible and I'm keeping an eye on it, but already this seems like an excessive amount of time for a setup where the biggest table is <1M rows.
Any thoughts? Would be interested in hearing from folks who have some experience in this area before I just dump my code. Should I just try Flask? I'm at a bit of a loss as to how to even start debugging this issue.
This is a known issue with Django. If you want to edit the details of a ItunesAlbum
, then it will thus fetch all the related ItunesSong
, which is done with one query. So at first sight it does not look as a very problematic thing.
But these models here have ForeignKey
s, and a ForeignKey
is, by default, resolved by a ModelChoiceField
form field [Django-doc]. This thus implies that it will fetch all models of the related field, because if you open the drop down, it thus has to list all options.
This means that for each foreign key, and for each inline object, it makes a query. So for an album with 10 songs, it will at least fetch all artists three times. Which of course generates a lot of queries.
The good news is, that we can, mitigate this problem. By preselecting the ForeignKey
s for the inline model, and by using an autocomplete field that prevents fetching all artists, so:
class AlbumArtistSongInline(admin.TabularInline):
model = ItunesSong
autocomplete_fields = ['artistid', 'expressedartistid', 'albumartistid']
def get_queryset(self, *args, **kwargs):
return (
super()
.get_queryset(*args, **kwargs)
.select_related('artistid', 'expressedartistid', 'albumartistid')
)