Django ORM generating insane subqueries with prefetch_related - 3 second page loads
I'm losing my mind here. Been working on this e-commerce site for months and suddenly the product catalog page takes 3+ seconds to load. The Django ORM is generating absolutely bonkers SQL with nested subqueries everywhere instead of simple JOINs.
Here's my setup (simplified):
class ProductManager(models.Manager):
def active(self):
return self.filter(is_active=True, category__is_active=True)
def with_reviews_stats(self):
return self.annotate(
avg_rating=Avg('reviews__rating'),
reviews_count=Count('reviews', distinct=True),
recent_reviews_count=Count(
'reviews',
filter=Q(reviews__created_at__gte=timezone.now() - timedelta(days=30)),
distinct=True
)
)
class Product(models.Model):
name = models.CharField(max_length=200)
category = models.ForeignKey(Category, on_delete=models.CASCADE)
price = models.DecimalField(max_digits=10, decimal_places=2)
is_active = models.BooleanField(default=True)
created_at = models.DateTimeField(auto_now_add=True)
objects = ProductManager()
# ... other models (Category, Review, CartItem, Wishlist)
And here's the view that's killing me:
def product_catalog_view(request):
products = Product.objects.active().with_reviews_stats()
# Category filtering
if category_id := request.GET.get('category'):
category = get_object_or_404(Category, id=category_id, is_active=True)
subcategories = Category.objects.filter(
Q(parent=category) | Q(parent__parent=category) | Q(id=category.id)
).values_list('id', flat=True)
products = products.filter(category_id__in=subcategories)
# Sort by popularity
products = products.annotate(
popularity_score=F('reviews_count') * F('avg_rating')
).order_by('-popularity_score', '-created_at')
# Try to optimize (spoiler: doesn't work)
products = products.select_related('category').prefetch_related(
'reviews__user',
Prefetch(
'reviews',
queryset=Review.objects.filter(is_approved=True).select_related('user'),
to_attr='approved_reviews'
)
)
# Add user-specific data (this is where it gets ugly)
if request.user.is_authenticated:
user_cart_items = CartItem.objects.filter(user=request.user).values_list('product_id', flat=True)
user_wishlist_items = Wishlist.objects.filter(user=request.user).values_list('product_id', flat=True)
products = products.annotate(
in_cart=Case(
When(id__in=user_cart_items, then=Value(True)),
default=Value(False),
output_field=BooleanField()
),
in_wishlist=Case(
When(id__in=user_wishlist_items, then=Value(True)),
default=Value(False),
output_field=BooleanField()
),
cart_quantity=Subquery(
CartItem.objects.filter(
user=request.user, product=OuterRef('pk')
).values('quantity')[:1]
)
)
paginator = Paginator(products, 20)
page = paginator.get_page(request.GET.get('page', 1))
return render(request, 'catalog.html', {'products': page})
The generated SQL is absolutely mental - it's got like 5 different subqueries for the user cart/wishlist stuff instead of just doing LEFT JOINs. I ran EXPLAIN ANALYZE and PostgreSQL is spending 90% of the time on these subqueries.
What I've tried so far:
Added indexes on everything (helped a tiny bit) Tried only() and defer() - no difference
Rewrote parts in raw SQL - works 10x faster but defeats the purpose
Split into multiple queries - hello N+1 problem
Googled for hours, found nothing useful
The weird thing is if I remove the user-specific annotations (in_cart, in_wishlist, cart_quantity), everything runs fast. But I need that data for the frontend.
My questions:
Why is Django generating subqueries instead of JOINs for the Case/When stuff?
Is there a way to force Django to use JOINs here?
Am I doing something fundamentally wrong with how I'm structuring this query?
Should I just give up and use raw SQL for this
Environment: Django 4.2, PostgreSQL 14 ~50k products, ~200k reviews This used to work fine with smaller datasets
I'm seriously considering switching to raw SQL for this view but I really don't want to lose all the ORM benefits. Has anyone dealt with similar performance issues? Any ideas would be super appreciated!
EDIT: Just tried removing the custom manager methods and the problem persists, so it's definitely the user-specific annotations causing the issue.
EDIT 2: For anyone asking, yes I have proper indexes:
CREATE INDEX idx_product_category_active ON products_product(category_id, is_active);
CREATE INDEX idx_review_product_approved ON products_review(product_id, is_approved);
CREATE INDEX idx_cartitem_user_product ON products_cartitem(user_id, product_id);
-- etc.
Still slow as hell.