Django queryset stops working after multiple uses and filters

I am currently working on a task which requires me to divide my model data into multiple parts based on conditions and then process them differently. I am able to get data from my model correctly and check all available details that they exists. But after applying certain number of filters, suddenly the queryset stops working and don't give the correct results. I don't know why this is happening, can someone explain? Here is part of my code that is causing issue:

def test_selection():
    po_numbers = (
        Model1.objects
        .filter(schedule__isnull=True)
        .filter(order_hold_type__isnull=True)
        .filter(so_line_status='Awaiting Supply')
        .values_list("customer_po", flat=True)
        .distinct()
    )
    rows = Model1.objects.filter(customer_po__in=po_numbers).annotate(
            total_qty=Sum(F('line_item_quantity')*F('cup_count'))
        ).all()
    
    old_sizes = Sizes.get_old_sizes()
    old_po = rows.filter(
        item_size__in=old_sizes).values_list('customer_po', flat=True).distinct()
    rows = rows.filter(~Q(customer_po__in=old_po)).all()

    partially_scheduled = rows.filter(
        customer_po__in=list(rows.filter(schedule__isnull=False).values_list(
            'customer_po', flat=True).distinct())
    ).all()

    unscheduled = rows.filter(
        ~Q(customer_po__in=list(partially_scheduled.values_list(
            'customer_po', flat=True).distinct().all()))
    ).all()

    model2_po = Model2.objects.values_list('customer_po', flat=True).distinct()
    g1 = unscheduled.exclude(customer_po__in=model2_po)
    po_list = g1.values_list('customer_po', flat=True).distinct()
    print(rows.count(), po_list)

I thought the problem is in my flow so I created a separate function to test this and this is the point after which the problem starts occurring. This is not throwing any error but coming up empty and I have tried searching for anyone having similar issue but cannot find anything. I have tried following things uptil now:

  • Use filter(~Q()) instead of exclude to filter data
  • Use all() so it creates copy of the queryset instead of working on old one
  • Making all items to filter as a list instead of directly using querysets

But I am still not able to find what might be happening wrong over here. In the last 3 lines of the code, I am getting g1.count() as 546 lines of data but the PO is coming up as empty. But if I use unscheduled directly AND remove g1 line then it shows 43 values in list.

    g1 = unscheduled.exclude(customer_po__in=model2_po)
    po_list = unscheduled.exclude(customer_po__in=model2_po).values_list('customer_po', flat=True).distinct()

This gives g1.count() = 546, po_list = empty.

    # g1 = unscheduled.exclude(customer_po__in=model2_po) #commented code
    po_list = unscheduled.exclude(customer_po__in=model2_po).values_list('customer_po', flat=True).distinct()

This gives po_list = queryset of 43 items list.

Avoid calling .all() unnecessarily.

Optimize and validate intermediate results.

Be cautious with list conversions from querysets "list(rows.filter(...).values_list(...)) can be problematic if rows has a large number of records because converting a queryset to a list evaluates it immediately. It’s better to use queryset chaining directly where possible."

Print or log intermediate results to validate correctness. For example, check the result of each filter step to ensure they are returning what you expect.

Here’s a modified version of your code with improvements and debugging steps:

def test_selection():

po_numbers = (
    Model1.objects
    .filter(schedule__isnull=True, order_hold_type__isnull=True, so_line_status='Awaiting Supply')
    .values_list("customer_po", flat=True)
    .distinct()
)
rows = Model1.objects.filter(customer_po__in=po_numbers).annotate(
    total_qty=Sum(F('line_item_quantity') * F('cup_count'))
)

old_sizes = Sizes.get_old_sizes()
old_po = rows.filter(item_size__in=old_sizes).values_list('customer_po', flat=True).distinct()
rows = rows.exclude(customer_po__in=old_po)

partially_scheduled = rows.filter(schedule__isnull=False).values_list('customer_po', flat=True).distinct()
unscheduled = rows.exclude(customer_po__in=partially_scheduled)

model2_po = Model2.objects.values_list('customer_po', flat=True).distinct()
g1 = unscheduled.exclude(customer_po__in=model2_po)

po_list = g1.values_list('customer_po', flat=True).distinct()

print(rows.count(), len(po_list))
Back to Top