Django ORM filtering using dates and data on separate one to many rows with the same related_name

These three models track manufacturing processes through a mix of user defined statuses, and three predefined statuses, which are 'Initialized', 'Scrapped', and 'Complete'. Every processes first status is always "Initialized" and their final status is always either "Scrapped", or "Complete". All statuses in between are user defined and don't matter for this query.

models.py 
#simplified for clarity

class Process(models.Model):
    name = models.CharField(max_length=50)

class Status(models.Model):
    name = models.CharField(max_length=50)

class ProcessStatusHistory(models.Model):
    timestamp = models.DateTimeField()
    prev_status = models.ForeignKey(Status,related_name="prev_status", on_delete=models.CASCADE)
    new_status = models.ForeignKey(Status,related_name="new_status", on_delete=models.CASCADE)
    process = models.ForeignKey(PatientVisit,related_name="of_process", on_delete=models.CASCADE)

The user can create a report with a date range picker. What is the cleanest way to ask the DB for the following:

All ProcessStatusHistory rows (even outside of the date range) for processes that were: not 'Initialized' after the later date in the date range picker (case: Process not yet started for date range) AND were not yet 'Scrapped' OR 'Complete' before the earlier date in the date range picker (case: Process already finished for date range). In essence, I am aiming to get all ProcessStatusHistory information for all Processes that were 'Open' at any point in the date range picker.

My attempt so far, which works inefficiently, but does work is:

  1. Make a list of PKs that were started before the later_date
  2. Create lists of process pk's that were scrapped or completed before the earlier_date
  3. Removing the pks of the scrapped and completed processes from the first list
  4. Using the final filtered list to get the relevant ProccessStatusHistory rows
initialized_before = Process.objects
  .filter(of_process__timestamp__lte=later_date, of_process__prev_status__name="Initialized")
  .values_list('pk', flat=True)

scrapped_before = Process.objects
  .filter(of_process__timestamp__lte=earlier_date,of_process__new_status__name="Scrapped")
  .values_list('pk', flat=True)
completed_before = Process.objects
  .filter(of_process__timestamp__lte=earlier_date,of_process__new_status__name="Complete")
  .values_list('pk', flat=True)

filtered_process_ids = initialized_before.difference(scrapped_before)
filtered_process_ids = filtered_process_ids.difference(completed_before)

filtered_status_histories = ProcessStatusHistory.objects.filter(process__pk__in=filtered_process_ids)

There is likely a way to get a list of these only these Processes with Q(), and then prefetch_related('of_process') at the end, but I am having trouble getting it right. I haven't been able to make.exclude() work the way I want either. I wish I had a way to say: check the most recent ProcessStatusHistory before the early_date to see if the new_status__name == 'Scrapped' OR 'Complete' and if so, do not include that process.

Maybe I need a Serializer? Do my Models need to be configured differently?

Ultimately, this data will be put in a pd dataframe, but I don't think this matters, just including it in case it does.

Thank you so much in advance.

Back to Top