How to denormalize hierarchical Django model data for rendering CSV or Excel with Djange REST Framework?
Assume, we are building an address book application with Django REST Framework and want to render an endpoint that exports all persons. Each person can have one or multiple phone numbers.
Exemplary data could look like this:
[
{
'name': 'Jon Doe',
'phone':
[
{
'type': 'home',
'number': '+1 234 5678'
}
]
},
{
'name': 'Jane Doe',
'phone':
[
{
'type': 'home',
'number': '+2 345 6789'
},
{
'type': 'work',
'number': '+3 456 7890'
}
]
}
]
As we want to export CSV or Excel tables, we want to denormalize the data so that each phone number gets its own line.
A result could look like this:
name,phone.type,phone.number
Jon Doe,home,+1 234 5678
Jane Doe,home,+2 345 6789
Jane Doe,work,+3 456 7890
The question is, where exactly I would do the denormalization. I see two options:
- Write a custom
Serializer
that does the denormalization. On the upside, this would result in a single change that works for everyRenderer
, so I could have the endpoint export CSV and Excel using, e.g.djangorestframework-csv
anddrf-renderer-xlsx
. On the downside, this would interfere with renderers that do not benefit from denormalization like JSON or XML. - Derive each
Renderer
that needs denormalization and override theprocess_data()
method to first denormalize and then invoke the super-class implementation. - Write a custom
View
that does the denormalization based on the negotiated renderer, like described in https://www.django-rest-framework.org/api-guide/renderers/#varying-behavior-by-media-type.
This seems like an issue that many people could have, as tabular data export is a very common feature. Any hints on where I should start or what would be the best alternative?
Assuming you have a model Contact
(replace this for whatever model you have), using Pandas you can return a CSV file from a Django ORM QuerySet
.
import pandas as pd
from .models import Contact
def export_contacts(self, *args, **kwargs):
queryset = Contact.objects.all()
df = pd.DataFrame(list(queryset))
return df.to_csv()
You can add this into a separated module you can call directly from the view, new view for this, or wherever you need it.