How to structure a Django DRF project for handling multiple tests and responses
I am working on a Django project using Django REST Framework (DRF), where I need to define a list of tests. Each test consists of multiple questions, and there are corresponding answers for those questions.
My initial idea is to create an app called tests and use two CSV files: one containing all the questions for the different tests, and another containing all the answers. However, I am concerned that this approach might make the code confusing and difficult to maintain.
Questions: What is the best architecture for implementing this feature in Django DRF? Is storing test questions and answers in CSV files a good approach, or should I use models and a database instead? If there are any existing similar open-source projects on GitHub, could you provide references? I would appreciate any guidance or best practices on how to structure this project effectively.
"Is storing test questions and answers in CSV files a good approach?"
No. Files are a horrible way to store data. It is what people did in the very early days on magnetic tapes.
The main problem is that a file is a stream of data. From the moment you want to use the data for two or more purposes, you usually need a database. A database aims to store, retrieve, and aggregate data in an efficient way. By defining tables, you can query these in a variety of ways. Data usually is not meant for a single purpose. If you want to make data accessible in the sense that you can filter, order, retrieve, count, update, etc. that is where a database kicks in.
Databases also usually come with a lot of tooling to run queries in parallel, and thus to serve multiple processes at the same time. A file, by contrast, is usually written by one process or user and then read by another, and this sequential flow does not hold up under concurrent access.
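As a rough illustration of the point above, here is a minimal sketch that uses Python's standard-library sqlite3 module instead of Django, so that it runs on its own. The table layout and the sample rows are made up for the example. The only point is that one stored table can be filtered, ordered, and counted without writing a new file-parsing routine for each purpose.

```python
import sqlite3

# In-memory database; the schema and sample rows are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question ("
    "id INTEGER PRIMARY KEY, test_id INTEGER, position INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO question (test_id, position, text) VALUES (?, ?, ?)",
    [
        (1, 2, "What is 2 + 2?"),
        (1, 1, "What is 1 + 1?"),
        (2, 1, "Capital of France?"),
    ],
)

# Purpose 1: filter and order the questions of a single test.
test_one = [
    row[0]
    for row in conn.execute(
        "SELECT text FROM question WHERE test_id = ? ORDER BY position", (1,)
    )
]

# Purpose 2: count the questions of every test.
counts = dict(
    conn.execute(
        "SELECT test_id, COUNT(*) FROM question GROUP BY test_id ORDER BY test_id"
    )
)

print(test_one)
print(counts)
```

With a CSV file, each of these two uses would need its own loop over the whole file; with a table, they are two queries against the same data.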
"... an app called tests and use two CSV files."
You can still import CSV files: a small process performs some "gymnastics" on the CSV data and turns it into records on the database side.
Using models and a database is the better approach for maintainability: instead of managing CSV files by hand, you can keep the initial data in a JSON or CSV file (very similar to what you already have) and load it with a custom Django management command.
Here is an example, load_questions.py:
import csv

from django.core.management.base import BaseCommand
from your_app.models import Question


class Command(BaseCommand):
    help = "Load questions from a CSV file"

    def handle(self, *args, **kwargs):
        try:
            # newline="" is what the csv module recommends for files it reads
            with open("questions.csv", "r", newline="") as file:
                reader = csv.DictReader(file)
                for row in reader:
                    # Look the question up by text; create it only if missing
                    Question.objects.get_or_create(
                        text=row["text"],
                        defaults={"test_id": row["test_id"]},
                    )
            self.stdout.write(self.style.SUCCESS("Questions loaded successfully"))
        except Exception as e:
            self.stderr.write(self.style.ERROR(f"Error: {e}"))
This avoids duplicate questions by using Django's get_or_create.
You can extend it to handle answers similarly.
If you have a lot of data and need to speed up the process, you can consider bulk_create too.
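A minimal sketch of what the bulk_create variant could look like is shown below. The helper names new_question_rows and bulk_load_questions are hypothetical, as are the CSV columns text and test_id (taken from the command above). The model class is passed in as an argument only so that the row-filtering part can be tried without a Django project. Since bulk_create does not check for existing rows the way get_or_create does, the sketch filters duplicates in Python first.

```python
import csv
import io


def new_question_rows(csv_file, existing_texts):
    """Return the CSV rows whose text is not already known.

    Skips texts that are already stored (existing_texts) as well as
    repeats inside the file, mirroring get_or_create(text=...).
    """
    seen = set(existing_texts)
    rows = []
    for row in csv.DictReader(csv_file):
        text = row["text"]
        if text in seen:
            continue
        seen.add(text)
        rows.append({"text": text, "test_id": int(row["test_id"])})
    return rows


def bulk_load_questions(csv_file, question_model, batch_size=500):
    """Insert all new questions with bulk_create.

    question_model is assumed to be the Django model class (e.g. Question).
    """
    existing = question_model.objects.values_list("text", flat=True)
    rows = new_question_rows(csv_file, existing)
    question_model.objects.bulk_create(
        [question_model(**row) for row in rows],
        batch_size=batch_size,
    )
    return len(rows)


# Trying the filtering step on an in-memory CSV (sample data is made up).
sample = io.StringIO(
    "text,test_id\n"
    "What is 2 + 2?,1\n"
    "What is 2 + 2?,1\n"
    "Capital of France?,2\n"
)
rows = new_question_rows(sample, existing_texts={"Capital of France?"})
print(rows)
```

Inside the management command you would call bulk_load_questions(file, Question). Keep in mind that bulk_create does not call each model's save() method and does not send the pre_save and post_save signals.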
PS: I suggest finding a better name for your app, since "test" is commonly used for files and folders related to code testing.
If you want to develop an application that is scalable, usable, and maintainable, especially in Django and Django REST Framework, you will probably want to use models and a database.
For instance, in your project you have tests, multiple questions, and corresponding answers. There are relationships among these entities, and using CSV files would make things complicated. You will probably also want to perform complex queries that involve these relationships. Models and a database are your best bet.
The Architecture
For the architecture, I would advise that you first map out the attributes of each of your entities, meaning the tests, questions, and answers. Then work out their relationships and use them to define the corresponding models for your database. I will give you some examples below.
In the code below, Test and Question are connected with a foreign key because, as you mentioned, a test can have many questions. The same applies to Question and Answer: a question can have many answers.
Also, notice that the child model always holds the foreign key.
Once you understand this, you are good to go.
You are also free to add or remove fields to suit your project requirements.
from django.db import models


class Test(models.Model):
    name = models.CharField(max_length=255)
    description = models.TextField(blank=True, null=True)
    created_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.name


class Question(models.Model):
    test = models.ForeignKey(Test, related_name='questions', on_delete=models.CASCADE)
    text = models.TextField()
    order = models.IntegerField(default=0)  # To order questions within a test

    def __str__(self):
        return self.text


class Answer(models.Model):
    question = models.ForeignKey(Question, related_name='answers', on_delete=models.CASCADE)
    text = models.TextField()
    is_correct = models.BooleanField(default=False)  # To mark the correct answer

    def __str__(self):
        return self.text
Alternatively, if you already have your data in CSV files and would like to import it into the models defined above, you can do that with a Django management command.
You just need to map your rows and columns to the corresponding fields and field values of your models.
Based on the models above, here is an example of how you can achieve that:
# management/commands/import_tests.py
import csv

from django.core.management.base import BaseCommand
from tests.models import Test, Question, Answer


class Command(BaseCommand):
    help = 'Import tests, questions, and answers from CSV files'

    def handle(self, *args, **kwargs):
        # Assumed layout: questions.csv has a test_name column and answers.csv
        # has a question_text column, and both values are unique, so each row
        # can be linked to its own parent. Adjust the names to your files.
        with open('tests.csv', 'r', newline='') as test_file:
            for row in csv.DictReader(test_file):
                Test.objects.get_or_create(
                    name=row['name'],
                    defaults={'description': row['description']},
                )
        with open('questions.csv', 'r', newline='') as question_file:
            for q_row in csv.DictReader(question_file):
                test = Test.objects.get(name=q_row['test_name'])
                Question.objects.get_or_create(
                    test=test, text=q_row['text'],
                    defaults={'order': int(q_row['order'])},
                )
        with open('answers.csv', 'r', newline='') as answer_file:
            for a_row in csv.DictReader(answer_file):
                question = Question.objects.get(text=a_row['question_text'])
                # CSV values are strings, so convert the flag explicitly
                is_correct = a_row['is_correct'].strip().lower() in ('1', 'true', 'yes')
                Answer.objects.get_or_create(
                    question=question, text=a_row['text'],
                    defaults={'is_correct': is_correct},
                )
Of course, this is a brute-force approach, and you will probably want to optimise it if your CSV files are very large. You can then use bulk_create and wrap the import in transaction.atomic() so that the load either succeeds as a whole or leaves the database unchanged.
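To make the atomicity point concrete, here is a small sketch of the all-or-nothing behaviour. It deliberately uses Python's standard-library sqlite3 module rather than Django so that it can run without a project; in a Django command the equivalent wrapper would be transaction.atomic() from django.db. The table, the sample rows, and the helper name load_atomically are assumptions for the example.

```python
import sqlite3

# In-memory database with a made-up answer table for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answer ("
    "id INTEGER PRIMARY KEY, question_id INTEGER NOT NULL, text TEXT NOT NULL)"
)

good_rows = [(1, "4"), (1, "5")]
bad_rows = [(2, "Paris"), (None, "London")]  # second row violates NOT NULL


def load_atomically(conn, rows):
    """Insert all rows or none of them.

    `with conn` commits if the block succeeds and rolls back if it raises.
    """
    try:
        with conn:
            conn.executemany(
                "INSERT INTO answer (question_id, text) VALUES (?, ?)", rows
            )
    except sqlite3.IntegrityError:
        return False
    return True


ok_first = load_atomically(conn, good_rows)
ok_second = load_atomically(conn, bad_rows)
total = conn.execute("SELECT COUNT(*) FROM answer").fetchone()[0]
print(ok_first, ok_second, total)
```

The second batch fails on its second row, and the first row of that batch is rolled back with it, so only the two rows from the valid batch remain in the table.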