Correct way to store Python function definitions (names) in a database?
Context - Skip to "Crux" for tl;dr:
I'm building a report automation system that includes a handful of independent "worker" daemons, each with its own APScheduler instance, one central "control panel" web application in Django, and ZMQ to manage communication between Django and the workers.
The workers' tasks involve querying a database, compiling reports, and saving and/or distributing exported files of those reports. Everything is running on a single server, but each worker is "assigned" to its own business unit with its own data, users, and tasks. The intent is that users will be able to use the web app to manage the scheduled jobs of their assigned worker. I'm aware of APScheduler's issue (at least in v3.x) with sharing jobstores, so instead of having Django modify the jobstores directly I'm planning on using ZMQ to send a JSON message containing instructions, which the worker will parse and execute itself.
Crux:
For the web user (assumed to have zero programming proficiency) to be able to add new scheduled jobs to the worker, the web app needs to provide a list of "possible tasks" that the worker can execute. Since the tasks are (primarily) reports to be produced, using the names of the reports is simplest for the user. So in the Django web page, the user will click an "Add New Job" button, which will open a form where they will select a report from a dropdown of all "possible tasks" and enter the job trigger information.
To make it functional on the worker side, I need to associate the string "report_name" with the worker function that executes it. There are potentially hundreds of different tasks (i.e. reports) that could be scheduled as jobs, but many can be handled by the same function given different arguments. For the example below, suppose there are 4 possible reports produced by two worker functions, like so:
func_1(A) -> report1A
func_1(B) -> report1B
func_2(A) -> report2A
func_2(B) -> report2B
So when the web user selects "Add Report1A daily at 8am", Django sends this (simplified mock) JSON message via ZMQ to the worker:
{"CMD" : "add_new_job", "TASK" : "report1A", "FREQ" : "daily0800"}
I want the worker to perform the following:
self.scheduler.add_job(func=func_1,trigger=<crontrigger>,args=['A'])
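A minimal sketch of how the worker might turn that message into the arguments for `add_job`, using only the standard library. The names `TASKS`, `parse_freq`, and `build_job_kwargs` are hypothetical, `func_1`/`func_2` are stubs, and the "dailyHHMM" format is just the mock value from the message above. It assumes APScheduler 3.x, where `add_job` accepts the `'cron'` trigger alias plus trigger fields such as `hour` and `minute` as keyword arguments.

```python
import json

def func_1(variant):
    # Stub standing in for the real report function.
    return f"report1{variant}"

def func_2(variant):
    # Stub standing in for the real report function.
    return f"report2{variant}"

# Hypothetical mapping from report name to (function, positional args).
TASKS = {
    "report1A": (func_1, ["A"]),
    "report1B": (func_1, ["B"]),
    "report2A": (func_2, ["A"]),
    "report2B": (func_2, ["B"]),
}

def parse_freq(freq):
    """Turn the mock 'dailyHHMM' string into cron trigger fields."""
    if not (freq.startswith("daily") and len(freq) == 9 and freq[5:].isdigit()):
        raise ValueError(f"unsupported FREQ: {freq!r}")
    hour, minute = int(freq[5:7]), int(freq[7:9])
    if hour > 23 or minute > 59:
        raise ValueError(f"invalid time in FREQ: {freq!r}")
    return {"hour": hour, "minute": minute}

def build_job_kwargs(raw_message):
    """Parse one JSON command and return keyword arguments for add_job."""
    msg = json.loads(raw_message)
    if msg["CMD"] != "add_new_job":
        raise ValueError(f"unexpected CMD: {msg['CMD']!r}")
    func, args = TASKS[msg["TASK"]]  # KeyError for an unknown report
    return {"func": func, "trigger": "cron", "args": args, **parse_freq(msg["FREQ"])}

job_kwargs = build_job_kwargs(
    '{"CMD": "add_new_job", "TASK": "report1A", "FREQ": "daily0800"}'
)
# In the worker this would become: self.scheduler.add_job(**job_kwargs)
```

Keeping the parsing separate from the scheduler call means the message handling can be checked without a running scheduler.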
MY OPTIONS:
Now I know the simplest way of doing this would be a dict written into the worker, something like:
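def task_lookup(report):
    # Map each report name to the worker function and the argument that produce it.
    task_dict = {
        'report1A' : [func_1, 'A'],
        'report2B' : [func_2, 'B']
    }
    func, arg = task_dict[report]  # raises KeyError for an unknown report
    return func(arg)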
The reason I'm leaning away from that is because I need to provide the list of possible tasks to Django in the first place, for the user to select from. If I code this "task->func+args" mapping directly into the worker code, I'll have to add an additional step for Django to request the list of dict keys from the worker just to load the "Add New Job" form in the web app. Which isn't necessarily bad, but it feels like an unnecessary communication step.
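For reference, the extra communication step would look roughly like this on the worker side. This is only a sketch: the `list_tasks` command, the `STATUS`/`TASKS`/`REASON` reply fields, and the `handle_message` name are all hypothetical, and the dict values are placeholders rather than real function references.

```python
import json

# Hypothetical registry; the values are placeholders for (function, args) pairs.
TASKS = {"report1A": ("func_1", ["A"]), "report2B": ("func_2", ["B"])}

def handle_message(raw_message):
    """Return the JSON reply for one incoming command."""
    msg = json.loads(raw_message)
    if msg.get("CMD") == "list_tasks":
        # Django would use this list to fill the "Add New Job" dropdown.
        return json.dumps({"STATUS": "ok", "TASKS": sorted(TASKS)})
    return json.dumps({"STATUS": "error", "REASON": "unknown command"})

reply = handle_message('{"CMD": "list_tasks"}')
```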
What I'd like to do is store that mapping in a tiny sqlite table that would be accessible to both Django and the Worker. The table would be really simple, like:
| Report | Function | Arguments |
|---|---|---|
| report1A | func_1 | A |
| report2B | func_2 | B |
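A rough sketch of that table with the standard `sqlite3` module. The table name `report_tasks`, the column names, and the helpers `list_reports` and `lookup` are hypothetical, and an in-memory database stands in for the shared file that Django and the worker would both open.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; in practice this would be
# a file path that both Django and the worker can read.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE report_tasks (
        report    TEXT PRIMARY KEY,
        func_name TEXT NOT NULL,
        arguments TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO report_tasks (report, func_name, arguments) VALUES (?, ?, ?)",
    [("report1A", "func_1", "A"), ("report2B", "func_2", "B")],
)
conn.commit()

def list_reports(conn):
    """Report names for the 'Add New Job' dropdown (Django side)."""
    rows = conn.execute("SELECT report FROM report_tasks ORDER BY report")
    return [row[0] for row in rows]

def lookup(conn, report):
    """Return (func_name, arguments) for one report (worker side)."""
    row = conn.execute(
        "SELECT func_name, arguments FROM report_tasks WHERE report = ?",
        (report,),
    ).fetchone()
    if row is None:
        raise KeyError(report)
    return row
```

Here `lookup` only returns strings; turning `func_name` back into a callable is the separate step I'm asking about.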
Where I got stuck was converting the string function name back into a reference to a callable function. I did find this SO link that I believe is doing what I'm trying to accomplish using getattr(module, func_name), but that question has 1 answer, 2 total votes, and is from 8 years ago, so I wanted to make sure it was kosher before I tried to follow suit. Is that process legit, and is that the best way to store a "reference" to a function in a database table?
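To make the `getattr` idea concrete, here is a minimal sketch of what I have in mind. The `tasks` object is a stand-in built with `types.ModuleType` so the example is self-contained (in the worker it would be an ordinary imported module), and `ALLOWED_FUNCTIONS` and `resolve` are hypothetical names. The allow-list is my own assumption about how to keep a bad table row from reaching arbitrary attributes.

```python
import types

# Stand-in for the worker's real task module (hypothetical); normally this
# would be something like `import worker_tasks as tasks`.
tasks = types.ModuleType("tasks")

def func_1(variant):
    # Stub standing in for the real report function.
    return f"report1{variant}"

def func_2(variant):
    # Stub standing in for the real report function.
    return f"report2{variant}"

tasks.func_1 = func_1
tasks.func_2 = func_2

# Only names listed here may be resolved from the database.
ALLOWED_FUNCTIONS = {"func_1", "func_2"}

def resolve(func_name, module=tasks):
    """Turn a stored function name back into a callable."""
    if func_name not in ALLOWED_FUNCTIONS:
        raise ValueError(f"function not allowed: {func_name!r}")
    func = getattr(module, func_name)  # AttributeError if the name is missing
    if not callable(func):
        raise TypeError(f"{func_name!r} is not callable")
    return func

# Example: a row ("report1A", "func_1", "A") read from the table.
report_func = resolve("func_1")
result = report_func("A")
```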
Also:
I did find another source about storing functions in databases, but I'm fairly sure that I'm not trying to do whatever is being described here, which appears to be storing the full execution of the function in a way that would be callable directly within the database.