Is it possible to terminate an AI training process in Django?
I am developing a web page using Django. The admin site has a section called Train Model with a "Train Model" button. This button runs a Python script that simulates a training process; for now it is just a sleep(30). While the script is sleeping, the user should be able to cancel the process, in case they forgot something. Is that possible? I have tried a few things, but the logs show that even if I press the cancel button, the "training" process runs until it completes.
import time
import multiprocessing
import os
import signal
import logging
# Global variables to manage the training state
is_training = False
training_process = None
training_pid = None # Store the PID of the training process
stop_event = None
logger = logging.getLogger()
def train_task(stop_event):
    """simulates the training task with stop event support"""
    for i in range(30):
        if stop_event.is_set():
            print("Training cancelled.")
            return
        time.sleep(1)
    print("Training done.")
def train_model():
    global is_training, training_process, stop_event
    # If training is already in progress, return immediately
    if is_training:
        print("Training already in progress...")
        return "Training already in progress"
    # mark the start of the training
    is_training = True
    stop_event = multiprocessing.Event()  # creates a stop event
    print("Training started...")
    # start the training process with stop event support
    training_process = multiprocessing.Process(target=train_task, args=(stop_event,))
    training_process.start()
    # stores the pid of the process
    training_pid = training_process.pid
    return "Training started."
def cancel_training_process():
    global training_pid, is_training, stop_event
    if is_training and training_pid:
        try:
            os.kill(training_pid, signal.SIGTERM)  # sends the SIGTERM signal to the process
            is_training = False  # resets the training state
            training_pid = None
            stop_event.set()  # stops the training event
            return "Training cancelled."
        except OSError as e:
            return f"Error while cancelling the process: {e}"
    return "No training in progress"
def check_training_status():
    global is_training
    if is_training:
        return "Training in progress"
    else:
        return "No training in progress"
These are the functions that simulate the training and try to cancel the process.
Any advice in this regard? In the future, the training will be a completely different script that pulls data from a database and trains an AI model, but that script has not been written yet. This is just me trying to implement what I was asked to do.
I tried different solutions, but none of them seem to work: when I try to cancel the process, it still runs to the end. I am sure there is a solution, but this is definitely beyond my knowledge for now.
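For reference, one thing that stands out in the functions above: train_model() assigns training_pid without listing it in its global statement, so the module-level training_pid stays None and cancel_training_process() falls through to "No training in progress". Below is a minimal sketch of how the start/cancel pair could be arranged so the process handle is kept in one place. The TrainingManager class, its method names, and the steps, delay, grace and ctx parameters are hypothetical names introduced only for illustration. It also assumes the start and cancel requests are handled by the same server process; with several Django workers, in-memory state like this is not shared between them, and a task queue or a PID stored in the database would probably be needed instead.

```python
import multiprocessing
import time


def train_task(stop_event, steps=30, delay=1.0):
    """Simulated training loop that checks the stop event between steps."""
    for _ in range(steps):
        if stop_event.is_set():
            print("Training cancelled.")
            return
        time.sleep(delay)
    print("Training done.")


class TrainingManager:
    """Hypothetical helper that owns the process handle and the stop event."""

    def __init__(self, ctx=None):
        # ctx is an optional multiprocessing context (assumption: the
        # platform default is fine unless the caller passes one).
        self.ctx = ctx or multiprocessing.get_context()
        self.process = None
        self.stop_event = None

    def is_running(self):
        # Ask the process itself instead of tracking a separate boolean flag.
        return self.process is not None and self.process.is_alive()

    def start(self, steps=30, delay=1.0):
        if self.is_running():
            return "Training already in progress"
        self.stop_event = self.ctx.Event()
        self.process = self.ctx.Process(
            target=train_task, args=(self.stop_event, steps, delay)
        )
        self.process.start()
        return "Training started."

    def cancel(self, grace=5.0):
        if not self.is_running():
            return "No training in progress"
        # Ask the task to stop cooperatively and wait up to `grace` seconds.
        self.stop_event.set()
        self.process.join(grace)
        if self.process.is_alive():
            # Fall back to a hard stop (SIGTERM on Unix) if it did not exit.
            self.process.terminate()
            self.process.join()
        return "Training cancelled."
```

A single module-level instance (for example manager = TrainingManager()) could then be called from the two admin views. Setting the event first gives the task a chance to exit cleanly, and terminate() is only used when it does not react within the grace period, which matters once the real training code replaces the sleep loop.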