How should I approach developing this computer vision project? Dog Pooping/Elimination/Urination Detector with alerts through an RTSP camera and a web app

I am currently developing a Dog Pooping/Elimination/Urination Detector using YOLOv8. It should play sound alerts to deter dogs from designated spots within the camera's field of view. The application would be viewable through a web page with a login portal.
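To make the alert behaviour concrete, here is a minimal sketch of how I imagine the sound alert being triggered. It is not existing project code: the class name `AlertDebouncer` and the parameters `window`, `min_hits` and `cooldown` are hypothetical, and it assumes the detector yields one True/False result per frame. The idea is to alert only when most recent frames contain a detection, so a single false positive does not set off the sound.

```python
from collections import deque


class AlertDebouncer:
    """Fire an alert only when detections persist over recent frames."""

    def __init__(self, window=10, min_hits=6, cooldown=50):
        self.history = deque(maxlen=window)  # last `window` per-frame results
        self.min_hits = min_hits             # detections needed inside the window
        self.cooldown = cooldown             # frames to wait between two alerts
        self.frames_since_alert = cooldown   # lets the first alert fire without waiting

    def update(self, detected):
        """Record one frame result; return True if an alert should fire now."""
        self.history.append(bool(detected))
        self.frames_since_alert += 1
        enough_hits = sum(self.history) >= self.min_hits
        cooled_down = self.frames_since_alert > self.cooldown
        if enough_hits and cooled_down:
            self.frames_since_alert = 0
            return True
        return False
```

In the frame loop, `update()` would be called once per frame with whether the model returned any box above the chosen confidence, and the sound would be played only when it returns True.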

Stack: Python, Django, YOLOv8, OpenCV, HTML, CSS, JS.

I'm having a number of problems. First, I'm not sure whether I'm using the right model for this project. I'm currently doing transfer learning from yolov8s.pt, training on a dataset with one class: pooping dogs.

Here's the training call:

results = model.train(data="config.yaml", epochs=100, patience=15, batch=4, workers=6, device=[0])

However, over numerous attempts at training and curating the dataset, I almost always get no detections at all, and the few detections I do get are false positives. Could it be that object detection is the wrong choice? Would pose estimation be better given the goal of the project?
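One thing I can rule out on my side is a broken dataset, since empty or malformed label files can also lead to a model that detects nothing. Below is a minimal sketch of such a check, assuming the labels are in the standard YOLO detection format (one `class x_center y_center width height` line per box, coordinates normalized to 0..1). The function name `check_yolo_labels` and the directory path in the usage line are hypothetical.

```python
from pathlib import Path


def check_yolo_labels(label_dir, num_classes=1):
    """Scan YOLO detection label files (*.txt) and collect common problems."""
    report = {"files": 0, "empty": 0, "boxes": 0, "bad_lines": []}
    for path in sorted(Path(label_dir).glob("*.txt")):
        report["files"] += 1
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if not lines:
            report["empty"] += 1  # image contributes background only
            continue
        for i, line in enumerate(lines, start=1):
            parts = line.split()
            try:
                cls = int(parts[0])
                coords = [float(v) for v in parts[1:]]
            except ValueError:
                report["bad_lines"].append((path.name, i, "not numeric"))
                continue
            if len(coords) != 4:
                report["bad_lines"].append((path.name, i, "expected 5 fields"))
            elif not 0 <= cls < num_classes:
                report["bad_lines"].append((path.name, i, "class id out of range"))
            elif not all(0.0 <= v <= 1.0 for v in coords):
                report["bad_lines"].append((path.name, i, "coords not normalized"))
            else:
                report["boxes"] += 1
    return report
```

For example, `check_yolo_labels("datasets/train/labels")` (path assumed) returns a dict of counts. A large `empty` count or any entries in `bad_lines` would point to a data problem rather than a wrong model choice.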

Second, I can't find a way to make the RTSP camera publicly accessible. I looked into port forwarding, but it isn't possible due to the limitations of my router and my ISP.

Lastly, aside from detecting pooping/elimination/urination, I would also like the system to identify and differentiate each dog that is visible to the camera. Is it possible to train YOLOv8 on both object detection (or pose estimation, if that is better for this project) and segmentation?
