Live RTSP Camera Motion Capture & AI Object Recognition
Overview
A Python system that subscribes to motion events from TP-Link Tapo ONVIF cameras, records the RTSP stream when motion occurs, and runs AI object recognition to produce annotated snapshots with labelled bounding boxes. It is built around the open-source peterstamps project and was evaluated and run on a Raspberry Pi 4, including a higher-performance multiprocessing variant.
The Challenge
Reliable smart-camera capture on constrained edge hardware is hard: you must subscribe to ONVIF motion events, keep a rolling buffer of pre-motion frames, record RTSP without running out of memory, and optionally call an object-detection model, all on a fanless Raspberry Pi without dropping frames.
What We Built
Multiple runnable Python programs explore the design space. myMPTapoDetectCaptureVideo.py (the preferred multiprocessing version) and myTapoDetectCaptureVideo.py combine ONVIF motion detection (myTapoMotionDetection.py) with RTSP capture (myTapoVideoCapture.py), driven by config modules (myTapoMotionConfig.py / myMPTapoMotionConfig.py). The capture loop keeps a deque of frames as a pre-record buffer, flushes it first-in-first-out, and watches a memory-full percentage to avoid running out of memory (OOM). When enabled, frames are sent to an AI object server that returns compact JPEGs with detected objects marked and labelled. Sensitivity adapts automatically across dusk and dawn, moving between day and night thresholds. ONVIF is handled via onvif-zeep-async/zeep with downloaded WSDL files.
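The pre-record buffer described above can be illustrated with a minimal sketch. This is not the project's code: the class name PreRecordBuffer, the memory_percent callable, and the 90% default are assumptions chosen for illustration. In a real deployment the callable might wrap a system memory reading; here it is injected so the logic can be exercised without any camera or extra library.

```python
from collections import deque
from typing import Any, Callable, Deque, List


class PreRecordBuffer:
    """Rolling FIFO buffer of recent frames (hypothetical sketch).

    memory_percent is an injected callable returning current memory use
    as a percentage; memory_full_percent is an assumed cut-off.
    """

    def __init__(
        self,
        max_frames: int,
        memory_percent: Callable[[], float],
        memory_full_percent: float = 90.0,
    ) -> None:
        # deque(maxlen=...) silently discards the oldest frame when full.
        self.frames: Deque[Any] = deque(maxlen=max_frames)
        self.memory_percent = memory_percent
        self.memory_full_percent = memory_full_percent

    def push(self, frame: Any) -> None:
        # While memory use is at or above the cut-off, drop the oldest
        # frames first so the buffer shrinks instead of growing.
        while self.frames and self.memory_percent() >= self.memory_full_percent:
            self.frames.popleft()
        self.frames.append(frame)

    def flush(self) -> List[Any]:
        # Hand frames back oldest-first (first in, first out) and empty
        # the buffer, e.g. when a motion event starts a recording.
        out: List[Any] = []
        while self.frames:
            out.append(self.frames.popleft())
        return out
```

On a motion event, the caller would flush the buffer and write those frames ahead of the live ones, so the recording begins slightly before the trigger. The buffer length trades how much pre-motion footage is kept against how much memory is held on the Pi.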
Technologies & Approach
The system uses Python with asyncio and the ONVIF SOAP/zeep stack for event subscription, RTSP for live video, and an external AI object-recognition server for detection. A multiprocessing variant offloads work for smoother capture on the Pi. A configuration-first design makes thresholds, paths, and the AI toggle easy to tune per hardware.
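As an illustration of how configurable day and night thresholds might drive sensitivity across dusk and dawn, the sketch below linearly blends between two values during fixed transition windows. The function name motion_threshold, the hour-based windows, and the linear blend are all assumptions made for this example; the project may determine dusk and dawn and adjust sensitivity differently.

```python
def motion_threshold(
    hour: float,
    day_threshold: float,
    night_threshold: float,
    dawn_start: float = 6.0,   # assumed window boundaries, in local hours
    dawn_end: float = 7.0,
    dusk_start: float = 18.0,
    dusk_end: float = 19.0,
) -> float:
    """Return a sensitivity threshold for the given hour (hypothetical sketch)."""
    if dawn_end <= hour < dusk_start:
        # Full daylight: use the day value unchanged.
        return day_threshold
    if dawn_start <= hour < dawn_end:
        # Dawn: move from the night value towards the day value.
        t = (hour - dawn_start) / (dawn_end - dawn_start)
        return night_threshold + t * (day_threshold - night_threshold)
    if dusk_start <= hour < dusk_end:
        # Dusk: move from the day value towards the night value.
        t = (hour - dusk_start) / (dusk_end - dusk_start)
        return day_threshold + t * (night_threshold - day_threshold)
    # Outside both windows and outside daylight: night.
    return night_threshold
```

Keeping the window boundaries and both thresholds as parameters mirrors the configuration-first approach: each value could live in a config module and be tuned per camera without touching the capture loop.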
Outcome / Impact
A working, tested smart-camera pipeline (validated on a Tapo C225 and Raspberry Pi 4) proving the studio can integrate IP cameras end to end: ONVIF events, RTSP recording, memory-safe buffering, and edge AI object detection. The work is positioned as the evaluation and operation of an open-source project, with adaptation to real hardware constraints.
Capabilities Demonstrated
- ONVIF motion-event subscription and camera integration
- RTSP live-stream capture with memory-safe pre-record buffering
- Motion-triggered recording and snapshot generation
- AI object detection with labelled bounding boxes at the edge
- Adaptive day/night sensitivity and multiprocessing performance tuning
- Deployment on constrained edge hardware (Raspberry Pi)