Video API¶
Package¶
video
¶
Video helpers for the drone API.
Optional backends are lazy-loaded so importing the core API does not require OpenCV, PyAV, ONNX Runtime, Flask, or YOLO dependencies.
Frame
dataclass
¶
Video frame with optional detections and metadata.
The frame flows through the callback pipeline, allowing each callback to add detections or modify the image.
detections
class-attribute
instance-attribute
¶
detections: list[Detection] = field(default_factory=list)
draw_detections
¶
Draw detection boxes and labels on a copy of the image.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `thickness` | `int` | Line thickness for boxes | `2` |
| `font_scale` | `float` | Font scale for labels | `0.6` |

Returns:

| Type | Description |
|---|---|
| `NDArray[uint8]` | Annotated image copy |
to_jpeg
¶
Encode frame as JPEG bytes.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `quality` | `int` | JPEG quality (0-100) | `85` |

Returns:

| Type | Description |
|---|---|
| `bytes` | JPEG encoded bytes |
Detection
dataclass
¶
Single object detection result.
BoundingBox
dataclass
¶
Bounding box for detected object.
StreamConfig
dataclass
¶
Video stream configuration.
drone_ip
class-attribute
instance-attribute
¶
timeout
class-attribute
instance-attribute
¶
buffer_size
class-attribute
instance-attribute
¶
generate_sdp
¶
Generate SDP content for RTP stream.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `local_ip` | `str` | Local IP address to receive stream on (default 0.0.0.0 for any) | `'0.0.0.0'` |

Returns:

| Type | Description |
|---|---|
| `str` | SDP file content string |
Note
Format based on C# Unity app's SDP generation, with required SDP header fields for FFmpeg/PyAV compatibility.
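To make the note above concrete, here is a hedged sketch of what minimal SDP content for an H.264 RTP stream can look like. The helper name `make_sdp`, the port, and the dynamic payload type `96` are illustrative assumptions, not the library's actual output; the required fields (`v=`, `o=`, `s=`, `t=`) are the ones FFmpeg/PyAV insist on.

```python
# Hypothetical sketch of SDP generation for an H.264 RTP stream.
# Port and payload type are assumptions for illustration only.

def make_sdp(local_ip: str = "0.0.0.0", port: int = 5000) -> str:
    """Build minimal SDP content that FFmpeg/PyAV will accept."""
    return "\r\n".join([
        "v=0",                          # protocol version (required)
        f"o=- 0 0 IN IP4 {local_ip}",   # origin line (required by FFmpeg)
        "s=Drone Video",                # session name (required)
        f"c=IN IP4 {local_ip}",         # connection address
        "t=0 0",                        # timing (required)
        f"m=video {port} RTP/AVP 96",   # media: RTP video, dynamic payload 96
        "a=rtpmap:96 H264/90000",       # payload 96 is H.264 at a 90 kHz clock
    ]) + "\r\n"

sdp = make_sdp()
```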
StreamState
¶
Bases: IntEnum
Video stream connection state.
Streaming¶
stream
¶
Video stream decoder using PyAV for RTP/RTSP H.264 streams.
RTSPStream
¶
RTSP video stream decoder using PyAV.
Useful for testing without a drone, or for connecting to any RTSP source.
Example:

```python
stream = RTSPStream("rtsp://localhost:8554/stream")

def detect(frame):
    frame.detections = model.detect(frame.image)
    return frame

stream.add_callback(detect)
stream.add_callback(VideoDisplay())
stream.start()
stream.wait()
```
VideoStream
¶
RTP H.264 video stream decoder using PyAV.
Decodes drone video stream and passes frames through a callback pipeline. Supports multiple callbacks for detection, display, recording, etc.
Example:

```python
stream = VideoStream(drone_ip="192.168.100.1")

# Add detection callback
def detect(frame):
    frame.detections = my_model.detect(frame.image)
    return frame

stream.add_callback(detect)

# Add display
stream.add_callback(display)

stream.start()
stream.wait()  # Block until stopped
```
config
instance-attribute
¶
config = StreamConfig(drone_ip=drone_ip, drone_id=drone_id, timeout=timeout if timeout is not None else timeout_sec, buffer_size=buffer_size)
add_callback
¶
add_callback(callback: FrameCallback) -> None
Add a frame processing callback.
Callbacks are executed in order. Each receives the frame (potentially modified by previous callbacks) and can:
- Add detections
- Modify the image
- Return modified Frame or None
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `callback` | `FrameCallback` | Function taking Frame, returning Frame or None | *required* |
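The callback contract above can be sketched with stand-in types. The `Frame` dataclass here is a simplified stand-in, and the rule "a `None` return leaves the current frame unchanged" is an assumption about the pipeline's behavior, made for illustration.

```python
# Sketch of the callback pipeline semantics: each callback receives the frame
# (possibly modified by earlier callbacks) and may mutate it, replace it, or
# return None to keep it as-is. Stand-in types, not the library's internals.
from dataclasses import dataclass, field

@dataclass
class Frame:
    image: str
    detections: list = field(default_factory=list)

def run_pipeline(frame, callbacks):
    for cb in callbacks:
        result = cb(frame)
        if result is not None:   # callback returned a new/modified Frame
            frame = result       # ...None would leave the frame unchanged
    return frame

def add_fake_detection(frame):
    frame.detections.append("person")   # mutate in place, return nothing

def replace_image(frame):
    return Frame(image=frame.image + "+annotated", detections=frame.detections)

out = run_pipeline(Frame(image="raw"), [add_fake_detection, replace_image])
```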
remove_callback
¶
remove_callback(callback: FrameCallback) -> bool
Remove a callback.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `callback` | `FrameCallback` | Callback to remove | *required* |

Returns:

| Type | Description |
|---|---|
| `bool` | True if removed, False if not found |
get_buffered_frames
¶
get_buffered_frames(count: int | None = None) -> list[Frame]
Get frames from the buffer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `count` | `int \| None` | Number of frames to get (None for all) | `None` |

Returns:

| Type | Description |
|---|---|
| `list[Frame]` | List of frames (oldest first) |
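The oldest-first buffering behavior can be sketched with a bounded deque. The class and method names below are illustrative stand-ins, and the choice to return the most recent `count` frames when `count` is given is an assumption.

```python
# Sketch of an oldest-first frame buffer, assuming the stream keeps a
# bounded ring buffer internally. Not the library's actual implementation.
from collections import deque

class FrameBuffer:
    def __init__(self, maxlen: int):
        self._buf = deque(maxlen=maxlen)  # oldest frames fall off automatically

    def push(self, frame):
        self._buf.append(frame)

    def get_buffered_frames(self, count=None):
        frames = list(self._buf)          # oldest first
        return frames if count is None else frames[-count:]  # newest `count`

buf = FrameBuffer(maxlen=3)
for i in range(5):
    buf.push(i)
```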
start
¶
Start the video stream.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `blocking` | `bool` | If True, block until stream stops | `False` |
stop
¶
Stop the video stream.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `timeout` | `float` | Seconds to wait for thread to stop | `5.0` |
VideoStreamSimple
¶
Simplified video stream for quick testing.
Opens stream and yields frames directly without callbacks.
Example:

```python
for frame in VideoStreamSimple("192.168.100.1"):
    cv2.imshow("Video", frame.image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
```
config
instance-attribute
¶
config = StreamConfig(drone_ip=drone_ip, drone_id=drone_id, timeout=timeout if timeout is not None else timeout_sec)
Display¶
display
¶
OpenCV-based video display for drone stream.
VideoDisplay
¶
OpenCV window display as a frame callback.
Displays frames with optional detection overlays, FPS counter, and recording indicator.
Example:

```python
stream = VideoStream(drone_ip="192.168.100.1")
display = VideoDisplay(window_name="Drone", show_fps=True)
stream.add_callback(display)
stream.start()
# Display handles its own window events
# Press 'q' to quit, 's' to screenshot
```
VideoDisplayAsync
¶
Async display that runs in its own thread.
Useful when you want display to run independently of the frame processing pipeline.
Example:

```python
display = VideoDisplayAsync()
display.start()
for frame in stream:
    detections = detector.detect(frame)
    frame.detections = detections
    display.update(frame)  # Non-blocking
display.stop()
```
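The "runs in its own thread" pattern behind an async display can be sketched with a queue and a consumer thread. Everything here is an illustrative stand-in (a real display would draw with OpenCV instead of recording frames in a list); the single-slot queue that drops frames when the consumer is busy is an assumption about how `update()` stays non-blocking.

```python
# Sketch of the async-display pattern: a consumer thread drains a queue so
# update() never blocks the producer. Stand-in names, not the library's code.
import queue
import threading
import time

class AsyncDisplay:
    def __init__(self):
        self._q = queue.Queue(maxsize=1)   # keep only the latest frame
        self._shown = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._thread.start()

    def update(self, frame):
        try:
            self._q.put_nowait(frame)      # non-blocking: drop frame if busy
        except queue.Full:
            pass

    def _loop(self):
        while not self._stop.is_set():
            try:
                frame = self._q.get(timeout=0.05)
            except queue.Empty:
                continue
            self._shown.append(frame)      # real impl would render here

    def stop(self):
        self._stop.set()
        self._thread.join()

display = AsyncDisplay()
display.start()
for i in range(3):
    display.update(i)
    time.sleep(0.1)                        # give the consumer time to drain
display.stop()
```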
show_frame
¶
show_frame(frame: Frame, window_name: str = 'Frame', wait_key: int = 0, show_detections: bool = True) -> int
Quick utility to show a single frame.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frame` | `Frame` | Frame to display | *required* |
| `window_name` | `str` | Window name | `'Frame'` |
| `wait_key` | `int` | cv2.waitKey argument (0 = wait forever) | `0` |
| `show_detections` | `bool` | Draw detections if present | `True` |

Returns:

| Type | Description |
|---|---|
| `int` | Key code pressed |
Recording¶
recording
¶
Video recording functionality using PyAV.
Provides callback-based recording for the VideoStream pipeline.
RecordingConfig
dataclass
¶
Configuration for video recording.
VideoRecorder
¶
Video recording callback for VideoStream.
Records frames to a video file using PyAV (FFmpeg). Can be used as a callback in the VideoStream pipeline.
Example:

```python
stream = VideoStream(drone_ip="192.168.100.1")

# Basic recording
recorder = VideoRecorder("flight.mp4")
stream.add_callback(recorder)
stream.start()
# ... stream video ...
stream.stop()
recorder.close()

# With context manager
with VideoRecorder("flight.mp4") as recorder:
    stream.add_callback(recorder)
    stream.start()
    stream.wait()
# File automatically finalized

# Record with detections drawn
recorder = VideoRecorder("flight_annotated.mp4", draw_detections=True)
```
config
instance-attribute
¶
config = RecordingConfig(output_path=str(output_path), codec=codec, fps=fps, bitrate=bitrate, preset=preset, crf=crf)
write_frame
¶
write_frame(frame: Frame) -> None
Manually write a frame (alternative to callback usage).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frame` | `Frame` | Frame to record | *required* |
close
¶
Finalize and close the video file.
Must be called when recording is complete to flush remaining frames and write file trailer.
SegmentedRecorder
¶
Records video in segments of fixed duration.
Useful for long recordings or when you want multiple smaller files instead of one large file.
Example:

```python
# Record in 5-minute segments
recorder = SegmentedRecorder(
    output_dir="recordings",
    segment_duration=300,  # 5 minutes
    filename_pattern="flight_{timestamp}_{segment}.mp4",
)
stream.add_callback(recorder)
```
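The rollover logic behind fixed-duration segments can be sketched as pure timestamp arithmetic. The `segment_path` helper below is hypothetical (the library's internals may differ); it only mirrors the `{timestamp}`/`{segment}` pattern from the example above.

```python
# Sketch of segment rollover: given frame timestamps, decide which output
# file a frame belongs to. Illustrative helper, not the library's code.
import time

def segment_path(start_ts: float, frame_ts: float, segment_duration: float,
                 pattern: str = "flight_{timestamp}_{segment}.mp4") -> str:
    segment = int((frame_ts - start_ts) // segment_duration)   # 0, 1, 2, ...
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(start_ts))
    return pattern.format(timestamp=stamp, segment=segment)

start = 0.0
p0 = segment_path(start, 10.0, 300)    # still in the first 5-minute segment
p1 = segment_path(start, 310.0, 300)   # rolled over to the second segment
```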
Detection¶
detection
¶
Object detection integration for video streaming.
Provides YOLO and other detector wrappers as VideoStream callbacks.
DetectorConfig
dataclass
¶
Configuration for object detectors.
BaseDetector
¶
Bases: ABC
Abstract base class for object detectors.
Subclass this to integrate custom detection models.
YOLODetector
¶
Bases: BaseDetector
YOLO object detector using ultralytics library.
Supports YOLOv8, YOLOv9, YOLOv10, YOLO11, and YOLO-World models.
Example:

```python
from pypack.video import VideoStream, YOLODetector, VideoDisplay

# Basic usage
detector = YOLODetector("yolov8n.pt")
stream = VideoStream(drone_ip="192.168.100.1")
stream.add_callback(detector)
stream.add_callback(VideoDisplay())
stream.start()

# With custom configuration
detector = YOLODetector(
    model_path="yolov8s.pt",
    confidence=0.5,
    classes=[0, 1, 2],  # person, bicycle, car
    device="cuda",
)

# Using YOLO-World for open vocabulary detection
detector = YOLODetector("yolov8s-world.pt")
detector.set_classes(["person", "drone", "car"])
```
set_classes
¶
Set classes for YOLO-World open vocabulary detection.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `classes` | `list[str]` | List of class names to detect | *required* |
YOLOSegmentDetector
¶
Bases: YOLODetector
YOLO instance segmentation detector.
Returns detections with segmentation masks in metadata.
Example:
FilterDetector
¶
Bases: BaseDetector
Wrapper that filters detections from another detector.
Useful for filtering by class, confidence, size, or region.
Example:
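The docs' example is elided here, so the filtering idea is sketched below in self-contained form, using plain `(class_name, confidence)` tuples instead of the library's `Detection`/`BoundingBox` types. The function name and parameters are illustrative assumptions.

```python
# Self-contained sketch of detection filtering by class and confidence.
# Stand-in tuples, not the library's Detection objects.

def filter_detections(detections, classes=None, min_confidence=0.0):
    """Keep detections whose class is allowed and confidence is high enough."""
    kept = []
    for name, confidence in detections:
        if classes is not None and name not in classes:
            continue
        if confidence < min_confidence:
            continue
        kept.append((name, confidence))
    return kept

raw = [("person", 0.9), ("car", 0.4), ("person", 0.2), ("dog", 0.8)]
filtered = filter_detections(raw, classes={"person", "dog"}, min_confidence=0.5)
```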
DrawDetections
¶
DetectionLogger
¶
FrameCrop
¶
Callback that crops margins from video frames.
Use BEFORE detection to remove unwanted regions (e.g., propellers, sky, ground). Detection coordinates will be relative to the cropped frame.
Example:
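The docs' example is elided here; the cropping operation itself is plain NumPy slicing, sketched below. The margin parameter names are assumptions, not the library's actual signature.

```python
# Sketch of margin cropping with NumPy slicing. After cropping, any
# downstream detection coordinates are relative to the cropped frame.
import numpy as np

def crop_margins(image, top=0, bottom=0, left=0, right=0):
    """Return the image with the given pixel margins removed."""
    h, w = image.shape[:2]
    return image[top:h - bottom, left:w - right]

frame = np.zeros((480, 640, 3), dtype=np.uint8)
cropped = crop_margins(frame, top=60, bottom=60)   # drop sky and ground bands
```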
SaveDetectionCrop
¶
Callback that saves cropped images of detected objects.
Use AFTER detection to save each detected object as a separate image file.
Example:
ONNXDetector
¶
Bases: BaseDetector
ONNX Runtime detector for YOLO-style models.
Handles models with output shape [1, 4+num_classes, num_boxes]. Applies NMS to filter detections.
Example:

```python
from pypack.video import ONNXDetector, DrawDetections, VideoDisplay

detector = ONNXDetector(
    model_path="model.onnx",
    class_names=["House", "Tank", "Tree"],
    confidence=0.3,
)
stream = drone.start_video_stream(display=False)
stream.add_callback(detector)
stream.add_callback(DrawDetections())
stream.add_callback(VideoDisplay())
```
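The NMS step mentioned above can be sketched in pure Python. This is the standard greedy IoU-based algorithm, shown here with `(x1, y1, x2, y2, score)` tuples; the detector's actual implementation and thresholds may differ.

```python
# Self-contained sketch of greedy non-maximum suppression (NMS):
# keep the highest-scoring boxes, drop boxes that overlap a kept box heavily.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2, score) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, iou_threshold=0.5):
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept

boxes = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # overlaps the first heavily -> suppressed
    (50, 50, 60, 60, 0.7),  # far away -> kept
]
result = nms(boxes)
```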
Web¶
web
¶
Web streaming support for drone video.
Provides MJPEG streaming over HTTP for browser viewing. Can be used standalone or as a frame callback.
Requires Flask (optional dependency).
MJPEGStreamer
¶
MJPEG streamer as a frame callback.
Buffers frames and provides a generator for HTTP streaming. Compatible with Flask, FastAPI, or any WSGI/ASGI framework.
Example with Flask:

```python
from flask import Flask, Response
from pypack.video import VideoStream, MJPEGStreamer

app = Flask(__name__)
streamer = MJPEGStreamer()

# Add to video stream
stream = VideoStream()
stream.add_callback(streamer)
stream.start()

@app.route('/video')
def video_feed():
    return Response(
        streamer.generate(),
        mimetype='multipart/x-mixed-replace; boundary=frame'
    )

app.run(host='0.0.0.0', port=5000)
```
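The wire format an MJPEG generator emits is simple multipart framing, sketched below with fake JPEG bytes. The helper is illustrative, not the streamer's actual code; the `frame` boundary matches the `boundary=frame` mimetype used in the Flask example above.

```python
# Sketch of multipart/x-mixed-replace framing for one MJPEG chunk:
# boundary line, headers, blank line, JPEG payload, trailing CRLF.

def mjpeg_part(jpeg_bytes: bytes, boundary: bytes = b"frame") -> bytes:
    """Wrap one JPEG image as a single multipart chunk."""
    return (
        b"--" + boundary + b"\r\n"
        b"Content-Type: image/jpeg\r\n"
        b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n"
        + jpeg_bytes + b"\r\n"
    )

chunk = mjpeg_part(b"\xff\xd8fakejpeg\xff\xd9")  # fake SOI/EOI-delimited payload
```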
WebStreamServer
¶
Standalone web server for video streaming.
Provides a simple Flask-based server for viewing the drone video stream in a web browser.
Example:

```python
from pypack.video import VideoStream, WebStreamServer

stream = VideoStream()

# Start web server (runs in background)
server = WebStreamServer(stream, port=5000)
server.start()
print("Open http://localhost:5000 in browser")

stream.start(blocking=True)
```
start
¶
Start the web server.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `blocking` | `bool` | If True, block until server stops | `False` |