Video Types

types

Video streaming types and data structures.

FrameCallback module-attribute

FrameCallback = Callable[[Frame], Frame | None]
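
The alias describes a callable that receives a `Frame` and returns either a `Frame` or `None`. The sketch below shows one way such callbacks could be chained. The stand-in `Frame` class, the `run_pipeline` helper, and both example callbacks are hypothetical. Treating `None` as "keep the frame that was passed in" is also an assumption; the library may interpret `None` differently (for example, as "drop this frame").

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Frame:
    """Stand-in for the documented Frame (assumption: reduced field set)."""
    image: object
    frame_number: int = 0
    metadata: dict = field(default_factory=dict)


FrameCallback = Callable[[Frame], Optional[Frame]]


def tag_frame(frame: Frame) -> Optional[Frame]:
    # Mutates the frame in place and returns None.
    frame.metadata["tagged"] = True
    return None


def renumber(frame: Frame) -> Optional[Frame]:
    # Returns a new frame that replaces the incoming one.
    return Frame(
        image=frame.image,
        frame_number=frame.frame_number + 1,
        metadata=dict(frame.metadata),
    )


def run_pipeline(frame: Frame, callbacks: List[FrameCallback]) -> Frame:
    """Hypothetical runner: a None result keeps the current frame (assumption)."""
    for callback in callbacks:
        result = callback(frame)
        if result is not None:
            frame = result
    return frame


out = run_pipeline(Frame(image=None), [tag_frame, renumber])
```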

StreamState

Bases: IntEnum

Video stream connection state.

DISCONNECTED class-attribute instance-attribute

DISCONNECTED = 0

CONNECTING class-attribute instance-attribute

CONNECTING = 1

CONNECTED class-attribute instance-attribute

CONNECTED = 2

STREAMING class-attribute instance-attribute

STREAMING = 3

ERROR class-attribute instance-attribute

ERROR = 4

STOPPED class-attribute instance-attribute

STOPPED = 5
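
Because `StreamState` is an `IntEnum`, its members compare as integers. The sketch below reproduces the documented values and shows why a membership test is safer than an ordering comparison: `ERROR` and `STOPPED` are numerically larger than `STREAMING`. The `is_active` helper is hypothetical and not part of the documented API.

```python
from enum import IntEnum


class StreamState(IntEnum):
    """Local copy of the documented values, for illustration only."""
    DISCONNECTED = 0
    CONNECTING = 1
    CONNECTED = 2
    STREAMING = 3
    ERROR = 4
    STOPPED = 5


def is_active(state: StreamState) -> bool:
    # Hypothetical helper: explicit membership avoids counting ERROR or
    # STOPPED as "at least connected", which `state >= CONNECTED` would do.
    return state in (StreamState.CONNECTED, StreamState.STREAMING)


state = StreamState(3)  # integer values round-trip to enum members
```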

StreamConfig dataclass

Video stream configuration.

drone_ip class-attribute instance-attribute

drone_ip: str = field(default_factory=lambda: drone_ip)

drone_id class-attribute instance-attribute

drone_id: int = 1

timeout class-attribute instance-attribute

timeout: float = field(default_factory=lambda: timeout_sec)

buffer_size class-attribute instance-attribute

buffer_size: int = field(default_factory=lambda: buffer_size)

rtp_port property

rtp_port: int

Calculate RTP port from drone ID.

generate_sdp

generate_sdp(local_ip: str = '0.0.0.0') -> str

Generate SDP content for RTP stream.

Parameters:

Name | Type | Description | Default
local_ip | str | Local IP address to receive stream on (default 0.0.0.0 for any) | '0.0.0.0'

Returns:

Type | Description
str | SDP file content string

Note

The format is based on the SDP generation in the C# Unity app and includes the SDP header fields that FFmpeg/PyAV require.
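
For orientation, a generic SDP description for a single RTP video stream might look like the sketch below. This is not the output of `generate_sdp`. The function name `sketch_sdp`, the session name, the port value, and the H.264 codec with dynamic payload type 96 are all assumptions chosen for illustration; the source does not state which codec or port the stream uses.

```python
def sketch_sdp(port: int, local_ip: str = "0.0.0.0", payload_type: int = 96) -> str:
    """Build a minimal SDP description (hypothetical; codec is an assumption)."""
    lines = [
        "v=0",                                   # protocol version
        f"o=- 0 0 IN IP4 {local_ip}",            # origin
        "s=Video Stream",                        # session name (placeholder)
        f"c=IN IP4 {local_ip}",                  # connection address
        "t=0 0",                                 # unbounded session time
        f"m=video {port} RTP/AVP {payload_type}",  # media line: RTP video on `port`
        f"a=rtpmap:{payload_type} H264/90000",   # assumed codec and clock rate
    ]
    # The SDP specification terminates each line with CRLF.
    return "\r\n".join(lines) + "\r\n"


sdp = sketch_sdp(port=5000)  # 5000 is a placeholder, not the library's rtp_port
```

A string of this shape can be written to a `.sdp` file and opened by an SDP-aware player or decoder.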

BoundingBox dataclass

Bounding box for detected object.

x instance-attribute

x: int

y instance-attribute

y: int

width instance-attribute

width: int

height instance-attribute

height: int

x2 property

x2: int

Bottom-right X coordinate.

y2 property

y2: int

Bottom-right Y coordinate.

center property

center: tuple[int, int]

Center point (x, y).

area property

area: int

Area in pixels.

to_tuple

to_tuple() -> tuple[int, int, int, int]

Return as (x, y, w, h) tuple.

to_xyxy

to_xyxy() -> tuple[int, int, int, int]

Return as (x1, y1, x2, y2) tuple.
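
The relationship between the stored fields and the derived properties can be sketched as follows. The class name `BoundingBoxSketch` is hypothetical. The formulas are assumptions consistent with the descriptions above: `x2 = x + width`, `y2 = y + height`, and a center computed with integer division. The library may round differently or use an inclusive bottom-right corner.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BoundingBoxSketch:
    """Illustrative re-implementation; formulas are assumptions."""
    x: int
    y: int
    width: int
    height: int

    @property
    def x2(self) -> int:
        return self.x + self.width

    @property
    def y2(self) -> int:
        return self.y + self.height

    @property
    def center(self) -> Tuple[int, int]:
        # Integer division keeps the result in whole pixels.
        return (self.x + self.width // 2, self.y + self.height // 2)

    @property
    def area(self) -> int:
        return self.width * self.height

    def to_tuple(self) -> Tuple[int, int, int, int]:
        return (self.x, self.y, self.width, self.height)

    def to_xyxy(self) -> Tuple[int, int, int, int]:
        return (self.x, self.y, self.x2, self.y2)


box = BoundingBoxSketch(x=10, y=20, width=30, height=40)
```

The `(x, y, w, h)` form matches what many detectors and trackers emit, while the `(x1, y1, x2, y2)` form is convenient for drawing rectangles and computing overlaps.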

Detection dataclass

Single object detection result.

label instance-attribute

label: str

confidence instance-attribute

confidence: float

bbox instance-attribute

bbox: BoundingBox

class_id class-attribute instance-attribute

class_id: int | None = None

color class-attribute instance-attribute

color: tuple[int, int, int] = (0, 255, 0)

metadata class-attribute instance-attribute

metadata: dict = field(default_factory=dict)
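
The sketch below constructs two detections using stand-in classes that mirror the documented fields and defaults. The classes are local copies for illustration, and the `bbox` type is assumed to be `BoundingBox`. It also shows why `metadata` uses `default_factory=dict`: each instance gets its own dictionary rather than sharing one.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class BoundingBox:
    """Stand-in with only the stored fields."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class Detection:
    """Stand-in mirroring the documented fields and defaults."""
    label: str
    confidence: float
    bbox: BoundingBox  # assumed type
    class_id: Optional[int] = None
    color: Tuple[int, int, int] = (0, 255, 0)
    metadata: dict = field(default_factory=dict)


a = Detection("person", 0.91, BoundingBox(10, 20, 30, 40))
b = Detection("car", 0.78, BoundingBox(0, 0, 5, 5), class_id=2)

# Writing to one instance's metadata leaves the other untouched.
a.metadata["track_id"] = 7
```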

Frame dataclass

Video frame with optional detections and metadata.

The frame flows through the callback pipeline, allowing each callback to add detections or modify the image.

image instance-attribute

image: NDArray[uint8]

timestamp class-attribute instance-attribute

timestamp: float = field(default_factory=time)

frame_number class-attribute instance-attribute

frame_number: int = 0

detections class-attribute instance-attribute

detections: list[Detection] = field(default_factory=list)

metadata class-attribute instance-attribute

metadata: dict = field(default_factory=dict)

shape property

shape: tuple[int, int, int]

Image shape (height, width, channels).

height property

height: int

Image height in pixels.

width property

width: int

Image width in pixels.

size property

size: tuple[int, int]

Image size as (width, height).
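
Note that `shape` and `size` order the dimensions differently: `shape` follows NumPy's `(height, width, channels)` layout, while `size` is `(width, height)`. A minimal sketch, using a hypothetical `FrameSketch` class that assumes the properties simply read from `image.shape`:

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass
class FrameSketch:
    """Illustrative stand-in; property bodies are assumptions."""
    image: np.ndarray

    @property
    def shape(self) -> Tuple[int, int, int]:
        return self.image.shape

    @property
    def height(self) -> int:
        return self.image.shape[0]

    @property
    def width(self) -> int:
        return self.image.shape[1]

    @property
    def size(self) -> Tuple[int, int]:
        return (self.width, self.height)


# A blank 640x480 three-channel image: rows (height) come first in NumPy.
frame = FrameSketch(np.zeros((480, 640, 3), dtype=np.uint8))
```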

copy

copy() -> Frame

Create a deep copy of the frame.

draw_detections

draw_detections(thickness: int = 2, font_scale: float = 0.6) -> NDArray[uint8]

Draw detection boxes and labels on a copy of the image.

Parameters:

Name | Type | Description | Default
thickness | int | Line thickness for boxes | 2
font_scale | float | Font scale for labels | 0.6

Returns:

Type | Description
NDArray[uint8] | Annotated image copy

to_rgb

to_rgb() -> NDArray[uint8]

Convert BGR image to RGB.
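
Converting BGR to RGB amounts to reversing the channel axis. The sketch below does this with plain NumPy slicing. It is a stand-in technique for illustration, and the helper name `bgr_to_rgb` is hypothetical; the library's `to_rgb` may use a different routine internally.

```python
import numpy as np


def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis and return a contiguous copy."""
    return image[..., ::-1].copy()


# One pure-blue pixel in BGR order: (B=255, G=0, R=0).
bgr = np.zeros((1, 1, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)

rgb = bgr_to_rgb(bgr)  # the same pixel in RGB order: (R=0, G=0, B=255)
```

Returning a copy leaves the original BGR array unchanged, which matters if later callbacks still expect BGR data.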

to_jpeg

to_jpeg(quality: int = 85) -> bytes

Encode frame as JPEG bytes.

Parameters:

Name | Type | Description | Default
quality | int | JPEG quality (0-100) | 85

Returns:

Type | Description
bytes | JPEG encoded bytes