ractogateway.pipelines.video_processor._models

Models for VideoProcessorPipeline.

class ractogateway.pipelines.video_processor._models.DeduplicationMethod(*values)[source]

Bases: str, Enum

Frame similarity algorithm used for deduplication.

PHASH = 'phash'

SSIM = 'ssim'
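
The enums in this module all derive from both str and Enum, so members can be built from, and compared with, their raw string values. A minimal sketch of that behaviour follows. It uses a local stand-in class that mirrors the documented members rather than importing the real one.

```python
from enum import Enum


# Stand-in mirroring the documented members; the real class lives in
# ractogateway.pipelines.video_processor._models.
class DeduplicationMethod(str, Enum):
    PHASH = "phash"
    SSIM = "ssim"


# Lookup by value, e.g. when the method name comes from a config file.
method = DeduplicationMethod("phash")

# Because of the str mixin, members compare equal to their raw values.
is_phash = method == "phash"
raw_value = method.value
```

Passing a value that is not a member (for example DeduplicationMethod("bogus")) raises ValueError, which is standard Enum behaviour.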

class ractogateway.pipelines.video_processor._models.FrameAnalysisMode(*values)[source]

Bases: str, Enum

How frames are sent to the vision LLM.

INDIVIDUAL = 'individual'

GRID = 'grid'

class ractogateway.pipelines.video_processor._models.VideoProcessingMode(*values)[source]

Bases: str, Enum

How much of the video should be processed.

ACTIVE = 'active'

PASSIVE = 'passive'

class ractogateway.pipelines.video_processor._models.TranscriberBackend(*values)[source]

Bases: str, Enum

Audio transcription backend.

FASTER_WHISPER = 'faster-whisper'

OPENAI_WHISPER = 'openai-whisper'

HUGGINGFACE_LOCAL = 'huggingface-local'

OPENAI_API = 'openai-api'

GOOGLE_API = 'google-api'

HUGGINGFACE_API = 'huggingface-api'

GROQ_API = 'groq-api'

DEEPGRAM_API = 'deepgram-api'

OLLAMA = 'ollama'

class ractogateway.pipelines.video_processor._models.VideoConfig(**data)[source]

Bases: BaseModel

All tunable hyperparameters for VideoProcessorPipeline.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

fps: float: Frames to sample per second of video.

similarity_threshold: float: Discard a frame whose similarity to the previous kept frame is >= this percentage. A lower threshold discards more frames. Range 0-100.

max_frames: int | None: Hard cap on frames kept (None = no cap).

dedup_method: DeduplicationMethod: Algorithm used to compare frame similarity.

frame_format: str: Image format for kept frames: ‘JPEG’ (smaller) or ‘PNG’ (lossless).

analyze_frames: bool: Pass kept frames to the vision LLM for content extraction.

frame_analysis_mode: FrameAnalysisMode: Individual = one LLM call per frame; Grid = stitch frames into a collage.

grid_size: int: Number of frames per grid collage (used when frame_analysis_mode='grid').

batch_size: int: How many frames to submit to the LLM concurrently per batch.

max_workers: int: Thread-pool size for concurrent LLM frame analysis calls.

max_process_workers: int: Process-pool size for CPU-bound frame extraction / hashing.

transcribe_audio: bool: Extract and transcribe the video’s audio track.

transcriber_backend: TranscriberBackend: Which transcription engine to use.

transcriber_model: str: Model name or size; interpretation is backend-specific. Examples:

    faster-whisper / openai-whisper: “tiny”, “base”, “small”, “medium”, “large-v3”
    huggingface-local / huggingface-api: HF model ID, e.g. “openai/whisper-large-v3”
    openai-api: “whisper-1”
    google-api: “long”, “short”, “latest_long”
    groq-api: “whisper-large-v3”, “whisper-large-v3-turbo”
    deepgram-api: “nova-3”, “nova-2”, “enhanced”, “base”
    ollama: model name on the server, e.g. “whisper”

transcriber_api_key: str | None: API key for cloud transcription backends (falls back to env vars).

transcriber_base_url: str | None: Base URL for self-hosted / Ollama transcription endpoints.

language: str | None: BCP-47 language code (e.g. ‘en’, ‘fr’). None = auto-detect.

generate_summary: bool: Generate a comprehensive textual summary at the end.

store_in_rag: bool: Push all extracted content into the supplied rag_pipeline for Q&A.

processing_mode: VideoProcessingMode: active processes full video; passive processes only a time window.

focus_time_seconds: float | None: Center timestamp in seconds for passive mode (e.g. 130 for 02:10).

window_seconds: float: Passive-mode half-window size in seconds (focus ± window_seconds).

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
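
In passive mode the processed window is described as focus ± window_seconds. A minimal sketch of that arithmetic follows. The helper name passive_window and the clamping to the start and end of the video are assumptions made for illustration, not the library's confirmed behaviour.

```python
from typing import Optional, Tuple


def passive_window(
    focus_time_seconds: float,
    window_seconds: float,
    video_duration: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (start, end) for focus ± window_seconds.

    Clamping to [0, video_duration] is an assumption of this sketch.
    """
    start = max(0.0, focus_time_seconds - window_seconds)
    end = focus_time_seconds + window_seconds
    if video_duration is not None:
        end = min(end, video_duration)
    return start, end


# Focus at 02:10 with a 15 s half-window.
window = passive_window(130.0, 15.0)

# A focus point near the start of a short clip gets clamped on both sides.
clamped = passive_window(5.0, 15.0, video_duration=12.0)
```

Here window is (115.0, 145.0) and clamped is (0.0, 12.0). These correspond to the window_start_seconds and window_end_seconds fields reported on VideoProcessorResult.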

class ractogateway.pipelines.video_processor._models.FrameEntry(**data)[source]

Bases: BaseModel

One video frame, after extraction and optional analysis.

frame_id: int: Zero-based sequential frame identifier.

timestamp: float: Position in the video in seconds.

similarity_to_prev: float | None: Similarity percentage to the previous kept frame (None for first frame).

kept: bool: False if discarded by the deduplication step.

analysis: str | None: LLM-generated description of visual content (whiteboard, screen, etc.).

image_data: bytes | None: Raw image bytes for kept + analyzed frames.

image_format: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
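
The relationship between similarity_to_prev, kept, and VideoConfig.similarity_threshold can be sketched as follows. This is an illustration under stated assumptions: FrameEntry is a dataclass stand-in with a subset of the documented fields, frames are represented by precomputed 64-bit integer hashes, and hash_similarity and deduplicate are hypothetical helpers, not the library's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FrameEntry:  # stand-in with a subset of the documented fields
    frame_id: int
    timestamp: float
    similarity_to_prev: Optional[float] = None
    kept: bool = True


def hash_similarity(a: int, b: int, bits: int = 64) -> float:
    """Percentage of matching bits between two integer hashes."""
    differing = bin(a ^ b).count("1")
    return 100.0 * (bits - differing) / bits


def deduplicate(hashes: List[int], fps: float, threshold: float) -> List[FrameEntry]:
    """Mark a frame as discarded when it is >= threshold % similar
    to the most recent kept frame."""
    entries: List[FrameEntry] = []
    last_kept_hash: Optional[int] = None
    for i, h in enumerate(hashes):
        entry = FrameEntry(frame_id=i, timestamp=i / fps)
        if last_kept_hash is not None:
            entry.similarity_to_prev = hash_similarity(h, last_kept_hash)
            entry.kept = entry.similarity_to_prev < threshold
        if entry.kept:
            last_kept_hash = h
        entries.append(entry)
    return entries


frames = deduplicate([0x0, 0x1, 0xFFFFFFFFFFFFFFFF], fps=1.0, threshold=90.0)
kept_ids = [f.frame_id for f in frames if f.kept]
```

With these inputs the second frame differs from the first by one bit (98.4375 % similar) and is discarded, while the third differs in every bit and is kept, so kept_ids is [0, 2].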

class ractogateway.pipelines.video_processor._models.TranscriptSegment(**data)[source]

Bases: BaseModel

A time-bounded transcription segment aligned to frame IDs.

start: float: Segment start time in seconds.

end: float: Segment end time in seconds.

text: str: Transcribed text for this segment.

frame_ids: list[int]: IDs of kept frames whose timestamps fall within [start, end].

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
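
The frame_ids alignment described above (kept frames whose timestamps fall within the closed interval [start, end]) can be sketched as below. TranscriptSegment here is a dataclass stand-in for the pydantic model, and align_frames is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TranscriptSegment:  # stand-in for the documented model
    start: float
    end: float
    text: str
    frame_ids: List[int] = field(default_factory=list)


def align_frames(
    segment: TranscriptSegment,
    kept_frames: List[Tuple[int, float]],
) -> TranscriptSegment:
    """Attach the IDs of kept frames whose timestamp lies in [start, end]."""
    segment.frame_ids = [
        frame_id
        for frame_id, timestamp in kept_frames
        if segment.start <= timestamp <= segment.end
    ]
    return segment


# (frame_id, timestamp in seconds) pairs for frames that survived deduplication.
kept_frames = [(0, 0.0), (2, 2.0), (5, 5.0), (9, 9.0)]
segment = align_frames(TranscriptSegment(2.0, 5.0, "hello"), kept_frames)
```

Both interval endpoints are inclusive, so segment.frame_ids is [2, 5].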

class ractogateway.pipelines.video_processor._models.VideoSection(**data)[source]

Bases: BaseModel

A merged time section combining visual analysis + audio transcript.

timestamp_start: float

timestamp_end: float

frame_ids: list[int]

visual_content: str: Combined LLM analyses for all frames in this section.

audio_content: str: Concatenated transcript text for this section’s time range.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class ractogateway.pipelines.video_processor._models.VideoProcessorUsage(**data)[source]

Bases: BaseModel

Accounting of tokens and frame counts across the full pipeline.

frames_extracted: int

frames_kept: int

frames_discarded: int

analysis_input_tokens: int

analysis_output_tokens: int

summary_input_tokens: int

summary_output_tokens: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

audio_duration_seconds: float

property total_analysis_tokens: int

property total_summary_tokens: int

property total_tokens: int
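
The three total_* properties are not described beyond their names. One plausible reading, shown here purely as an assumption, is that each is the sum of the matching input and output counters, and that total_tokens is the sum of the analysis and summary totals. The class below is a dataclass stand-in covering only the token fields.

```python
from dataclasses import dataclass


@dataclass
class VideoProcessorUsage:  # stand-in covering only the token fields
    analysis_input_tokens: int = 0
    analysis_output_tokens: int = 0
    summary_input_tokens: int = 0
    summary_output_tokens: int = 0

    @property
    def total_analysis_tokens(self) -> int:
        # Assumed: input + output tokens spent on frame analysis.
        return self.analysis_input_tokens + self.analysis_output_tokens

    @property
    def total_summary_tokens(self) -> int:
        # Assumed: input + output tokens spent on the final summary.
        return self.summary_input_tokens + self.summary_output_tokens

    @property
    def total_tokens(self) -> int:
        # Assumed: everything spent across both LLM stages.
        return self.total_analysis_tokens + self.total_summary_tokens


usage = VideoProcessorUsage(
    analysis_input_tokens=1200,
    analysis_output_tokens=300,
    summary_input_tokens=800,
    summary_output_tokens=200,
)
```

Under these assumptions usage.total_analysis_tokens is 1500, usage.total_summary_tokens is 1000, and usage.total_tokens is 2500.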

class ractogateway.pipelines.video_processor._models.StageError(**data)[source]

Bases: BaseModel

Structured record of a failure in one pipeline stage.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

stage: str: Name of the pipeline stage that failed (e.g. ‘extract’, ‘transcribe’).

error_type: str: Exception class name (e.g. ‘ImportError’, ‘RuntimeError’).

message: str: str(exc) — the error message.

traceback: str | None: Full Python traceback as a string (available in safe_mode).
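
The four fields map naturally onto a caught exception. The sketch below shows one way such a record could be built, using a dataclass stand-in for StageError and a hypothetical record_failure helper. How the pipeline itself populates these fields is not documented here.

```python
import traceback as tb
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageError:  # stand-in for the documented model
    stage: str
    error_type: str
    message: str
    traceback: Optional[str] = None


def record_failure(stage: str, exc: BaseException) -> StageError:
    """Convert a caught exception into a structured stage error."""
    return StageError(
        stage=stage,
        error_type=type(exc).__name__,  # e.g. 'RuntimeError'
        message=str(exc),
        traceback="".join(
            tb.format_exception(type(exc), exc, exc.__traceback__)
        ),
    )


try:
    raise RuntimeError("ffmpeg not found")
except RuntimeError as exc:
    err = record_failure("extract", exc)
```

After this runs, err.stage is 'extract', err.error_type is 'RuntimeError', err.message is 'ffmpeg not found', and err.traceback holds the formatted traceback text.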

class ractogateway.pipelines.video_processor._models.VideoProcessorResult(**data)[source]

Bases: BaseModel

Full output of a VideoProcessorPipeline run.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

video_path: str: Original source identifier (path, URL, or ‘<bytes>’ for buffer input).

frames: list[FrameEntry]: All extracted frames (kept and discarded).

transcript: list[TranscriptSegment]: Audio transcript segmented by timestamp.

sections: list[VideoSection]: Merged visual + audio sections ordered by time.

summary: str | None: Comprehensive LLM-generated summary of the entire video.

rag_stored: bool

rag_chunk_count: int

usage: VideoProcessorUsage

error: str | None: Short description of the first fatal error (backward-compatible).

failed_stage: str | None: Name of the stage that caused a fatal pipeline abort, if any.

stage_errors: list[StageError]: All per-stage errors collected during the run (fatal + non-fatal).

processing_mode: VideoProcessingMode: Whether this run processed full video (active) or a window (passive).

window_start_seconds: float | None: Passive-mode window start timestamp in source-video seconds.

window_end_seconds: float | None: Passive-mode window end timestamp in source-video seconds.

question: str | None: Optional user question answered from this run.

answer: str | None: Answer generated for question, when question-answer mode is used.

property has_errors: bool: True if any stage encountered an error.

property is_failed: bool: True if the pipeline aborted early due to a fatal stage error.

get_transcript_text()[source]

Full transcript as a single string.

Return type: str

get_all_visual_content()[source]

All frame analyses concatenated in timestamp order.

Return type: str

to_json(path=None, *, indent=2)[source]

Serialise result to JSON. Returns JSON string if path is None.

Return type: str | None

to_markdown(path=None)[source]

Build a structured Markdown report. Returns string if path is None.

Return type: str | None
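
Both to_json and to_markdown follow the same convention: return the rendered string when path is None. The standalone function below sketches that pattern for JSON. It is an illustration, not the method's source: it operates on a plain dict, and returning None after writing to path is an assumption inferred from the str | None return type.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional, Union


def to_json(
    data: Dict[str, Any],
    path: Optional[Union[str, Path]] = None,
    *,
    indent: int = 2,
) -> Optional[str]:
    """Return the JSON text, or write it to `path` and return None."""
    text = json.dumps(data, indent=indent)
    if path is None:
        return text
    Path(path).write_text(text, encoding="utf-8")
    return None


payload = {"video_path": "<bytes>", "summary": "demo", "rag_stored": False}

# No path: the serialised string is returned to the caller.
as_text = to_json(payload)

# With a path: the same text is written to disk instead.
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "result.json"
    returned = to_json(payload, out_file)
    written = out_file.read_text(encoding="utf-8")
```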

exception ractogateway.pipelines.video_processor._models.VideoRateLimitExceededError[source]

Bases: RuntimeError

Raised when a rate_limiter denies a VideoProcessorPipeline request.