ractogateway.pipelines.video_processor._models
Models for VideoProcessorPipeline.
- class ractogateway.pipelines.video_processor._models.DeduplicationMethod(*values)[source]
Frame similarity algorithm used for deduplication.
- PHASH = 'phash'
- SSIM = 'ssim'
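The threshold rule described for similarity_threshold in VideoConfig (discard a frame whose similarity to the previous kept frame is >= the threshold) can be sketched as follows. This is only an illustration under stated assumptions: the helper names hamming_similarity and deduplicate are hypothetical, and the similarity measure here (share of matching bits between two 64-bit hashes, in the spirit of a perceptual hash comparison) is a stand-in, not the library's actual PHASH or SSIM computation.

```python
from __future__ import annotations


def hamming_similarity(hash_a: int, hash_b: int, bits: int = 64) -> float:
    """Percentage of matching bits between two fixed-width hashes (hypothetical helper)."""
    mask = (1 << bits) - 1
    differing = bin((hash_a ^ hash_b) & mask).count("1")
    return 100.0 * (bits - differing) / bits


def deduplicate(hashes: list[int], threshold: float) -> list[bool]:
    """Return one kept/discarded flag per frame.

    A frame is discarded when its similarity to the previous *kept*
    frame is >= threshold (0-100). The first frame is always kept.
    """
    kept: list[bool] = []
    last_kept: int | None = None
    for h in hashes:
        if last_kept is not None and hamming_similarity(last_kept, h) >= threshold:
            kept.append(False)  # near-duplicate of the last kept frame
            continue
        kept.append(True)
        last_kept = h
    return kept


# Two near-identical pairs of toy hashes: the second of each pair is dropped.
flags = deduplicate(
    [0x0, 0x1, 0xFFFF_FFFF_FFFF_FFFF, 0xFFFF_FFFF_FFFF_FFFE],
    threshold=95.0,
)
print(flags)
```

Note that each comparison is made against the last kept frame rather than the immediately preceding one, so a slow drift across many frames eventually produces a new kept frame.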
- class ractogateway.pipelines.video_processor._models.FrameAnalysisMode(*values)[source]
How frames are sent to the vision LLM.
- INDIVIDUAL = 'individual'
- GRID = 'grid'
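To make the difference between the two modes concrete, the sketch below groups kept frame IDs into collages of grid_size frames, so the number of groups equals the number of LLM calls in grid mode. The function chunk_frames is hypothetical, and how the library treats a partially filled final grid is an assumption here.

```python
from __future__ import annotations


def chunk_frames(frame_ids: list[int], grid_size: int) -> list[list[int]]:
    """Split frame IDs into consecutive groups of at most grid_size (hypothetical helper)."""
    if grid_size < 1:
        raise ValueError("grid_size must be >= 1")
    return [
        frame_ids[i : i + grid_size]
        for i in range(0, len(frame_ids), grid_size)
    ]


kept_frame_ids = list(range(10))
groups = chunk_frames(kept_frame_ids, grid_size=4)

individual_calls = len(kept_frame_ids)  # one call per frame
grid_calls = len(groups)                # one call per collage
print(groups, individual_calls, grid_calls)
```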
- class ractogateway.pipelines.video_processor._models.VideoProcessingMode(*values)[source]
How much of the video should be processed.
- ACTIVE = 'active'
- PASSIVE = 'passive'
- class ractogateway.pipelines.video_processor._models.TranscriberBackend(*values)[source]
Audio transcription backend.
- FASTER_WHISPER = 'faster-whisper'
- OPENAI_WHISPER = 'openai-whisper'
- HUGGINGFACE_LOCAL = 'huggingface-local'
- OPENAI_API = 'openai-api'
- GOOGLE_API = 'google-api'
- HUGGINGFACE_API = 'huggingface-api'
- GROQ_API = 'groq-api'
- DEEPGRAM_API = 'deepgram-api'
- OLLAMA = 'ollama'
- class ractogateway.pipelines.video_processor._models.VideoConfig(**data)[source]
Bases: BaseModel
All tunable hyperparameters for VideoProcessorPipeline.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- fps: float
Frames to sample per second of video.
- similarity_threshold: float
Discard a frame whose similarity to the previous kept frame is >= this percentage. Range 0-100. Under this rule, a lower value discards more frames and a higher value keeps more.
- dedup_method: DeduplicationMethod
Algorithm used to compare frame similarity.
- frame_format: str
‘JPEG’ (smaller) or ‘PNG’ (lossless).
- Type:
Image format for kept frames
- analyze_frames: bool
Pass kept frames to the vision LLM for content extraction.
- frame_analysis_mode: FrameAnalysisMode
Individual = one LLM call per frame; Grid = stitch frames into a collage.
- grid_size: int
Number of frames per grid collage (used when frame_analysis_mode=’grid’).
- batch_size: int
How many frames to submit to the LLM concurrently per batch.
- max_workers: int
Thread-pool size for concurrent LLM frame analysis calls.
- max_process_workers: int
Process-pool size for CPU-bound frame extraction / hashing.
- transcribe_audio: bool
Extract and transcribe the video’s audio track.
- transcriber_backend: TranscriberBackend
Which transcription engine to use.
- transcriber_model: str
Model name / size — interpretation is backend-specific.
- Examples:
faster-whisper / openai-whisper: “tiny”, “base”, “small”, “medium”, “large-v3”
huggingface-local / huggingface-api: HF model ID, e.g. “openai/whisper-large-v3”
openai-api: “whisper-1”
google-api: “long”, “short”, “latest_long”
groq-api: “whisper-large-v3”, “whisper-large-v3-turbo”
deepgram-api: “nova-3”, “nova-2”, “enhanced”, “base”
ollama: model name on the server, e.g. “whisper”
- generate_summary: bool
Generate a comprehensive textual summary at the end.
- store_in_rag: bool
Push all extracted content into the supplied rag_pipeline for Q&A.
- processing_mode: VideoProcessingMode
active processes the full video; passive processes only a time window around focus_time_seconds.
- focus_time_seconds: float | None
10).
- Type:
Center timestamp in seconds for passive mode (e.g. 130 for 02
- window_seconds: float
Passive-mode half-window size in seconds (focus ± window_seconds).
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
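The interaction of fps, focus_time_seconds, and window_seconds in passive mode can be illustrated with a little arithmetic. The sketch below assumes the window is clamped to the video's duration and that both endpoints are sampled; neither detail is confirmed by this reference, and passive_window and sample_timestamps are hypothetical names.

```python
from __future__ import annotations


def passive_window(focus: float, window: float, duration: float) -> tuple[float, float]:
    """Return (start, end) for focus ± window, clamped to [0, duration]."""
    start = max(0.0, focus - window)
    end = min(duration, focus + window)
    return start, end


def sample_timestamps(start: float, end: float, fps: float) -> list[float]:
    """Timestamps sampled at `fps` frames per second, endpoints included."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    count = int((end - start) * fps) + 1
    return [start + i / fps for i in range(count)]


# Focus at 02:10 (130 s) with a 10 s half-window in a 300 s video.
start, end = passive_window(focus=130.0, window=10.0, duration=300.0)
timestamps = sample_timestamps(start, end, fps=0.5)  # one frame every 2 s
print(start, end, len(timestamps))
```

With these values the window spans 120 s to 140 s, and sampling at 0.5 fps yields 11 candidate frames before deduplication.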
- class ractogateway.pipelines.video_processor._models.FrameEntry(**data)[source]
Bases: BaseModel
One video frame, after extraction and optional analysis.
- frame_id: int
Zero-based sequential frame identifier.
- timestamp: float
Position in the video in seconds.
- similarity_to_prev: float | None
Similarity percentage to the previous kept frame (None for first frame).
- kept: bool
False if discarded by the deduplication step.
- image_format: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.pipelines.video_processor._models.TranscriptSegment(**data)[source]
Bases: BaseModel
A time-bounded transcription segment aligned to frame IDs.
- start: float
Segment start time in seconds.
- end: float
Segment end time in seconds.
- text: str
Transcribed text for this segment.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.pipelines.video_processor._models.VideoSection(**data)[source]
Bases: BaseModel
A merged time section combining visual analysis + audio transcript.
- timestamp_start: float
- timestamp_end: float
- visual_content: str
Combined LLM analyses for all frames in this section.
- audio_content: str
Concatenated transcript text for this section’s time range.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
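One way to picture how audio_content could be assembled is to concatenate the text of every transcript segment that overlaps the section's time range. The sketch below uses a stand-in dataclass rather than the real TranscriptSegment model, and the overlap rule and space-joined output are assumptions, not the library's documented behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in for TranscriptSegment with the same three fields."""
    start: float
    end: float
    text: str


def audio_for_range(segments: list[Segment], t_start: float, t_end: float) -> str:
    """Join the text of segments overlapping [t_start, t_end), ordered by start time."""
    overlapping = [s for s in segments if s.start < t_end and s.end > t_start]
    overlapping.sort(key=lambda s: s.start)
    return " ".join(s.text.strip() for s in overlapping)


segments = [
    Segment(0.0, 4.0, "Hello"),
    Segment(4.0, 9.0, "world"),
    Segment(9.0, 12.0, "again"),
]
section_audio = audio_for_range(segments, t_start=3.0, t_end=9.0)
print(section_audio)
```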
- class ractogateway.pipelines.video_processor._models.VideoProcessorUsage(**data)[source]
Bases: BaseModel
Accounting of tokens and frame counts across the full pipeline.
- frames_extracted: int
- frames_kept: int
- frames_discarded: int
- analysis_input_tokens: int
- analysis_output_tokens: int
- summary_input_tokens: int
- summary_output_tokens: int
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- audio_duration_seconds: float
- property total_analysis_tokens: int
- property total_summary_tokens: int
- property total_tokens: int
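The three total_* properties are not described in this reference. A plausible reading, sketched below with a stand-in dataclass, is that each total is the sum of the corresponding input and output counts, and total_tokens is the sum of both stages. Treat this as an assumption about the arithmetic, not the library's implementation.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Stand-in mirroring the token fields of VideoProcessorUsage."""
    analysis_input_tokens: int = 0
    analysis_output_tokens: int = 0
    summary_input_tokens: int = 0
    summary_output_tokens: int = 0

    @property
    def total_analysis_tokens(self) -> int:
        # Assumed: input + output tokens of the frame-analysis stage.
        return self.analysis_input_tokens + self.analysis_output_tokens

    @property
    def total_summary_tokens(self) -> int:
        # Assumed: input + output tokens of the summary stage.
        return self.summary_input_tokens + self.summary_output_tokens

    @property
    def total_tokens(self) -> int:
        # Assumed: grand total across both stages.
        return self.total_analysis_tokens + self.total_summary_tokens


usage = UsageSketch(
    analysis_input_tokens=1200,
    analysis_output_tokens=300,
    summary_input_tokens=500,
    summary_output_tokens=100,
)
print(usage.total_analysis_tokens, usage.total_summary_tokens, usage.total_tokens)
```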
- class ractogateway.pipelines.video_processor._models.StageError(**data)[source]
Bases: BaseModel
Structured record of a failure in one pipeline stage.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- stage: str
Name of the pipeline stage that failed (e.g. ‘extract’, ‘transcribe’).
- error_type: str
Exception class name (e.g. ‘ImportError’, ‘RuntimeError’).
- message: str
str(exc) — the error message.
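The field descriptions above map directly onto a caught exception: error_type is the exception's class name and message is str(exc). The sketch below shows that mapping with a stand-in dataclass; record_failure is a hypothetical helper, and the error text is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class StageErrorSketch:
    """Stand-in with the same three fields as StageError."""
    stage: str
    error_type: str
    message: str


def record_failure(stage: str, exc: BaseException) -> StageErrorSketch:
    """Turn a caught exception into a structured record (hypothetical helper)."""
    return StageErrorSketch(
        stage=stage,
        error_type=type(exc).__name__,  # e.g. 'ImportError'
        message=str(exc),
    )


try:
    raise ImportError("optional transcription dependency is not installed")
except ImportError as exc:
    err = record_failure("transcribe", exc)

print(err)
```

Storing the class name and message as strings keeps the record serialisable, which matters when the result is later written out as JSON.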
- class ractogateway.pipelines.video_processor._models.VideoProcessorResult(**data)[source]
Bases: BaseModel
Full output of a VideoProcessorPipeline run.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- video_path: str
Original source identifier (path, URL, or ‘<bytes>’ for buffer input).
- frames: list[FrameEntry]
All extracted frames (kept and discarded).
- transcript: list[TranscriptSegment]
Audio transcript segmented by timestamp.
- sections: list[VideoSection]
Merged visual + audio sections ordered by time.
- rag_stored: bool
- rag_chunk_count: int
- usage: VideoProcessorUsage
- stage_errors: list[StageError]
All per-stage errors collected during the run (fatal + non-fatal).
- processing_mode: VideoProcessingMode
Whether this run processed the full video (active) or only a time window (passive).
- property has_errors: bool
True if any stage encountered an error.
- property is_failed: bool
True if the pipeline aborted early due to a fatal stage error.
- get_all_visual_content()[source]
All frame analyses concatenated in timestamp order.
- Return type:
- to_json(path=None, *, indent=2)[source]
Serialise result to JSON. Returns JSON string if path is None.
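The to_json(path=None, *, indent=2) signature suggests a common pattern: return the JSON text when no path is given, otherwise write it to disk. The sketch below illustrates that pattern on a plain dictionary using only the standard library. What the real method returns when a path is supplied is not stated here, so returning None in that case is an assumption, and to_json_sketch is a hypothetical name.

```python
from __future__ import annotations

import json
from pathlib import Path


def to_json_sketch(data: dict, path: str | None = None, *, indent: int = 2) -> str | None:
    """Serialise `data`; return the string if path is None, else write it to `path`."""
    text = json.dumps(data, indent=indent)
    if path is None:
        return text
    Path(path).write_text(text, encoding="utf-8")
    return None  # assumed: nothing is returned when writing to a file


text = to_json_sketch({"video_path": "<bytes>", "rag_stored": False})
print(text)
```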
- exception ractogateway.pipelines.video_processor._models.VideoRateLimitExceededError[source]
Bases: RuntimeError
Raised when a rate_limiter denies a VideoProcessorPipeline request.