ractogateway.pipelines.video_processor._transcriber

Audio extraction and transcription for VideoProcessorPipeline.

Supports 9 transcription backends via a pluggable BaseTranscriber:

Local / open-source:

faster-whisper — faster-whisper library (default)

openai-whisper — openai-whisper library

huggingface-local — HuggingFace transformers ASR pipeline

Cloud APIs:

openai-api — OpenAI Whisper API

google-api — Google Cloud Speech-to-Text v2

huggingface-api — HuggingFace Inference API

groq-api — Groq Whisper (ultra-fast cloud)

deepgram-api — Deepgram Nova

Self-hosted:

ollama — Ollama server (audio-capable models)

ractogateway.pipelines.video_processor._transcriber.extract_audio(video_path, *, start_time_seconds=None, end_time_seconds=None)[source]

Extract audio from video_path to a WAV temp file via ffmpeg-python.

When start/end bounds are provided, only that time window is extracted.

Return type:

Path
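The time-window behaviour described above can be illustrated with a minimal sketch that builds an equivalent ffmpeg command line. This is not the module's implementation, which goes through ffmpeg-python. The helper name build_ffmpeg_args and the 16 kHz mono 16-bit PCM target format are assumptions made for illustration only.

```python
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_args(
    video_path: Path,
    wav_path: Path,
    start_time_seconds: Optional[float] = None,
    end_time_seconds: Optional[float] = None,
) -> List[str]:
    """Build an ffmpeg command line that writes a WAV file (hypothetical helper)."""
    args = ["ffmpeg", "-y"]  # -y overwrites the output file if it already exists
    # Given before -i, -ss and -to act as input options and bound the window read.
    if start_time_seconds is not None:
        args += ["-ss", f"{start_time_seconds:.3f}"]
    if end_time_seconds is not None:
        args += ["-to", f"{end_time_seconds:.3f}"]
    args += ["-i", str(video_path)]
    # -vn drops the video stream; the sample rate and channel count are assumed values.
    args += ["-vn", "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", str(wav_path)]
    return args
```

The resulting list can be passed to subprocess.run(args, check=True). Omitting both bounds yields a command that extracts the full audio track.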

ractogateway.pipelines.video_processor._transcriber.get_audio_duration(audio_path)[source]

Return audio duration in seconds using ffmpeg probe.

Return type:

float
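As a rough sketch of how a duration can be read from probe output: ffprobe reports the container duration as a decimal string under format.duration, and ffmpeg-python's ffmpeg.probe returns that metadata as a dictionary. The helper name duration_from_probe and its error handling are assumptions, not taken from the module.

```python
from typing import Any, Mapping


def duration_from_probe(probe: Mapping[str, Any]) -> float:
    """Return the duration in seconds from ffprobe-style metadata (hypothetical helper)."""
    raw = probe.get("format", {}).get("duration")
    if raw is None:
        raise ValueError("probe result has no format.duration field")
    # ffprobe reports the duration as a decimal string, e.g. "12.480000".
    return float(raw)
```

With ffmpeg-python installed, one plausible call is duration_from_probe(ffmpeg.probe(str(audio_path))).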

ractogateway.pipelines.video_processor._transcriber.align_frames_to_transcript(frames, segments)[source]

Assign frame IDs to transcript segments by timestamp overlap.

Return type:

list[TranscriptSegment]
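One way to picture the alignment step is the sketch below. The Frame and Segment dataclasses are hypothetical stand-ins (the real frame and TranscriptSegment fields are not shown in this reference), and the half-open interval rule start <= timestamp < end is an assumption chosen so that a frame on a boundary belongs to exactly one segment.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class Frame:  # hypothetical stand-in for an extracted video frame
    frame_id: str
    timestamp: float  # seconds from the start of the video


@dataclass
class Segment:  # hypothetical stand-in for TranscriptSegment
    start: float
    end: float
    text: str
    frame_ids: List[str] = field(default_factory=list)


def align(frames: List[Frame], segments: List[Segment]) -> List[Segment]:
    """Return copies of the segments with the IDs of frames that fall inside each one."""
    aligned = []
    for seg in segments:
        ids = [f.frame_id for f in frames if seg.start <= f.timestamp < seg.end]
        # replace() builds a new Segment, so the input list is left untouched.
        aligned.append(replace(seg, frame_ids=ids))
    return aligned
```

Frames that fall outside every segment, such as those captured during silence, are simply not assigned under this rule.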

class ractogateway.pipelines.video_processor._transcriber.BaseTranscriber[source]

Bases: ABC

Abstract interface for all transcription backends.

abstractmethod transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]
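The pluggable interface can be sketched with standalone stand-ins, shown here without importing the library. Transcriber mirrors the documented transcribe(audio_path, language) signature, while Segment, FixedTextTranscriber, and the Optional[str] type for language are illustrative assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class Segment:  # hypothetical stand-in for TranscriptSegment
    start: float
    end: float
    text: str


class Transcriber(ABC):  # stand-in for BaseTranscriber
    @abstractmethod
    def transcribe(self, audio_path: Path, language: Optional[str]) -> List[Segment]:
        """Transcribe an audio file and return time-stamped segments."""


class FixedTextTranscriber(Transcriber):
    """Toy backend that ignores the audio and returns one canned segment."""

    def __init__(self, text: str, duration: float) -> None:
        self.text = text
        self.duration = duration

    def transcribe(self, audio_path: Path, language: Optional[str]) -> List[Segment]:
        return [Segment(0.0, self.duration, self.text)]
```

Because transcribe is abstract, instantiating Transcriber directly raises TypeError. A canned backend like this one can be handy for exercising downstream code without running a speech model.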

class ractogateway.pipelines.video_processor._transcriber.FasterWhisperTranscriber(model_size='base')[source]

Bases: BaseTranscriber

Local transcription using the faster-whisper library.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.OpenAIWhisperTranscriber(model_size='base')[source]

Bases: BaseTranscriber

Local transcription using the openai-whisper library.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.HuggingFaceLocalTranscriber(model_id='openai/whisper-base')[source]

Bases: BaseTranscriber

Local ASR transcription via HuggingFace transformers pipeline.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.OpenAIAPITranscriber(model='whisper-1', api_key=None)[source]

Bases: BaseTranscriber

Cloud transcription via OpenAI Whisper API.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.GoogleAPITranscriber(model='long', api_key=None)[source]

Bases: BaseTranscriber

Cloud transcription via Google Cloud Speech-to-Text v2.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.HuggingFaceAPITranscriber(model_id='openai/whisper-large-v3', api_key=None)[source]

Bases: BaseTranscriber

Cloud transcription via HuggingFace Inference API.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.GroqTranscriber(model='whisper-large-v3', api_key=None)[source]

Bases: BaseTranscriber

Cloud transcription via Groq Whisper API (ultra-fast).

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.DeepgramTranscriber(model='nova-3', api_key=None)[source]

Bases: BaseTranscriber

Cloud transcription via Deepgram Nova.

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

class ractogateway.pipelines.video_processor._transcriber.OllamaTranscriber(model='whisper', base_url=None)[source]

Bases: BaseTranscriber

Self-hosted transcription via Ollama server (audio-capable models).

transcribe(audio_path, language)[source]

Transcribe audio file and return time-stamped segments.

Return type:

list[TranscriptSegment]

ractogateway.pipelines.video_processor._transcriber.get_transcriber(backend, model, api_key, base_url)[source]

Return the concrete BaseTranscriber for the given backend.

Return type:

BaseTranscriber
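The dispatch performed by a factory of this kind might look like the sketch below. The backend names and default models are copied from the listing and class signatures above. StubTranscriber, pick_transcriber, the fallback to a default model, and the ValueError for unknown backends are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Default model per backend, as shown in the class signatures in this reference.
DEFAULT_MODELS: Dict[str, str] = {
    "faster-whisper": "base",
    "openai-whisper": "base",
    "huggingface-local": "openai/whisper-base",
    "openai-api": "whisper-1",
    "google-api": "long",
    "huggingface-api": "openai/whisper-large-v3",
    "groq-api": "whisper-large-v3",
    "deepgram-api": "nova-3",
    "ollama": "whisper",
}


@dataclass
class StubTranscriber:  # hypothetical placeholder for a concrete backend class
    backend: str
    model: str
    api_key: Optional[str] = None
    base_url: Optional[str] = None


def pick_transcriber(
    backend: str,
    model: Optional[str] = None,
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
) -> StubTranscriber:
    """Resolve a backend name to a configured transcriber (illustrative only)."""
    if backend not in DEFAULT_MODELS:
        known = ", ".join(sorted(DEFAULT_MODELS))
        raise ValueError(f"unknown backend {backend!r}; expected one of: {known}")
    # Fall back to the backend's default model when none is given (assumed behaviour).
    return StubTranscriber(backend, model or DEFAULT_MODELS[backend], api_key, base_url)
```

A table-driven lookup keeps the nine backends in one place, so adding a backend amounts to adding one entry.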