ractogateway.finetune.openai_tuner

OpenAI fine-tuning adapter for RactoGateway.

Workflow

  1. Build a RactoDataset.

  2. Call OpenAIFineTuner.run_pipeline() for a one-shot end-to-end run, or call the lower-level methods individually:

    1. upload_dataset()file_id

    2. create_job()job_id

    3. wait_for_completion()fine_tuned_model

Supported base models (as of 2025)

  • gpt-4o-mini-2024-07-18 — recommended; cost-effective

  • gpt-4o-2024-08-06 — multimodal vision fine-tuning

  • gpt-3.5-turbo-0125 — legacy option

class ractogateway.finetune.openai_tuner.OpenAIFineTuner(api_key=None, *, base_url=None)[source]

Bases: object

Fine-tune OpenAI models using the fine-tuning API.

Parameters:
  • api_key (str | None) – OpenAI API key. Falls back to the OPENAI_API_KEY environment variable when not supplied.

  • base_url (str | None) – Optional custom base URL (Azure OpenAI, proxy, etc.).

Examples

End-to-end pipeline (simplest usage):

from ractogateway.finetune import RactoDataset, OpenAIFineTuner

ds = RactoDataset.from_pairs(
    [("What is Python?", "A high-level programming language.")],
    system="You are a Python tutor.",
)
tuner = OpenAIFineTuner()
model = tuner.run_pipeline(ds, model="gpt-4o-mini-2024-07-18")
print(model)   # "ft:gpt-4o-mini-2024-07-18:org::abc123"
upload_dataset(dataset)[source]

Upload dataset as an OpenAI training file.

Parameters:

dataset (RactoDataset) – The training examples to upload.

Return type:

str

Returns:

str – The OpenAI file ID (e.g. "file-abc123").

create_job(training_file, model='gpt-4o-mini-2024-07-18', *, validation_file=None, n_epochs='auto', batch_size='auto', learning_rate_multiplier='auto', suffix=None)[source]

Submit a fine-tuning job.

Parameters:
  • training_file (str) – File ID returned by upload_dataset().

  • model (str) – Base model to fine-tune.

  • validation_file (str | None) – Optional validation file ID (also produced by upload_dataset()).

  • n_epochs (int | str) – Training epochs.

  • batch_size (int | str) – Per-device batch size.

  • learning_rate_multiplier (float | str) – Scales the default learning rate.

  • suffix (str | None) – Custom label appended to the fine-tuned model name.

Return type:

str

Returns:

str – The fine-tuning job ID (e.g. "ftjob-abc123").

get_status(job_id)[source]

Retrieve the current status of a fine-tuning job.

Return type:

dict[str, Any]

Returns:

dict – Keys: id, status, model, fine_tuned_model, created_at, finished_at, trained_tokens, error.

list_jobs(limit=10)[source]

Return the most recent fine-tuning jobs (newest first).

Return type:

list[dict[str, Any]]

list_events(job_id, limit=20)[source]

Return recent training log events for a job.

Return type:

list[dict[str, Any]]

cancel_job(job_id)[source]

Cancel a running fine-tuning job.

Return type:

dict[str, Any]

wait_for_completion(job_id, *, poll_interval=30, verbose=True)[source]

Block until a fine-tuning job finishes.

Parameters:
  • job_id (str) – The job ID returned by create_job().

  • poll_interval (int) – Seconds between status-check API calls.

  • verbose (bool) – Print status lines to stdout.

Return type:

str

Returns:

str – The fine-tuned model name ready for use in OpenAILLMKit.

Raises:

RuntimeError – If the job ends in "failed" or "cancelled" state.

run_pipeline(dataset, model='gpt-4o-mini-2024-07-18', *, validation_dataset=None, n_epochs='auto', batch_size='auto', learning_rate_multiplier='auto', suffix=None, poll_interval=30, verbose=True)[source]

Validate → upload → train → wait in a single call.

This is the recommended entry-point for most use cases.

Parameters:
  • dataset (RactoDataset) – Training examples.

  • model (str) – Base model to fine-tune.

  • validation_dataset (RactoDataset | None) – Optional held-out validation set (uploaded separately).

  • n_epochs (int | str) – Training hyperparameters. Pass "auto" to let OpenAI decide.

  • batch_size (int | str) – Training hyperparameters. Pass "auto" to let OpenAI decide.

  • learning_rate_multiplier (float | str) – Training hyperparameters. Pass "auto" to let OpenAI decide.

  • suffix (str | None) – Short label appended to the fine-tuned model name.

  • poll_interval (int) – Seconds between status polls while waiting.

  • verbose (bool) – Print progress to stdout.

Return type:

str

Returns:

str – Fine-tuned model identifier — pass directly to OpenAIDeveloperKit(model=...):

kit = opd.OpenAIDeveloperKit(model=fine_tuned_model)

Raises: