ractogateway.ollama_developer_kit.server

OllamaServerManager — start and stop an Ollama server from Python.

Useful when you want to control the Ollama process lifecycle (custom port, programmatic startup/shutdown) directly from your application code.

Usage:

from ractogateway import ollama_developer_kit as local

with local.OllamaServerManager(port=11500) as srv:
    kit = local.Chat(model="llama3.2", base_url=srv.base_url)
    response = kit.chat(local.ChatConfig(user_message="Hello!"))
    print(response.content)

Or manually:

srv = local.OllamaServerManager(port=11500)
srv.start()
...
srv.stop()

class ractogateway.ollama_developer_kit.server.OllamaServerManager(*, host='127.0.0.1', port=11434, startup_timeout=30.0, ollama_bin='ollama')

Bases: object

Manage the lifecycle of an Ollama server subprocess.

The server is started with the OLLAMA_HOST environment variable set to {host}:{port}, which makes Ollama listen on the requested address.
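The OLLAMA_HOST mechanism described above can be sketched roughly as follows. This is an illustrative assumption about how such a launch might look, not the manager's actual implementation; build_ollama_env and launch_ollama are hypothetical helper names that do not appear in the library.

```python
import os
import subprocess


def build_ollama_env(host: str = "127.0.0.1", port: int = 11434) -> dict[str, str]:
    """Copy the current environment and point OLLAMA_HOST at host:port."""
    env = os.environ.copy()
    env["OLLAMA_HOST"] = f"{host}:{port}"
    return env


def launch_ollama(host: str, port: int, ollama_bin: str = "ollama") -> subprocess.Popen:
    """Start `ollama serve` with the modified environment (hypothetical helper).

    Raises FileNotFoundError if ollama_bin cannot be found on PATH.
    """
    return subprocess.Popen([ollama_bin, "serve"], env=build_ollama_env(host, port))
```

Copying os.environ rather than passing a fresh dict keeps PATH and other variables the child process may need.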

Parameters:

host (str) – Bind address. Defaults to "127.0.0.1" (localhost only).
port (int) – TCP port for the Ollama REST API. Defaults to 11434 (the standard Ollama port). Change this to run multiple Ollama instances or avoid conflicts with an already-running server.
startup_timeout (float) – Seconds to wait for the server to become ready after starting the subprocess. Defaults to 30.0. start() raises TimeoutError if the server does not respond within this window.
ollama_bin (str) – Path to the ollama executable. Defaults to "ollama" (looked up via PATH).

base_url

The full http://{host}:{port} URL of the managed server. Use this to construct an OllamaDeveloperKit:

kit = local.Chat(model="llama3.2", base_url=srv.base_url)

Type: str

Examples

Context manager (recommended — guarantees cleanup):

with local.OllamaServerManager(port=11500) as srv:
    kit = local.Chat(model="llama3.2", base_url=srv.base_url)
    print(kit.chat(local.ChatConfig(user_message="Hi")).content)

Manual start / stop:

srv = OllamaServerManager(port=11500)
srv.start()
try:
    ...
finally:
    srv.stop()
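Why the context-manager form guarantees cleanup can be illustrated with a small stand-in class. LifecycleSketch is hypothetical and launches nothing; it only mirrors the start/stop pattern documented here, under the assumption that entering the with-block starts the server and leaving it stops the server.

```python
class LifecycleSketch:
    """Hypothetical stand-in showing the start/stop context-manager pattern."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> "LifecycleSketch":
        if self.running:
            raise RuntimeError("already running")
        self.running = True
        return self  # returning self allows chaining and `with ... as srv`

    def stop(self) -> None:
        self.running = False  # no error if already stopped

    def __enter__(self) -> "LifecycleSketch":
        return self.start()

    def __exit__(self, exc_type, exc, tb) -> None:
        self.stop()  # runs even when the with-body raises
```

Because __exit__ is called whether or not the body raises, the with-form is equivalent to the try/finally pattern in the manual example.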

property base_url: str

Return http://{host}:{port}.

property is_running: bool

True when the subprocess is alive.

start()

Start the Ollama server subprocess.

Returns self so that the call can be chained:

srv = OllamaServerManager(port=11500).start()

Raises:

RuntimeError – If the server is already running.
FileNotFoundError – If the ollama binary cannot be found.
TimeoutError – If the server does not become ready within startup_timeout seconds.

Return type:

OllamaServerManager
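The startup_timeout behaviour can be pictured as a polling loop. The sketch below is an assumption about one reasonable way to wait for readiness, not the library's code; wait_until_ready, http_probe, and the interval parameter are hypothetical names introduced for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(base_url: str) -> bool:
    """Return True if the server at base_url answers an HTTP request."""
    try:
        with urllib.request.urlopen(base_url, timeout=1.0):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, etc.


def wait_until_ready(
    base_url: str,
    timeout: float = 30.0,
    interval: float = 0.2,
    probe: Callable[[str], bool] = http_probe,
) -> None:
    """Poll probe(base_url) until it succeeds or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe(base_url):
            return
        time.sleep(interval)
    raise TimeoutError(f"server at {base_url} not ready after {timeout}s")
```

time.monotonic() is used for the deadline so that system clock adjustments cannot shorten or extend the wait.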

stop()

Stop the Ollama server subprocess gracefully.

Sends SIGTERM first; if the process doesn’t exit within 5 seconds, SIGKILL is used. Silently does nothing if the server is not running.

Return type: None
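The SIGTERM-then-SIGKILL sequence described above is a common shutdown pattern with subprocess.Popen. The following is a minimal sketch under the assumption that the manager holds a Popen handle; stop_process and its grace parameter are hypothetical names, with the 5-second default taken from the description.

```python
import subprocess
import sys
from typing import Optional


def stop_process(proc: Optional[subprocess.Popen], grace: float = 5.0) -> None:
    """Terminate proc politely, then forcefully if it ignores the request."""
    if proc is None or proc.poll() is not None:
        return  # nothing to do: never started or already exited
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the process so it does not linger as a zombie


if __name__ == "__main__":
    # Demonstrate on a harmless child process that would otherwise sleep.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    stop_process(child)
    print("exited:", child.poll() is not None)
```

On Windows, terminate() and kill() both call TerminateProcess, so the graceful phase has no distinct effect there.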

pull(model)

Pull the given model from the Ollama library into the running server.

Equivalent to running ollama pull <model> in a shell, but scoped to the server managed by this instance via OLLAMA_HOST.

Parameters:

model (str) – Model name, e.g. "llama3.2", "nomic-embed-text".

Raises:

RuntimeError – If the server is not running.
subprocess.CalledProcessError – If ollama pull exits with a non-zero status.

Return type:

None
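Scoping ollama pull to one server via OLLAMA_HOST might look like the sketch below. It is an assumption about the approach, not the library's implementation; pull_model is a hypothetical helper, and the run parameter exists only so the command can be inspected without invoking the real CLI.

```python
import os
import subprocess
from typing import Callable


def pull_model(
    model: str,
    host: str,
    port: int,
    ollama_bin: str = "ollama",
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Run `ollama pull <model>` against the server at host:port."""
    env = os.environ.copy()
    env["OLLAMA_HOST"] = f"{host}:{port}"  # tells the CLI which server to talk to
    # check=True makes subprocess.run raise CalledProcessError on non-zero exit.
    run([ollama_bin, "pull", model], env=env, check=True)
```

Passing the command as a list avoids shell quoting issues with model names.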

list_models()

Return the names of locally available models on this server.

Uses a lightweight HTTP request to the /api/tags endpoint instead of a subprocess, so it works even when the ollama CLI is not on PATH.

Return type: list[str]
Returns: Model names (e.g. ["llama3.2:latest", "nomic-embed-text:latest"]).
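A request to /api/tags can be made with the standard library alone. The sketch below assumes the endpoint returns JSON of the form {"models": [{"name": ...}, ...]}; fetch_model_names and parse_model_names are hypothetical names and not part of the library.

```python
import json
import urllib.request


def parse_model_names(payload: bytes) -> list[str]:
    """Extract model names from an /api/tags JSON response body."""
    data = json.loads(payload)
    return [entry["name"] for entry in data.get("models", [])]


def fetch_model_names(base_url: str, timeout: float = 5.0) -> list[str]:
    """GET {base_url}/api/tags and return the model names it lists."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(resp.read())
```

Keeping the parsing separate from the HTTP call makes the JSON handling easy to check without a running server.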