Redis
The redis module provides distributed utilities that work across multiple server replicas. They act as drop-in replacements or complements for the built-in in-process modules.
Installation
pip install "ractogateway[redis]"
Distributed Exact Cache
RedisExactCache is a drop-in for the in-process ExactMatchCache. Pass it to any developer kit’s exact_cache= parameter:
from ractogateway.redis import RedisExactCache
from ractogateway import openai_developer_kit as gpt
cache = RedisExactCache(url="redis://localhost:6379/0", ttl_seconds=3600)
kit = gpt.OpenAIDeveloperKit(model="gpt-4o", exact_cache=cache)
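To make the idea of an exact-match cache with a TTL concrete, here is a minimal in-memory sketch. It is not RedisExactCache's actual implementation: the key scheme (a SHA-256 hash of the model and prompt) and the names make_cache_key and TinyExactCache are hypothetical, chosen only for illustration.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Optional, Tuple


def make_cache_key(model: str, prompt: str) -> str:
    """Build a deterministic key from the exact request (hypothetical scheme)."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TinyExactCache:
    """In-memory stand-in for an exact-match cache with per-entry expiry."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        # Store the value together with its absolute expiry time.
        self._store[key] = (self._clock() + self._ttl, value)
```

Only a byte-for-byte identical request produces the same key, which is what "exact match" means here. A Redis-backed version would keep the same get/set shape but delegate storage and expiry to the server so that all replicas see the same entries.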
Rate Limiter
RedisRateLimiter enforces a sliding 1-minute token budget per user_id across all replicas:
from ractogateway.redis import RedisRateLimiter, RateLimitConfig
limiter = RedisRateLimiter(
    url="redis://localhost:6379/0",
    config=RateLimitConfig(max_tokens_per_minute=5_000),
)
estimated_tokens = 200
if not limiter.check_and_consume(user_id="u-42", tokens=estimated_tokens):
    raise RuntimeError("Rate limit exceeded — try again later.")
remaining = limiter.get_remaining("u-42")
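The sliding 1-minute budget can be pictured with a single-process sketch. This is an assumption about the general technique, not RedisRateLimiter's internals: the class SlidingWindowBudget is hypothetical, and only its method names mirror the calls shown above.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowBudget:
    """Per-user token budget over the trailing 60 seconds (in-memory sketch)."""

    def __init__(
        self, max_tokens_per_minute: int, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._limit = max_tokens_per_minute
        self._clock = clock  # injectable so tests can advance time manually
        self._events: Dict[str, Deque[Tuple[float, int]]] = {}

    def _used(self, user_id: str) -> int:
        events = self._events.setdefault(user_id, deque())
        cutoff = self._clock() - 60.0
        # Discard consumption records that have slid out of the window.
        while events and events[0][0] <= cutoff:
            events.popleft()
        return sum(tokens for _, tokens in events)

    def check_and_consume(self, user_id: str, tokens: int) -> bool:
        # Refuse the request if it would push the user over the limit.
        if self._used(user_id) + tokens > self._limit:
            return False
        self._events[user_id].append((self._clock(), tokens))
        return True

    def get_remaining(self, user_id: str) -> int:
        return self._limit - self._used(user_id)
```

In a multi-replica setting, the check and the consume step have to happen atomically on the shared store, otherwise two replicas could both admit a request that together exceeds the budget. That is the part a Redis-backed limiter adds over this local version.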
Chat Memory
RedisChatMemory stores bounded conversation history in a Redis List so conversations survive rolling deployments:
from ractogateway.redis import RedisChatMemory, ChatMemoryConfig
memory = RedisChatMemory(
    url="redis://localhost:6379/0",
    config=ChatMemoryConfig(max_turns=20, ttl_seconds=1800),
)
conv_id = "session-abc"
memory.append(conv_id, "user", "Hello!")
memory.append(conv_id, "assistant", "Hi, how can I help?")
history = memory.get_history(conv_id)
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "Hi, …"}]
memory.clear(conv_id)
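The bounded-list behaviour can be sketched without a Redis server. The class BoundedHistory below is hypothetical and mimics the common Redis pattern of appending to a list and then trimming it to the newest N items; it assumes the bound counts individual messages, which may differ from how max_turns is interpreted by RedisChatMemory, and it leaves out the TTL.

```python
import json
from typing import Dict, List


class BoundedHistory:
    """In-memory sketch of a capped, per-conversation message list."""

    def __init__(self, max_entries: int) -> None:
        self._max = max_entries
        self._lists: Dict[str, List[str]] = {}

    def append(self, conv_id: str, role: str, content: str) -> None:
        items = self._lists.setdefault(conv_id, [])
        # Messages are serialized to strings, as they would be in a Redis List.
        items.append(json.dumps({"role": role, "content": content}))
        # Keep only the newest max_entries items, like LTRIM key -N -1.
        del items[: max(0, len(items) - self._max)]

    def get_history(self, conv_id: str) -> List[Dict[str, str]]:
        # Oldest message first; an unknown conversation yields an empty list.
        return [json.loads(raw) for raw in self._lists.get(conv_id, [])]

    def clear(self, conv_id: str) -> None:
        self._lists.pop(conv_id, None)
```

Trimming on every append keeps the stored history bounded no matter how long a conversation runs, so the oldest messages are the ones that fall away first.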
Using a Pre-Built Redis Client
All three classes also accept a pre-built redis.Redis instance via the client= parameter instead of a URL:
import redis as _redis
from ractogateway.redis import RedisExactCache
client = _redis.Redis.from_url("redis://localhost:6379/0")
cache = RedisExactCache(client=client, ttl_seconds=600)