Retry and Fallback
Wrap models with RetryingModel for transient errors and FallbackModel for cross-provider failover.
Production deployments wrap raw adapters with resilience layers. Both compose: wrap each provider in RetryingModel, then chain them with FallbackModel.
import {
OpenAICompatibleModel,
AnthropicModel,
RetryingModel,
FallbackModel
} from "@maniac-ai/agents/inference/adapters";
const primary = new RetryingModel(
new OpenAICompatibleModel({ slug: "gpt-4o" }),
{ retries: 3 }
);
const fallback = new RetryingModel(
new AnthropicModel({ slug: "claude-sonnet-4-20250514" }),
{ retries: 2 }
);
const model = new FallbackModel(primary, [fallback]);
Use model on your Agent spec or Maniac({ model }); the runner sees a single Model.
RetryingModel
Retries with exponential backoff and jitter on transient errors (HTTP 429, HTTP 503, and network failures):
import { RetryingModel, OpenAICompatibleModel, isTransientHttpError } from "@maniac-ai/agents/inference/adapters";
const model = new RetryingModel(
new OpenAICompatibleModel({ slug: "gpt-4o-mini" }),
{
retries: 3, // additional attempts beyond the first (default 3)
initialBackoff: 1, // seconds
maxBackoff: 60,
multiplier: 2,
jitter: 0.1,
respectRetryAfter: true, // honour 429/503 Retry-After headers
retryOn: (error) => isTransientHttpError(error) // custom predicate
}
);
Streaming semantics
Retries apply only until the first chunk is yielded. Once the consumer has observed any stream delta, restarting would duplicate tokens, so failures after the first chunk surface to the caller unchanged.
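The first-chunk boundary can be sketched with an async generator. This is a minimal illustration, not the library's implementation: retryUntilFirstChunk, open, and isRetryable are hypothetical names, and backoff delays are left out.

```typescript
// Hypothetical helper, not part of @maniac-ai/agents: retries a stream factory
// only while no chunk has been handed to the consumer.
async function* retryUntilFirstChunk<T>(
  open: () => AsyncIterable<T>,
  retries: number,
  isRetryable: (error: unknown) => boolean = () => true
): AsyncGenerator<T> {
  for (let attempt = 0; ; attempt++) {
    let yielded = false;
    try {
      for await (const chunk of open()) {
        yielded = true; // the consumer is about to observe this chunk
        yield chunk;
      }
      return; // stream completed normally
    } catch (error) {
      // After the first chunk a restart would duplicate tokens, so rethrow.
      if (yielded || attempt >= retries || !isRetryable(error)) throw error;
      // A real implementation would wait for a backoff delay before looping.
    }
  }
}
```

A failure before any chunk reopens the stream, up to the retry budget. A failure after a chunk is rethrown as-is.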
Tracing
When a tracer is wired, each retry emits a retry trace event with attempt index, delay, and error summary.
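How the delay reported in a retry event might be derived from the backoff options can be sketched as follows. The formula is an assumption: it treats jitter as a symmetric fraction of the delay and maxBackoff as seconds, neither of which is confirmed here. BackoffOptions and backoffDelay are hypothetical names.

```typescript
// Hypothetical option shape mirroring the option names in the RetryingModel config.
interface BackoffOptions {
  initialBackoff: number; // seconds
  maxBackoff: number;     // assumed to share the unit of initialBackoff
  multiplier: number;
  jitter: number;         // assumed: fraction of the delay, 0.1 = plus or minus 10%
}

// Delay in seconds before retry number `attempt` (0-based).
// `random` is injectable so the result can be made deterministic.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const grown = opts.initialBackoff * Math.pow(opts.multiplier, attempt);
  const base = Math.min(opts.maxBackoff, grown);
  const spread = base * opts.jitter;
  // Map random() in [0, 1) onto [-spread, +spread).
  return Math.max(0, base + (random() * 2 - 1) * spread);
}
```

Under these assumptions, initialBackoff: 1, multiplier: 2, and maxBackoff: 60 give un-jittered delays of 1 s, 2 s, 4 s, and so on, capped at 60 s.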
FallbackModel
Cross-provider failover when the primary exhausts retries or raises a fallback-eligible error:
import { FallbackModel, isFallbackEligibleError } from "@maniac-ai/agents/inference/adapters";
const model = new FallbackModel(
primaryModel,
[secondaryModel, tertiaryModel],
{
fallbackOn: (error) => isFallbackEligibleError(error), // default
tracer: myTracer
}
);
Default isFallbackEligibleError covers transient HTTP errors plus 401, 403, and 404.
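The relationship between the two default predicates can be sketched by status code. The error shape and function names below are hypothetical; the library's real error type and checks may differ.

```typescript
// Hypothetical error shape: the library's real error type may differ.
interface HttpLikeError {
  status?: number; // HTTP status code; absent for network-level failures
}

const TRANSIENT_STATUSES = new Set([429, 503]);
const FALLBACK_ONLY_STATUSES = new Set([401, 403, 404]);

// Rough analogue of a retry predicate.
function looksTransient(error: HttpLikeError): boolean {
  // Treating a missing status as a network failure is an assumption of this sketch.
  return error.status === undefined || TRANSIENT_STATUSES.has(error.status);
}

// Rough analogue of a fallback predicate: transient errors plus 401, 403, 404.
function looksFallbackEligible(error: HttpLikeError): boolean {
  return (
    looksTransient(error) ||
    (error.status !== undefined && FALLBACK_ONLY_STATUSES.has(error.status))
  );
}
```

One plausible reading of the extra codes: a 401, 403, or 404 is unlikely to clear on a retry against the same provider, but a different provider with its own credentials and model slugs may still succeed.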
Streaming semantics
The same rule as for retries applies: the stream is locked to the first provider that yields a chunk. If the active provider fails after its first chunk, the error surfaces instead of triggering a silent switch to another provider.
Model catalog
FallbackModel.listModels() delegates to the primary model only. Merging catalogs across providers would produce ambiguous slugs.
Trace events
Fallback hops emit retry events with phase: "fallback" when a tracer is configured.
Predicate helpers
Exported from @maniac-ai/agents/inference/adapters:
| Helper | Purpose |
|---|---|
| isTransientHttpError | Default RetryingModel.retryOn predicate |
| isFallbackEligibleError | Default FallbackModel.fallbackOn predicate |
| parseRetryAfter | Parse Retry-After header values into seconds |
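The Retry-After header comes in two forms, a number of seconds or an HTTP date. A parser along the lines of parseRetryAfter might look like the sketch below. It is an illustrative re-implementation under assumed behavior: the name parseRetryAfterSketch, the now parameter, and the undefined return for unparseable input are not taken from the library.

```typescript
// Hypothetical re-implementation for illustration; the exported parseRetryAfter
// may differ in signature and edge-case handling.
function parseRetryAfterSketch(
  value: string,
  now: number = Date.now()
): number | undefined {
  const trimmed = value.trim();
  // Form 1: delay-seconds, a non-negative integer.
  if (/^\d+$/.test(trimmed)) return Number(trimmed);
  // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const when = Date.parse(trimmed);
  if (Number.isNaN(when)) return undefined;
  // Dates in the past mean "retry now", so clamp at zero.
  return Math.max(0, (when - now) / 1000);
}
```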
Composition pattern
flowchart TD
Agent["Agent.model"] --> FB["FallbackModel"]
FB --> R1["RetryingModel (OpenAI)"]
FB --> R2["RetryingModel (Anthropic)"]
R1 --> OAI["OpenAICompatibleModel"]
R2 --> ANT["AnthropicModel"]
Per-provider retries happen inside each chain link. FallbackModel advances only after a member's retry budget is exhausted.
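The ordering can be illustrated with plain functions standing in for the two wrappers. This sketch ignores the retryOn and fallbackOn predicates, backoff delays, and streaming; withRetries and firstSuccessful are hypothetical names, not library exports.

```typescript
type Call<T> = () => Promise<T>;

// Hypothetical stand-in for one RetryingModel link: one attempt plus `retries` more.
async function withRetries<T>(call: Call<T>, retries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error; // backoff delay omitted for brevity
    }
  }
  throw lastError;
}

// Hypothetical stand-in for FallbackModel: advance only when a link gives up.
async function firstSuccessful<T>(links: Call<T>[]): Promise<T> {
  let lastError: unknown = new Error("no links configured");
  for (const link of links) {
    try {
      return await link();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```

With a primary link given retries: 3 and a secondary given retries: 2, a primary that always fails is called four times before the secondary is tried once.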