Retry and Fallback

Wrap models with RetryingModel for transient errors and FallbackModel for cross-provider failover.

Production deployments wrap raw adapters with resilience layers. Both compose: wrap each provider in RetryingModel, then chain them with FallbackModel.

import {
  OpenAICompatibleModel,
  AnthropicModel,
  RetryingModel,
  FallbackModel
} from "@maniac-ai/agents/inference/adapters";

const primary = new RetryingModel(
  new OpenAICompatibleModel({ slug: "gpt-4o" }),
  { retries: 3 }
);

const fallback = new RetryingModel(
  new AnthropicModel({ slug: "claude-sonnet-4-20250514" }),
  { retries: 2 }
);

const model = new FallbackModel(primary, [fallback]);

Pass model to your Agent spec or to Maniac({ model }). Either way, the runner sees a single Model.

RetryingModel

Exponential-jitter retries on transient HTTP errors (429, 503, network failures):

import {
  RetryingModel,
  OpenAICompatibleModel,
  isTransientHttpError
} from "@maniac-ai/agents/inference/adapters";

const model = new RetryingModel(
  new OpenAICompatibleModel({ slug: "gpt-4o-mini" }),
  {
    retries: 3,              // additional attempts beyond the first (default 3)
    initialBackoff: 1,       // seconds
    maxBackoff: 60,
    multiplier: 2,
    jitter: 0.1,
    respectRetryAfter: true, // honour 429/503 Retry-After headers
    retryOn: (error) => isTransientHttpError(error)  // custom predicate
  }
);
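To make the backoff options concrete, the sketch below computes a delay schedule from the same option names. The formula (initialBackoff × multiplier^attempt, capped at maxBackoff, then scaled by a random factor within ±jitter) is an assumption about how such options are commonly combined, not a description of RetryingModel's internals. `BackoffOptions` and `backoffDelay` are hypothetical names.

```typescript
// Hypothetical illustration: not part of @maniac-ai/agents.
interface BackoffOptions {
  initialBackoff: number; // seconds
  maxBackoff: number;     // assumed to share the unit of initialBackoff
  multiplier: number;
  jitter: number;         // assumed to be a fraction, e.g. 0.1 = +/-10%
}

// Delay before retry number `attempt` (0-based). `random` is injectable so
// the schedule can be made deterministic.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const base = Math.min(
    opts.initialBackoff * Math.pow(opts.multiplier, attempt),
    opts.maxBackoff
  );
  // Map random() in [0, 1) to a factor in [1 - jitter, 1 + jitter).
  const factor = 1 + opts.jitter * (2 * random() - 1);
  return base * factor;
}

const opts: BackoffOptions = {
  initialBackoff: 1,
  maxBackoff: 60,
  multiplier: 2,
  jitter: 0.1
};

// With random() pinned to 0.5 the jitter factor is exactly 1, which exposes
// the underlying exponential curve and the cap.
const schedule = [0, 1, 2, 6, 7].map((n) => backoffDelay(n, opts, () => 0.5));
```

In this sketch jitter is applied after the cap, so an individual delay can land slightly above maxBackoff. Whether the library caps before or after jitter is not stated here.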

Streaming semantics

Retries apply only until the first chunk is yielded. Once the consumer has observed any stream delta, restarting would duplicate tokens, so failures after the first chunk surface to the caller unchanged.
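The first-chunk rule can be sketched with a plain async generator. This is a minimal illustration under assumed types, not the adapter's implementation: `retryUntilFirstChunk`, `open`, and `flaky` are hypothetical names, and backoff delays and error predicates are left out.

```typescript
// Hypothetical illustration of "retry only before the first chunk".
async function* retryUntilFirstChunk<T>(
  open: () => AsyncIterable<T>, // starts a fresh stream on each call
  retries: number
): AsyncGenerator<T> {
  for (let attempt = 0; ; attempt++) {
    let yielded = false;
    try {
      for await (const chunk of open()) {
        yielded = true;
        yield chunk;
      }
      return;
    } catch (error) {
      // After the first chunk a restart would duplicate tokens, so rethrow.
      // Also rethrow once the retry budget is spent.
      if (yielded || attempt >= retries) throw error;
    }
  }
}

// A stream that fails before its first chunk on the first open only.
let opens = 0;
async function* flaky(): AsyncGenerator<string> {
  opens++;
  if (opens === 1) throw new Error("503 before any chunk");
  yield "Hel";
  yield "lo";
}

async function collect(): Promise<string> {
  let text = "";
  for await (const chunk of retryUntilFirstChunk(flaky, 3)) text += chunk;
  return text;
}
```

Calling `collect()` reopens the stream once and then reads it to completion. A stream that fails after its first chunk is never reopened, and the error reaches the consumer.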

Tracing

When a tracer is wired, each retry emits a retry trace event with attempt index, delay, and error summary.

FallbackModel

Cross-provider failover when the primary exhausts retries or raises a fallback-eligible error:

import { FallbackModel, isFallbackEligibleError } from "@maniac-ai/agents/inference/adapters";

// primaryModel, secondaryModel, tertiaryModel, and myTracer are assumed to be defined elsewhere.
const model = new FallbackModel(
  primaryModel,
  [secondaryModel, tertiaryModel],
  {
    fallbackOn: (error) => isFallbackEligibleError(error),  // default
    tracer: myTracer
  }
);

Default isFallbackEligibleError covers transient HTTP errors plus 401, 403, and 404.
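As a rough picture of how the two default predicates relate, the sketch below classifies errors by status code. The `HttpLikeError` shape and both function names are hypothetical, and treating a missing status as a network failure is an assumption. The exported helpers may inspect errors differently.

```typescript
// Hypothetical error shape; the real adapters may expose status differently.
interface HttpLikeError {
  status?: number; // undefined is treated here as a network-level failure
}

// Transient statuses as listed for RetryingModel (429, 503).
const TRANSIENT_STATUSES = new Set<number>([429, 503]);
// Extra statuses that only the fallback predicate accepts.
const EXTRA_FALLBACK_STATUSES = new Set<number>([401, 403, 404]);

function sketchIsTransient(error: HttpLikeError): boolean {
  return error.status === undefined || TRANSIENT_STATUSES.has(error.status);
}

function sketchIsFallbackEligible(error: HttpLikeError): boolean {
  return (
    sketchIsTransient(error) ||
    (error.status !== undefined && EXTRA_FALLBACK_STATUSES.has(error.status))
  );
}
```

One plausible reading of the split: a 401, 403, or 404 is unlikely to change when the same provider is retried, but a different provider with its own credentials and model catalog may still succeed.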

Streaming semantics

Same as retries: the stream is locked to the first provider that yields a chunk. If the active provider fails after its first chunk, the error surfaces to the caller and no silent switch to another provider occurs.

Model catalog

FallbackModel.listModels() delegates to the primary model only. Merging catalogs across providers would produce ambiguous slugs.

Trace events

Fallback hops emit retry events with phase: "fallback" when a tracer is configured.

Predicate helpers

Exported from @maniac-ai/agents/inference/adapters:

| Helper | Purpose |
| --- | --- |
| isTransientHttpError | Default RetryingModel.retryOn predicate |
| isFallbackEligibleError | Default FallbackModel.fallbackOn predicate |
| parseRetryAfter | Parse Retry-After header values into seconds |
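For context on what parsing Retry-After involves: the HTTP specification allows the header to carry either a non-negative integer number of seconds or an HTTP-date. The sketch below handles both forms. `parseRetryAfterSketch` is a hypothetical stand-in, and the exported parseRetryAfter may differ in signature and edge-case behaviour.

```typescript
// Hypothetical stand-in; not the library's parseRetryAfter.
// Returns a delay in seconds, or undefined if the value is unparseable.
function parseRetryAfterSketch(
  value: string,
  nowMs: number = Date.now()
): number | undefined {
  const trimmed = value.trim();
  // Form 1: delay-seconds, a non-negative integer.
  if (/^\d+$/.test(trimmed)) return Number(trimmed);
  // Form 2: HTTP-date, converted to seconds from `nowMs` (never negative).
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, (dateMs - nowMs) / 1000);
}

// Pin "now" 30 seconds before the date in the header.
const now = Date.UTC(2015, 9, 21, 7, 27, 30);
const fromSeconds = parseRetryAfterSketch("120");
const fromDate = parseRetryAfterSketch("Wed, 21 Oct 2015 07:28:00 GMT", now);
```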

Composition pattern

flowchart TD
  Agent["Agent.model"] --> FB["FallbackModel"]
  FB --> R1["RetryingModel (OpenAI)"]
  FB --> R2["RetryingModel (Anthropic)"]
  R1 --> OAI["OpenAICompatibleModel"]
  R2 --> ANT["AnthropicModel"]

Per-provider retries happen inside each chain link. FallbackModel advances only after a member's retry budget is exhausted.
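The ordering described above can be seen by counting attempts in a stripped-down model of the chain. The `Call` type and the `withRetries` and `withFallback` helpers are hypothetical simplifications. They retry on every error, fall back on every error, and skip backoff delays, unlike the real wrappers with their predicates.

```typescript
// Hypothetical minimal types; the real Model interface is richer.
type Call = () => Promise<string>;

function withRetries(call: Call, retries: number): Call {
  return async () => {
    let lastError: unknown;
    // One initial attempt plus `retries` additional attempts.
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await call();
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  };
}

function withFallback(primary: Call, fallbacks: Call[]): Call {
  return async () => {
    let lastError: unknown;
    for (const member of [primary, ...fallbacks]) {
      try {
        return await member();
      } catch (error) {
        // This member's retry budget is spent; move to the next one.
        lastError = error;
      }
    }
    throw lastError;
  };
}

// Count how often each provider is actually called.
const attempts = { openai: 0, anthropic: 0 };
const openai: Call = async () => {
  attempts.openai++;
  throw new Error("503");
};
const anthropic: Call = async () => {
  attempts.anthropic++;
  return "ok";
};

const chained = withFallback(withRetries(openai, 3), [withRetries(anthropic, 2)]);
```

Awaiting `chained()` calls the failing primary four times (one attempt plus three retries) before the fallback is tried once and succeeds.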
