Provider Catalog

Provider catalog entries define upstream skins, credentials, model IDs, request dialects, tested capabilities, modalities, and request-time cost metadata. Callers request deployment-defined model groups; they do not need to know upstream provider model IDs.

This example is a subset of config.example.yaml; the shipped sample config is the source of truth for exact model IDs, pricing fields, source URLs, and update dates.

config.example.yaml
providers:
  baseten:
    base_url: https://inference.baseten.co/v1
    dialect: openai-chat
    api_key: ${BASETEN_API_KEY}
    api_key_env: BASETEN_API_KEY
    key_id: baseten-default
    models:
      gpt-oss-120b:
        model: openai/gpt-oss-120b
        tier: coding
        input_price_per_million_usd: 0.1
        output_price_per_million_usd: 0.5
        input_modalities:
          - text
        output_modalities:
          - text
        tool_support:
          openai_chat:
            - tools
            - tool_choice

Schema

Provider-level fields identify the upstream API skin; model-level fields under models.<ref> describe each cataloged model:

  • base_url, dialect, auth_scheme, api_key_env, and headers control how the router calls upstream.
  • models.<ref>.model is the exact upstream model ID.
  • input_price_per_million_usd, output_price_per_million_usd, optional image pricing fields, pricing_source, pricing_updated_at, and pricing_notes are stored in config.example.yaml and copied into request usage rows at request time.
  • input_modalities and output_modalities describe validated I/O such as text, image, or video.
  • tool_support.openai_chat, tool_support.openai_responses, and tool_support.anthropic_messages are per-skin validation evidence, not marketing claims.
  • reasoning and honors_max_tokens further constrain eligible targets for reasoning requests or explicit caller output caps.
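As an illustration of how the per-million pricing fields could turn into a cost figure on a usage row, here is a minimal sketch. The function name request_cost_usd and the token counts are hypothetical, and the router's actual accounting (for example image pricing or rounding) may differ.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million_usd: float,
    output_price_per_million_usd: float,
) -> Decimal:
    """Hypothetical helper: price a single request from catalog metadata.

    Prices are converted through str() so that a YAML value such as 0.1
    becomes Decimal("0.1") rather than a binary floating-point approximation.
    """
    input_cost = Decimal(str(input_price_per_million_usd)) * input_tokens / MILLION
    output_cost = Decimal(str(output_price_per_million_usd)) * output_tokens / MILLION
    return input_cost + output_cost


# Prices taken from the gpt-oss-120b sample entry; token counts are made up.
cost = request_cost_usd(12_000, 3_000, 0.1, 0.5)
print(f"{cost} USD")
```

With the sample prices of 0.1 and 0.5 USD per million tokens, 12,000 input tokens and 3,000 output tokens come to 0.0027 USD.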

Internal vLLM, SGLang, Baseten, Crusoe, Fireworks, OpenRouter, Anthropic, MiniMax, Kimi, xAI, and OpenAI-compatible services all use this same catalog shape. Configure separate provider skins when the same upstream exposes multiple dialects, such as OpenAI Chat and OpenAI Responses.

Capability Declaration Rules

Capability fields are eligibility controls, not marketing descriptions. If a capability is omitted, the router treats it as unavailable and skips the target for requests that require it.
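The omitted-means-unavailable rule can be sketched as a simple subset check. This is only an illustration under assumptions: the function names, the text-only-model entry, and the in-memory dictionary shape are hypothetical, and the real router may evaluate more fields than the two shown here.

```python
def target_supports(target: dict, skin: str, required_tools: set, required_inputs: set) -> bool:
    """Return True only if every required capability is explicitly declared."""
    # A missing tool_support block, or a missing skin inside it, yields an
    # empty set, so any tool requirement fails: omitted means unavailable.
    declared_tools = set(target.get("tool_support", {}).get(skin, []))
    declared_inputs = set(target.get("input_modalities", []))
    return required_tools <= declared_tools and required_inputs <= declared_inputs


def eligible_targets(catalog: dict, skin: str, required_tools=frozenset(),
                     required_inputs=frozenset({"text"})) -> list:
    """List the catalog refs that may serve a request with these requirements."""
    return [
        ref
        for ref, target in catalog.items()
        if target_supports(target, skin, set(required_tools), set(required_inputs))
    ]


# Hypothetical catalog: the first entry mirrors the sample config, the second
# declares no tool support at all.
catalog = {
    "gpt-oss-120b": {
        "input_modalities": ["text"],
        "output_modalities": ["text"],
        "tool_support": {"openai_chat": ["tools", "tool_choice"]},
    },
    "text-only-model": {
        "input_modalities": ["text"],
        "output_modalities": ["text"],
    },
}

print(eligible_targets(catalog, "openai_chat", required_tools={"tools"}))
print(eligible_targets(catalog, "openai_responses", required_tools={"tools"}))
```

The second call returns no targets because tool support was declared only for openai_chat; evidence for one skin is never carried over to another.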

Use scripts/probe-model-capabilities.sh as the repeatable first pass for direct upstream evidence:

scripts/probe-model-capabilities.sh \
  --base-url https://api.provider.example/v1 \
  --model provider-model-id \
  --api-key-env PROVIDER_API_KEY \
  --dialect openai-chat \
  --output yaml

Map the probe output mechanically:

| Probe | Passing metadata | Failure or not tested |
| --- | --- | --- |
| text | catalog the exact model ID with input_modalities: [text] and output_modalities: [text] | do not activate the target |
| max-tokens-cap | omit honors_max_tokens because the default is true | set honors_max_tokens: false when the cap is not honored |
| auto-tools | add tools, function, or client_tools for the tested skin | omit tool support for that skin |
| forced-tools | add tool_choice where the skin uses OpenAI-style tool choice | omit tool_choice |
| structured-outputs | add structured_outputs for the tested skin | omit structured-output support |
| image-input | add image to input_modalities after router-level image smoke also passes | keep text-only modalities |
| reasoning-effort or Anthropic thinking | add reasoning.supported, mode, and the tested control type | omit reasoning metadata |
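The mapping above could be applied in code roughly as follows. This is a sketch under several assumptions: the probe script's real output format is not shown here, so the results dictionary (probe name mapped to "pass", "fail", or "not_tested") is hypothetical, as is the function name. Placing structured_outputs in the per-skin tool_support list is also a guess, and reasoning metadata is left out because its exact shape is not documented in this section.

```python
def metadata_from_probe(results: dict, skin: str = "openai_chat",
                        router_image_smoke_passed: bool = False):
    """Translate hypothetical probe results into catalog metadata for one skin.

    Returns None when the text probe did not pass, meaning the target
    should not be activated at all.
    """
    def passed(name: str) -> bool:
        # Anything other than an explicit "pass" counts as failed or not tested.
        return results.get(name) == "pass"

    if not passed("text"):
        return None

    meta = {"input_modalities": ["text"], "output_modalities": ["text"]}

    # honors_max_tokens defaults to true, so it is written only on an explicit failure.
    if results.get("max-tokens-cap") == "fail":
        meta["honors_max_tokens"] = False

    skin_caps = []
    if passed("auto-tools"):
        skin_caps.append("tools")  # some skins use function or client_tools instead
    if passed("forced-tools"):
        skin_caps.append("tool_choice")
    if passed("structured-outputs"):
        skin_caps.append("structured_outputs")  # assumed placement
    if skin_caps:
        meta["tool_support"] = {skin: skin_caps}

    # image is added only when the router-level image smoke test also passed.
    if passed("image-input") and router_image_smoke_passed:
        meta["input_modalities"].append("image")

    return meta


example_results = {
    "text": "pass",
    "max-tokens-cap": "fail",
    "auto-tools": "pass",
    "forced-tools": "not_tested",
    "structured-outputs": "pass",
    "image-input": "pass",
}
print(metadata_from_probe(example_results))
```

In this example the image probe passed upstream, but image is still withheld from input_modalities because the router-level smoke flag is left at its default of False.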

Date-stamp the evidence in pricing_notes or validation notes. Include what passed, what failed, and what was not tested. A model passing one skin does not imply any other skin passed; validate OpenAI Chat, OpenAI Responses, and Anthropic Messages independently.

Rollback

If a cataloged model fails validation, first remove it from every active models.<group>.targets[] entry. Keep the provider catalog entry only if it remains useful for smoke testing, then restart or reload through the normal deployment path. When only one request shape fails, remove just the corresponding capability field, such as structured_outputs, image, or tool_choice, instead of removing the target.
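The first rollback step, dropping a failed model from every group's target list, might look like the sketch below. The shape of a target entry is not shown in this section, so the provider and model keys, the group names, and the remove_target helper are all hypothetical.

```python
import copy


def remove_target(config: dict, provider: str, model_ref: str):
    """Return a copy of config with the target removed from every model group.

    Also returns the names of the groups that were changed, so the edit can
    be reviewed before the deployment is restarted or reloaded.
    """
    updated = copy.deepcopy(config)  # never mutate the loaded config in place
    touched = []
    for group, spec in updated.get("models", {}).items():
        before = spec.get("targets", [])
        after = [
            t for t in before
            if not (t.get("provider") == provider and t.get("model") == model_ref)
        ]
        if len(after) != len(before):
            spec["targets"] = after
            touched.append(group)
    return updated, touched


# Hypothetical deployment config with two model groups.
config = {
    "models": {
        "coding": {"targets": [
            {"provider": "baseten", "model": "gpt-oss-120b"},
            {"provider": "other", "model": "some-model"},
        ]},
        "general": {"targets": [
            {"provider": "baseten", "model": "gpt-oss-120b"},
        ]},
    }
}

new_config, changed_groups = remove_target(config, "baseten", "gpt-oss-120b")
print(changed_groups)
```

A group left with an empty targets list, as general is here, probably deserves attention before reloading, since callers of that group would have nowhere to route.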

See Providers And Models, Add A Provider Or Model, Model Metadata, Self-Hosted Upstreams, and Router Configuration.