Provider Catalog
Provider catalog entries define upstream skins, credentials, model IDs, request dialects, tested capabilities, modalities, and request-time cost metadata. Callers request deployment-defined model groups; they do not need to know upstream provider model IDs.
This example is a subset of config.example.yaml; the shipped sample config is the source of truth for exact model IDs, pricing fields, source URLs, and update dates.
providers:
  baseten:
    base_url: https://inference.baseten.co/v1
    dialect: openai-chat
    api_key: ${BASETEN_API_KEY}
    api_key_env: BASETEN_API_KEY
    key_id: baseten-default
    models:
      gpt-oss-120b:
        model: openai/gpt-oss-120b
        tier: coding
        input_price_per_million_usd: 0.1
        output_price_per_million_usd: 0.5
        input_modalities:
          - text
        output_modalities:
          - text
        tool_support:
          openai_chat:
            - tools
            - tool_choice
Schema
Provider-level fields identify the upstream API skin:
- base_url, dialect, auth_scheme, api_key_env, and headers control how the router calls upstream.
- models.<ref>.model is the exact upstream model ID.
- input_price_per_million_usd, output_price_per_million_usd, optional image pricing fields, pricing_source, pricing_updated_at, and pricing_notes are stored in config.example.yaml and copied into request usage rows at request time.
- input_modalities and output_modalities describe validated I/O such as text, image, or video.
- tool_support.openai_chat, tool_support.openai_responses, and tool_support.anthropic_messages are per-skin validation evidence, not marketing claims.
- reasoning and honors_max_tokens further constrain eligible targets for reasoning requests or explicit caller output caps.
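To illustrate why pricing fields are copied into usage rows at request time, here is a minimal Python sketch. The function name snapshot_usage_row, the row layout, and the cost formula (tokens times the per-million price, divided by one million) are assumptions for illustration; the router's actual usage schema is not shown in this document.

```python
from decimal import Decimal

# Hypothetical catalog entry, using the sample prices from the example above.
CATALOG_ENTRY = {
    "model": "openai/gpt-oss-120b",
    "input_price_per_million_usd": Decimal("0.1"),
    "output_price_per_million_usd": Decimal("0.5"),
}

PER_MILLION = Decimal(1_000_000)


def snapshot_usage_row(entry: dict, input_tokens: int, output_tokens: int) -> dict:
    """Copy the prices in effect right now into a usage row and price the request."""
    input_price = entry["input_price_per_million_usd"]
    output_price = entry["output_price_per_million_usd"]
    cost = (Decimal(input_tokens) * input_price
            + Decimal(output_tokens) * output_price) / PER_MILLION
    return {
        "model": entry["model"],
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # Stored per row so later catalog price edits do not rewrite history.
        "input_price_per_million_usd": input_price,
        "output_price_per_million_usd": output_price,
        "cost_usd": cost,
    }


row = snapshot_usage_row(CATALOG_ENTRY, input_tokens=12_000, output_tokens=3_000)
```

With the sample prices, 12,000 input tokens and 3,000 output tokens come to 0.0027 USD under this formula. Because the row carries its own copy of the prices, a later change to the catalog entry would leave it unchanged.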
Internal vLLM, SGLang, Baseten, Crusoe, Fireworks, OpenRouter, Anthropic, MiniMax, Kimi, xAI, and OpenAI-compatible services all use this same catalog shape. Configure separate provider skins when the same upstream exposes multiple dialects, such as OpenAI Chat and OpenAI Responses.
Capability Declaration Rules
Capability fields are eligibility controls, not marketing descriptions. If a capability is omitted, the router treats it as unavailable and skips the target for requests that require it.
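The "omitted means unavailable" rule can be sketched as a subset check. This is a simplified illustration, not the router's implementation: the CatalogModel class, the is_eligible function, and the way a request's requirements are expressed as sets are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogModel:
    """Hypothetical in-memory view of one provider catalog model entry."""
    ref: str
    input_modalities: set[str] = field(default_factory=set)
    tool_support: dict[str, set[str]] = field(default_factory=dict)


def is_eligible(model: CatalogModel, skin: str,
                required_caps: set[str], required_inputs: set[str]) -> bool:
    """Return True only when every required capability is explicitly declared."""
    # A skin with no tool_support entry declares nothing.
    declared_caps = model.tool_support.get(skin, set())
    return required_caps <= declared_caps and required_inputs <= model.input_modalities


gpt_oss = CatalogModel(
    ref="gpt-oss-120b",
    input_modalities={"text"},
    tool_support={"openai_chat": {"tools", "tool_choice"}},
)

chat_tools_ok = is_eligible(gpt_oss, "openai_chat", {"tools"}, {"text"})
structured_ok = is_eligible(gpt_oss, "openai_chat", {"structured_outputs"}, {"text"})
responses_ok = is_eligible(gpt_oss, "openai_responses", {"tools"}, {"text"})
```

In this sketch only chat_tools_ok is true. The structured-output request is skipped because structured_outputs was never declared, and the Responses request is skipped because nothing is declared for that skin.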
Use scripts/probe-model-capabilities.sh as the repeatable first pass for direct upstream evidence:
scripts/probe-model-capabilities.sh \
  --base-url https://api.provider.example/v1 \
  --model provider-model-id \
  --api-key-env PROVIDER_API_KEY \
  --dialect openai-chat \
  --output yaml
Map the probe output mechanically:
| Probe | Passing metadata | Failure or not tested |
|---|---|---|
| text | catalog the exact model ID with input_modalities: [text] and output_modalities: [text] | do not activate the target |
| max-tokens-cap | omit honors_max_tokens because the default is true | set honors_max_tokens: false when the cap is not honored |
| auto-tools | add tools, function, or client_tools for the tested skin | omit tool support for that skin |
| forced-tools | add tool_choice where the skin uses OpenAI-style tool choice | omit tool_choice |
| structured-outputs | add structured_outputs for the tested skin | omit structured-output support |
| image-input | add image to input_modalities after router-level image smoke also passes | keep text-only modalities |
| reasoning-effort or Anthropic thinking | add reasoning.supported, mode, and the tested control type | omit reasoning metadata |
Date-stamp the evidence in pricing_notes or validation notes. Include what passed, what failed, and what was not tested. A model passing one skin does not imply any other skin passed; validate OpenAI Chat, OpenAI Responses, and Anthropic Messages independently.
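The mapping table can be read as a small function from probe results to catalog metadata. The sketch below assumes a hypothetical result format (a dict of probe name to "pass" or "fail"), since the actual output of probe-model-capabilities.sh is not shown here. It also assumes that structured_outputs sits in the per-skin tool_support list. Reasoning metadata is left out because its field values depend on the control type tested.

```python
from __future__ import annotations


def metadata_from_probes(results: dict[str, str], skin: str,
                         router_image_smoke_passed: bool = False) -> dict | None:
    """Translate probe results ("pass", "fail", or missing) into catalog metadata.

    Returns None when the text probe did not pass, meaning the target
    should not be activated.
    """
    def passed(probe: str) -> bool:
        return results.get(probe) == "pass"  # missing or untested counts as not passed

    if not passed("text"):
        return None

    metadata: dict = {"input_modalities": ["text"], "output_modalities": ["text"]}

    # honors_max_tokens defaults to true, so it is written only on an explicit failure.
    if results.get("max-tokens-cap") == "fail":
        metadata["honors_max_tokens"] = False

    caps = []
    if passed("auto-tools"):
        caps.append("tools")  # some skins use function or client_tools instead
    if passed("forced-tools"):
        caps.append("tool_choice")
    if passed("structured-outputs"):
        caps.append("structured_outputs")
    if caps:
        metadata["tool_support"] = {skin: caps}

    # image needs both the direct probe and the router-level smoke test.
    if passed("image-input") and router_image_smoke_passed:
        metadata["input_modalities"].append("image")

    return metadata


example = metadata_from_probes(
    {"text": "pass", "max-tokens-cap": "fail", "auto-tools": "pass", "image-input": "pass"},
    skin="openai_chat",
)
```

In this example the image probe passed directly, but image is not added because the router-level smoke test has not passed. Forced tool choice and structured outputs were never tested, so they stay undeclared.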
Rollback
If a cataloged model fails validation, remove it from every active models.<group>.targets[] entry first, keep the provider catalog entry only if it remains useful for smoke testing, and restart or reload through the normal deployment path. Remove capability fields such as structured_outputs, image, or tool_choice when only that request shape fails.
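The first rollback step, removing the model from every group's targets while leaving the provider catalog entry in place, could look like the sketch below. The target shape (a dict with provider and model keys), the group names, and the remove_target helper are hypothetical; check config.example.yaml for the real layout of models.<group>.targets[].

```python
import copy


def remove_target(config: dict, provider: str, model_ref: str) -> dict:
    """Return a copy of config with one provider model dropped from every group."""
    updated = copy.deepcopy(config)
    for group in updated.get("models", {}).values():
        group["targets"] = [
            t for t in group.get("targets", [])
            if not (t.get("provider") == provider and t.get("model") == model_ref)
        ]
    return updated


# Illustrative config; "other" and "other-model" are placeholders.
config = {
    "models": {
        "coding": {"targets": [
            {"provider": "baseten", "model": "gpt-oss-120b"},
            {"provider": "other", "model": "other-model"},
        ]},
        "general": {"targets": [{"provider": "baseten", "model": "gpt-oss-120b"}]},
    },
    "providers": {"baseten": {"models": {"gpt-oss-120b": {}}}},
}

rolled_back = remove_target(config, "baseten", "gpt-oss-120b")
```

The provider catalog entry under providers is left untouched, so it remains available for smoke testing. The edited config still has to go through the normal restart or reload path.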
Related
See Providers And Models, Add A Provider Or Model, Model Metadata, Self-Hosted Upstreams, and Router Configuration.