Product Capabilities

This page summarizes GenAI Smart Router capabilities for enterprise deployments.

Capabilities

Area	Capability
API compatibility	OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, filtered `/v1/models`, caller `/v1/usage`, health/version endpoints
Routing	Static, weighted, failover, dynamic-score, TypeScript-scripted, and external-policy model group routing
Caller governance	Router-issued caller tokens, per-key allow lists, rate limits, traffic shaping, token budgets, concurrency limits
Provider control	Server-side provider keys, deployment-defined provider catalogs, active targets separate from catalog metadata
Tools	Dialect-specific tool metadata and request filtering for OpenAI Chat, OpenAI Responses, and Anthropic Messages
Images/VLM	Image input detection across supported request shapes and modality-aware target filtering
Cost accounting	Request-time input/output/image prices, calculated cost fields, upstream-reported billed cost fields
Usage reporting	CLI Markdown reports and optional authenticated browser reports by caller, project, environment, token ID, provider, model, model group, client, status, cache, latency, and token counts
Observability	JSONL logs, relational usage DB, diagnostics child tables, optional governed content-capture tables, metrics-admin Prometheus telemetry
Caching	In-process LRU/TTL cache for eligible non-tool responses with cache snapshots in usage rows
Deployment	Linux binary and Docker Compose packages with embedded product documentation and optional embedded admin report assets
Agent clients	Codex CLI and Claude Code CLI workflows validated through router-compatible API shapes
Private upstreams	OpenAI-compatible vLLM, SGLang, Baseten-style, and other internal services can be configured as providers
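
As an illustration of the cost accounting row, the sketch below computes a calculated cost from a price snapshot taken at request time. The `PriceSnapshot` fields, the per-million-token pricing unit, and the `calculated_cost` helper are assumptions made for this example. They are not the router's actual schema or field names.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PriceSnapshot:
    """Hypothetical prices captured when the request is served (USD)."""
    input_per_million: Decimal   # price per 1,000,000 input tokens
    output_per_million: Decimal  # price per 1,000,000 output tokens
    per_image: Decimal           # price per input image


def calculated_cost(prices: PriceSnapshot, input_tokens: int,
                    output_tokens: int, images: int = 0) -> Decimal:
    """Return the cost of one request from its own price snapshot."""
    million = Decimal(1_000_000)
    return (prices.input_per_million * input_tokens / million
            + prices.output_per_million * output_tokens / million
            + prices.per_image * images)


# Illustrative numbers only; real prices come from deployment configuration.
prices = PriceSnapshot(Decimal("3.00"), Decimal("15.00"), Decimal("0.002"))
cost = calculated_cost(prices, input_tokens=1200, output_tokens=300, images=1)
```

Storing the snapshot with the usage row, rather than looking prices up later, keeps historical costs stable when a provider changes its price list. An upstream-reported billed cost would be recorded as a separate field alongside this calculated value.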

Best Fit

GenAI Smart Router is designed as the governed gateway layer for enterprise GenAI traffic. It works well when a platform team wants to:

expose stable model-group names to developers and agents;
keep provider keys and private inference endpoints server-side;
route by API dialect, tool support, image modality, latency, cost, quota, and custom policy;
record request-time token, image, cost, cache, and provider selection details;
validate and roll out new upstream models without rewriting every client.
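
The routing criteria in the list above can be pictured as a filter followed by a ranking. The sketch below is a minimal illustration under stated assumptions: the `Target` and `Request` types, their field names, and the cheapest-then-fastest ranking are hypothetical and do not describe the router's real policy interface.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Target:
    """Hypothetical active target behind a model group."""
    name: str
    dialects: FrozenSet[str]
    supports_tools: bool
    supports_images: bool
    cost_per_million: float
    p50_latency_ms: float


@dataclass(frozen=True)
class Request:
    """Hypothetical summary of what an incoming request needs."""
    dialect: str
    uses_tools: bool = False
    has_images: bool = False


def eligible(target: Target, request: Request) -> bool:
    """A target qualifies only if it covers every need of the request."""
    return (request.dialect in target.dialects
            and (target.supports_tools or not request.uses_tools)
            and (target.supports_images or not request.has_images))


def select_target(targets: List[Target], request: Request) -> Optional[Target]:
    """Filter by capability, then prefer lower cost, then lower latency."""
    candidates = [t for t in targets if eligible(t, request)]
    if not candidates:
        return None
    return min(candidates, key=lambda t: (t.cost_per_million, t.p50_latency_ms))


targets = [
    Target("text-small", frozenset({"openai-chat"}), False, False, 0.5, 200.0),
    Target("vision-large", frozenset({"openai-chat", "anthropic-messages"}),
           True, True, 5.0, 600.0),
]
plain = select_target(targets, Request("openai-chat"))
vision = select_target(targets, Request("openai-chat", has_images=True))
unsupported = select_target(targets, Request("openai-responses"))
```

Here the plain text request goes to the cheaper target, the image request goes to the only image-capable target, and a dialect that no target serves yields no selection. Quota and custom policy would be further filters or ranking terms in the same pipeline.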

Broader enterprise controls, such as identity integration, procurement processes, managed hosting terms, and compliance workflows, are handled as part of the selected deployment and operating model.

Activation Standard

Provider catalogs are metadata, not proof that a model works. A model should become active only after the exact deployment validates:

provider key entitlement;
direct upstream text behavior;
realistic token budget behavior;
small max-token cap behavior when caps matter;
tool behavior for the intended API shape;
image behavior when image modality is advertised;
router-level behavior through the exposed model group.

Models can remain catalog-only until validation passes.
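
The activation standard can be read as a gate over a checklist. The following sketch assumes hypothetical check names and a simple pass/fail result map. It is one possible way to encode the list above, not the product's validation tooling.

```python
from typing import Dict, List

# Checks that always apply (names are hypothetical labels for the list above).
BASE_CHECKS = [
    "provider_key_entitlement",
    "direct_upstream_text",
    "realistic_token_budget",
    "tool_behavior",
    "router_level_model_group",
]


def required_checks(caps_matter: bool, advertises_images: bool) -> List[str]:
    """Add the conditional checks only when they apply to this target."""
    checks = list(BASE_CHECKS)
    if caps_matter:
        checks.append("small_max_token_cap")
    if advertises_images:
        checks.append("image_behavior")
    return checks


def missing_checks(results: Dict[str, bool], caps_matter: bool,
                   advertises_images: bool) -> List[str]:
    """Return required checks that are absent or did not pass."""
    return [c for c in required_checks(caps_matter, advertises_images)
            if not results.get(c, False)]


def target_state(results: Dict[str, bool], caps_matter: bool,
                 advertises_images: bool) -> str:
    """A model stays catalog-only until every required check passes."""
    if missing_checks(results, caps_matter, advertises_images):
        return "catalog-only"
    return "active"


# Example: all base checks passed, but no image validation was run.
results = {name: True for name in BASE_CHECKS}
text_only_state = target_state(results, caps_matter=False, advertises_images=False)
vision_state = target_state(results, caps_matter=False, advertises_images=True)
```

In this example the same results activate a text-only target but leave a target that advertises image modality in the catalog-only state, because its image check has not passed.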

Deployment Options

GenAI Smart Router can be deployed:

inside an enterprise network;
in a customer cloud account;
as a Metrum-managed evaluation or private-managed deployment (one customer per instance).

Deployment-specific hostnames, model group names, caller policies, provider keys, and upstream mixes are configuration choices, not product constants.
