Product Capabilities

This page summarizes GenAI Smart Router capabilities for enterprise deployments.

Capabilities

| Area | Capability |
| --- | --- |
| API compatibility | OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, filtered /v1/models, caller /v1/usage, health/version endpoints |
| Routing | Static, weighted, failover, dynamic-score, TypeScript-scripted, and external-policy model group routing |
| Caller governance | Router-issued caller tokens, per-key allow lists, rate limits, traffic shaping, token budgets, concurrency limits |
| Provider control | Server-side provider keys, deployment-defined provider catalogs, active targets separate from catalog metadata |
| Tools | Dialect-specific tool metadata and request filtering for OpenAI Chat, OpenAI Responses, and Anthropic Messages |
| Images/VLM | Image input detection across supported request shapes and modality-aware target filtering |
| Cost accounting | Request-time input/output/image prices, calculated cost fields, upstream-reported billed cost fields |
| Usage reporting | CLI Markdown reports and optional authenticated browser reports by caller, project, environment, token ID, provider, model, model group, client, status, cache, latency, and token counts |
| Observability | JSONL logs, relational usage DB, diagnostics child tables, optional governed content-capture tables, metrics-admin Prometheus telemetry |
| Caching | In-process LRU/TTL cache for eligible non-tool responses with cache snapshots in usage rows |
| Deployment | Linux binary and Docker Compose packages with embedded product documentation and optional embedded admin report assets |
| Agent clients | Codex CLI and Claude Code CLI workflows validated through router-compatible API shapes |
| Private upstreams | OpenAI-compatible vLLM, SGLang, Baseten-style, and other internal services can be configured as providers |
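
As an illustration of the cost accounting row, the sketch below shows one way a calculated cost could be derived from prices captured at request time and kept separate from an upstream-reported billed cost. It is not the router's actual schema: the class names, field names, and the per-million-token pricing unit are assumptions made for this example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class PriceSnapshot:
    """Hypothetical prices captured at request time (USD)."""
    input_per_mtok: Decimal   # price per million input tokens (assumed unit)
    output_per_mtok: Decimal  # price per million output tokens (assumed unit)
    per_image: Decimal        # price per input image (assumed unit)


@dataclass(frozen=True)
class UsageRow:
    """Hypothetical usage record; field names are illustrative only."""
    input_tokens: int
    output_tokens: int
    image_count: int
    prices: PriceSnapshot
    # Filled in only when the upstream reports a billed cost.
    upstream_billed_cost: Optional[Decimal] = None

    @property
    def calculated_cost(self) -> Decimal:
        """Cost computed from the request-time price snapshot."""
        million = Decimal(1_000_000)
        return (
            self.prices.input_per_mtok * self.input_tokens / million
            + self.prices.output_per_mtok * self.output_tokens / million
            + self.prices.per_image * self.image_count
        )


row = UsageRow(
    input_tokens=1000,
    output_tokens=500,
    image_count=2,
    prices=PriceSnapshot(Decimal("3"), Decimal("15"), Decimal("0.001")),
)
print(row.calculated_cost)
```

Storing the price snapshot with each row means a later price change does not alter the cost recorded for earlier requests, and keeping the upstream-reported figure in its own field lets the two values be compared rather than overwritten.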

Best Fit

GenAI Smart Router is designed as the governed gateway layer for enterprise GenAI traffic. It works well when a platform team wants to:

  • expose stable model-group names to developers and agents;
  • keep provider keys and private inference endpoints server-side;
  • route by API dialect, tool support, image modality, latency, cost, quota, and custom policy;
  • record request-time token, image, cost, cache, and provider selection details;
  • validate and roll out new upstream models without rewriting every client.
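
The routing criteria in the list above can be pictured as a filter followed by an ordering. The sketch below is a simplified illustration, not the router's implementation: it covers only dialect, tool support, image modality, and a failover priority, and every class name, field name, and dialect string in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical upstream target behind a model group."""
    name: str
    dialects: frozenset    # API dialects the target can serve
    supports_tools: bool
    supports_images: bool
    priority: int          # lower value is tried first (failover order)
    healthy: bool = True


@dataclass(frozen=True)
class Request:
    """Hypothetical summary of an incoming request."""
    dialect: str
    uses_tools: bool
    has_images: bool


def eligible_targets(request, targets):
    """Drop targets that cannot serve the request, then sort by priority."""
    matches = [
        t for t in targets
        if t.healthy
        and request.dialect in t.dialects
        and (t.supports_tools or not request.uses_tools)
        and (t.supports_images or not request.has_images)
    ]
    return sorted(matches, key=lambda t: t.priority)


targets = [
    Target("text-primary", frozenset({"openai-chat"}), True, False, priority=0),
    Target("vision-backup", frozenset({"openai-chat", "anthropic-messages"}),
           True, True, priority=1),
]

image_request = Request(dialect="openai-chat", uses_tools=False, has_images=True)
print([t.name for t in eligible_targets(image_request, targets)])
```

Because clients address the model group rather than an individual target, the target list can change without any client-side change, which is the property the last bullet relies on.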

Broader enterprise controls, such as identity integration, procurement processes, managed hosting terms, and compliance workflows, are handled as part of the selected deployment and operating model.

Activation Standard

Provider catalogs are metadata, not proof that a model works in a given deployment. A model should become active only after the exact deployment has validated:

  • provider key entitlement;
  • direct upstream text behavior;
  • realistic token budget behavior;
  • small max-token cap behavior when caps matter;
  • tool behavior for the intended API shape;
  • image behavior when image modality is advertised;
  • router-level behavior through the exposed model group.

Models can remain catalog-only until validation passes.
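
The activation standard can be read as a gate: a model stays catalog-only until every check that applies to it has passed. The sketch below is one possible encoding of that rule, assuming the max-token cap check and the image check are the only conditional ones. The enum, function names, and state strings are hypothetical and not part of the product.

```python
from enum import Enum, auto


class Check(Enum):
    """Validation checks from the activation standard (names are illustrative)."""
    PROVIDER_KEY_ENTITLEMENT = auto()
    DIRECT_UPSTREAM_TEXT = auto()
    REALISTIC_TOKEN_BUDGET = auto()
    SMALL_MAX_TOKEN_CAP = auto()   # required only when caps matter
    TOOL_BEHAVIOR = auto()
    IMAGE_BEHAVIOR = auto()        # required only when image modality is advertised
    ROUTER_LEVEL = auto()


def required_checks(caps_matter: bool, advertises_images: bool) -> set:
    """Return the checks that apply to this model in this deployment."""
    required = set(Check)
    if not caps_matter:
        required.discard(Check.SMALL_MAX_TOKEN_CAP)
    if not advertises_images:
        required.discard(Check.IMAGE_BEHAVIOR)
    return required


def model_state(passed: set, caps_matter: bool, advertises_images: bool) -> str:
    """A model is active only when no required check is missing."""
    missing = required_checks(caps_matter, advertises_images) - passed
    return "catalog-only" if missing else "active"


passed = set(Check) - {Check.IMAGE_BEHAVIOR}
print(model_state(passed, caps_matter=True, advertises_images=False))
print(model_state(passed, caps_matter=True, advertises_images=True))
```

In this sketch the same set of passed checks yields different states depending on what the model advertises, which mirrors the point that catalog metadata alone does not make a model active.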

Deployment Options

GenAI Smart Router can be deployed:

  • inside an enterprise network;
  • in a customer cloud account;
  • as a Metrum-managed evaluation or private-managed deployment (one customer per instance).

Deployment-specific hostnames, model group names, caller policies, provider keys, and upstream mixes are configuration choices, not product constants.