Deployment Security Assessment
Use this checklist to assess a GenAI Smart Router deployment before production rollout or after a major routing, provider, auth, or telemetry change.
Secrets And Authentication
Confirm:
- provider API keys are loaded server-side from the deployment environment or protected config files;
- licensed deployments mount only the issued runtime license file; private signing keys, signing-service credentials, full license payloads, and detached signatures are excluded from source control, images, logs, reports, browser docs, and tickets;
- caller tokens are distributed only to approved users, services, or validation jobs;
- runtime config stores caller token hashes, not raw caller token secrets;
- raw provider keys, raw router tokens, token hashes, and full production config are excluded from browser docs, logs, tickets, and announcements;
- metrics-admin access uses separate caller subjects authorized for metrics read, with existing metrics_admin: true callers converted to compatible grants;
- content-capture maintenance access uses Casbin content:capture delete/purge policy, delete-by-request is scoped to the captured row's caller project/environment domain, and existing content_admin: true callers are converted to compatible grants for their own domain;
- browser-admin Basic Auth, when enabled, uses bcrypt hashes from deployment secrets, requires HTTPS in production, trusts forwarded HTTPS state only from configured proxy CIDRs, and maps to stable subjects such as basic:admin; see Admin Authentication;
- browser admin reports, when enabled, require Casbin admin:reports policy in addition to browser-admin identity, and reject ordinary router caller tokens with 403 reports-forbidden.
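The hash-only storage rule for caller tokens can be spot-checked with a small script. The following is a minimal sketch, assuming SHA-256 hex digests; the hash algorithm and config layout the router actually uses are not stated in this checklist, so hash_token, token_matches, and looks_like_hash are hypothetical helper names.

```python
import hashlib
import hmac
import re

# Assumption: caller token hashes are SHA-256 hex digests. Adjust the
# algorithm and pattern to whatever the deployed router actually uses.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def hash_token(raw_token: str) -> str:
    """Return the hex digest that would be stored in place of the raw token."""
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()


def token_matches(presented_token: str, stored_hash: str) -> bool:
    """Compare a presented token against a stored hash in constant time."""
    return hmac.compare_digest(hash_token(presented_token), stored_hash)


def looks_like_hash(config_value: str) -> bool:
    """Return True only for values shaped like a SHA-256 hex digest."""
    return _SHA256_HEX.fullmatch(config_value) is not None
```

A reviewer could run looks_like_hash over each caller entry in a config export and treat any value that fails the check as a possible raw secret that needs follow-up.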
Acceptance checks:
- ordinary caller token can access allowed API paths;
- ordinary caller token receives 403 metrics-forbidden on /metrics;
- metrics-admin token can scrape /metrics;
- ordinary caller token receives 403 content-forbidden on content-capture maintenance endpoints;
- ordinary caller token receives 403 reports-forbidden on /admin/reports/* when reports are enabled;
- authorized Basic Auth or OIDC session report subject can read /admin/reports/api/summary?since=24h;
- /v1/models returns only groups allowed for the presented token.
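The status-code expectations in the acceptance checks above can be captured as data and evaluated by a small harness. This is a sketch under stated assumptions: the token labels are placeholders, a successful scrape is assumed to return 200, and the fetcher is injected so the expectations can be exercised without a live router. Check, CHECKS, and run_checks are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List

# A fetcher takes (path, token_label) and returns the HTTP status code.
# In a real run it would wrap an HTTP client pointed at the router.
Fetcher = Callable[[str, str], int]


@dataclass(frozen=True)
class Check:
    description: str
    path: str
    token: str  # placeholder label for the token to present
    expected_status: int


# Paths and 403 outcomes come from the checklist; 200 for an allowed
# scrape is an assumption. The reports check applies only when reports
# are enabled.
CHECKS: List[Check] = [
    Check("ordinary token is denied on /metrics", "/metrics", "ordinary", 403),
    Check("metrics-admin token can scrape /metrics", "/metrics", "metrics-admin", 200),
    Check(
        "ordinary token is denied on reports",
        "/admin/reports/api/summary?since=24h",
        "ordinary",
        403,
    ),
]


def run_checks(fetch: Fetcher, checks: List[Check] = CHECKS) -> List[str]:
    """Return failure messages; an empty list means every check passed."""
    failures = []
    for check in checks:
        actual = fetch(check.path, check.token)
        if actual != check.expected_status:
            failures.append(
                f"{check.description}: expected {check.expected_status}, got {actual}"
            )
    return failures
```

Keeping the expectations in a list makes it straightforward to attach the same table to the sign-off record alongside the run result.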
Diagnostics And Data Handling
Confirm diagnostic records exclude:
- raw prompts;
- raw image payloads;
- raw router tokens;
- token hashes;
- provider API keys;
- full upstream headers;
- unsanitized upstream response bodies.
Expected diagnostics include request IDs, selected provider/model, attempt summaries, status, latency, sanitized errors, token counts, image counters, cost fields, cache behavior, and fallback events.
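One way to test the exclusion list is to scan sampled diagnostic records for forbidden field names. The sketch below assumes records are available as nested dictionaries and lists; the key names in FORBIDDEN_KEYS are hypothetical and would need to be mapped to the field names the deployment actually emits. It matches on key names only, so it does not detect sensitive values stored under an innocuous key.

```python
from typing import Any, Iterator

# Hypothetical field names; map these to the keys the deployment's
# diagnostic records actually use.
FORBIDDEN_KEYS = {
    "prompt",
    "image_payload",
    "router_token",
    "token_hash",
    "provider_api_key",
    "upstream_headers",
    "upstream_body",
}


def find_forbidden_fields(record: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of forbidden keys found anywhere in a nested record."""
    if isinstance(record, dict):
        for key, value in record.items():
            child = f"{path}.{key}" if path else str(key)
            if str(key).lower() in FORBIDDEN_KEYS:
                yield child
            # Keep descending so nested violations are also reported.
            yield from find_forbidden_fields(value, child)
    elif isinstance(record, list):
        for index, item in enumerate(record):
            yield from find_forbidden_fields(item, f"{path}[{index}]")
```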
Governed content capture is disabled by default. If a deployment enables it, confirm captured rows are redacted before storage, keyed by request_id, subject to retention purge, and maintained through audited content:capture delete/purge operations scoped to the captured row's caller project/environment domain.
Network And Private Upstreams
For private vLLM, SGLang, Baseten-style, or other OpenAI-compatible upstreams:
- keep upstream endpoints on private network names where possible;
- expose the router as the governed ingress point;
- validate direct upstream reachability from the router host or network;
- validate router-level access from approved clients;
- keep private upstream tokens in the deployment environment;
- configure media-domain restrictions for VLM services that fetch image URLs.
For TypeScript router.fetchJSON and strategy: external policy egress:
- keep policy services inside trusted infrastructure because they receive routing context and eligible target metadata;
- use exact hostname allowlists, not suffixes or wildcards;
- prefer HTTPS for all non-local policy services;
- approve script_http.allow_http: true or external_policy.allow_http: true only for trusted internal endpoints that cannot use HTTPS;
- verify redirects are blocked when they point to non-allowlisted hosts, including loopback addresses that were not listed;
- enforce private-network and CIDR restrictions with deployment network policy until native router CIDR egress controls are added.
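The exact-hostname and redirect rules above can be illustrated with a short check. This is a sketch of the intended behavior, not the router's implementation: is_egress_allowed and is_redirect_chain_allowed are hypothetical names, and the allow_http parameter only mirrors the intent of the allow_http settings named in the list.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_egress_allowed(url: str, allowed_hosts: Iterable[str], allow_http: bool = False) -> bool:
    """Allow a URL only if its host exactly matches an allowlist entry.

    No suffix or wildcard matching is performed, and plain HTTP is
    rejected unless allow_http is set.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        if not allow_http:
            return False
    elif parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    return host != "" and host in {h.lower() for h in allowed_hosts}


def is_redirect_chain_allowed(urls: Iterable[str], allowed_hosts: Iterable[str], allow_http: bool = False) -> bool:
    """Every hop, including each redirect target, must pass the same check."""
    hosts = frozenset(allowed_hosts)
    return all(is_egress_allowed(u, hosts, allow_http) for u in urls)
```

With an allowlist containing only policy.internal.example, a redirect to http://127.0.0.1/ fails the chain check even when allow_http is set, because the loopback address was never listed.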
API Surface Review
Validate:
- /readyz and /version expose build metadata without secrets;
- /v1/chat/completions, /v1/responses, and /v1/messages enforce caller auth, allow lists, quotas, and target eligibility;
- /v1/usage returns caller-appropriate usage visibility;
- /admin/reports/*, when enabled, returns only safe scalar report data and local embedded assets after browser-admin identity plus Casbin authorization;
- /metrics is restricted to metrics-admin tokens;
- no private host paths, SSH details, provider keys, or raw tokens appear in hosted docs.
Dependency, Container, And Package Scans
Each deployment should follow the organization's security process for:
- container image vulnerability scanning;
- dependency scanning for packaged docs and runtime dependencies;
- verification that release images are built with the pinned Docker builder Go patch version, not a floating language image tag;
- secret scanning of release artifacts;
- static analysis or policy review for TypeScript routing scripts;
- review of third-party script dependencies before packaging;
- verification that the router does not install packages or fetch dependency code at runtime.
The Docker builder image is pinned to a patched Go toolchain tag so reachable Go standard-library advisories are controlled by the image patch version used for the release build.
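A release review could confirm the pin mechanically by scanning the Dockerfile. The sketch below assumes the builder stage uses an image whose final path segment is named golang and treats a tag as pinned only when it has a full major.minor.patch version; both the image name and the function name unpinned_go_images are assumptions, and digest-only references are conservatively flagged because they carry no visible patch version.

```python
import re
from typing import List

# Assumption: the builder stage uses a "golang" image. A pinned tag has
# major.minor.patch, optionally followed by a variant suffix.
_FROM_LINE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
_PINNED_GO_TAG = re.compile(r"^\d+\.\d+\.\d+(?:-[\w.]+)?$")


def unpinned_go_images(dockerfile_text: str) -> List[str]:
    """Return golang image references whose tag lacks an explicit patch version."""
    findings = []
    for line in dockerfile_text.splitlines():
        match = _FROM_LINE.match(line)
        if not match:
            continue
        image = match.group(1)
        last_segment = image.rsplit("/", 1)[-1]    # drop registry and namespace
        reference = last_segment.split("@", 1)[0]  # drop any digest suffix
        name, _, tag = reference.partition(":")
        if name != "golang":
            continue
        if not _PINNED_GO_TAG.match(tag):
            findings.append(image)
    return findings
```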
As of 2026-06-24, the docs build still reports moderate npm audit advisories through Docusaurus' gray-matter dependency on js-yaml@3. Docusaurus has no patched dependency path for that finding yet. The affected package is used during documentation build and content parsing, not in the router request path; deployment reviews should record the residual, keep authored docs inputs trusted, and re-run npm audit --prefix docs-site --audit-level=moderate when Docusaurus publishes a fix.
Security Sign-Off Record
Record:
- deployment version and build timestamp;
- config backup path;
- token policy summary;
- provider key storage method;
- metrics-admin token owner;
- private upstream network policy;
- scan tooling and result summary;
- accepted residual risks;
- rollback owner and rollback command path.
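The sign-off items above could be kept as a structured record so that incomplete sign-offs are easy to detect. This is a minimal sketch; the class and field names are hypothetical and simply mirror the list, and free-text strings are assumed for every field.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SignOffRecord:
    # Field names are hypothetical; they mirror the checklist items.
    deployment_version: str = ""
    build_timestamp: str = ""
    config_backup_path: str = ""
    token_policy_summary: str = ""
    provider_key_storage_method: str = ""
    metrics_admin_token_owner: str = ""
    private_upstream_network_policy: str = ""
    scan_tooling_and_results: str = ""
    accepted_residual_risks: str = ""
    rollback_owner: str = ""
    rollback_command_path: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields left empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

A review gate could refuse to mark the assessment complete while missing_fields returns a non-empty list.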