Report Examples

This page shows anonymized production-derived report excerpts. Names, public token IDs, request IDs, IP addresses, provider names, upstream endpoint names, model names, and deployment-specific group names have been replaced with generic labels. The examples preserve the shape of the reporting output so platform teams can see how GenAI Smart Router supports cost governance and performance triage.

The savings examples compare stored router cost against a documented reference baseline:

| Baseline | Input price | Output price |
| --- | --- | --- |
| Reference baseline A | $5.00 / 1M input tokens | $30.00 / 1M output tokens |

Savings are calculated as:

reference_cost = input_tokens  / 1,000,000 * reference_input_price
               + output_tokens / 1,000,000 * reference_output_price

savings = reference_cost - stored_router_cost

Router cost is the stored request-time cost from usage rows. It is not recalculated from current provider metadata.
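As a worked check of the formula, the following minimal Python sketch recomputes the reference cost and savings for the 2026-06-17 row of the daily table. The function and constant names are illustrative assumptions, not part of the router or the report tool; the prices come from the reference baseline table and the token counts and router cost from the daily excerpt.

```python
# Reference prices from the baseline table ($ per 1M tokens).
REFERENCE_INPUT_PRICE = 5.00
REFERENCE_OUTPUT_PRICE = 30.00


def reference_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of the same token volume under the reference baseline."""
    return (
        input_tokens / 1_000_000 * REFERENCE_INPUT_PRICE
        + output_tokens / 1_000_000 * REFERENCE_OUTPUT_PRICE
    )


def savings(input_tokens: int, output_tokens: int, stored_router_cost: float) -> float:
    """Reference cost minus the stored request-time router cost."""
    return reference_cost(input_tokens, output_tokens) - stored_router_cost


# 2026-06-17 row: 78,767,512 input tokens, 1,058,468 output tokens, $14.62 router cost.
ref = reference_cost(78_767_512, 1_058_468)
saved = savings(78_767_512, 1_058_468, 14.62)
print(f"reference=${ref:.2f} savings=${saved:.2f}")
# prints: reference=$425.59 savings=$410.97
```

The result matches the Reference cost and Savings columns for that day. Note that the router cost is passed in as a stored value and is never derived from prices, in line with the statement above.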

Graphical Summary

[Chart] Traffic: Daily request volume
[Chart] Savings: Router cost vs reference baseline
[Chart] User Experience: Average latency by day
[Chart] Caller Cohorts: Savings by anonymized team
[Chart] Caller Cohorts: Average latency by anonymized team
[Chart] Upstream Performance: Slowest anonymized upstream endpoints

Daily Usage And Savings

This excerpt answers: how much traffic ran each day, what did it cost through the router, what would it have cost under the reference baseline, and what user-facing latency did the deployment see?

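| Day (UTC) | Calls | Errors | Input tokens | Output tokens | Router cost | Reference cost | Savings | Avg latency (ms) | Max latency (ms) | Avg upstream output (tok/s) |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 2026-06-17 | 2,277 | 236 | 78,767,512 | 1,058,468 | $14.62 | $425.59 | $410.97 | 13,390 | 297,537 | 38.57 |
| 2026-06-18 | 2,773 | 372 | 61,897,631 | 1,520,263 | $45.20 | $355.10 | $309.90 | 11,782 | 303,098 | 50.52 |
| 2026-06-19 | 2,782 | 134 | 137,590,878 | 1,991,097 | $56.54 | $747.69 | $691.15 | 11,966 | 267,865 | 56.89 |
| 2026-06-20 | 1,725 | 71 | 86,558,339 | 708,877 | $31.75 | $454.06 | $422.31 | 10,665 | 121,148 | 52.59 |
| 2026-06-21 | 1,333 | 28 | 200,204,234 | 420,117 | $70.47 | $1,013.62 | $943.15 | 7,765 | 120,462 | 43.60 |
| 2026-06-22 | 3,142 | 349 | 116,476,313 | 1,037,342 | $135.54 | $613.50 | $477.97 | 8,224 | 259,202 | 47.22 |
| Total | 14,032 | 1,190 | 681,494,907 | 6,736,164 | $354.12 | $3,609.56 | $3,255.44 | 10,632 | 303,098 | 48.23 |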

The daily view is useful for spend reviews, quota planning, rollout comparisons, and incident timelines. A rising error count or max latency spike can be investigated with the upstream endpoint and per-request sections in the generated report.
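One possible way to use the daily view for incident timelines is to convert the error counts into error rates and flag days that exceed the overall rate for the window. The sketch below does that with the Calls and Errors columns from the table above. Comparing against the overall rate is an assumption made for illustration, not a threshold the report itself applies.

```python
# (day, calls, errors) copied from the daily table above.
daily = [
    ("2026-06-17", 2277, 236),
    ("2026-06-18", 2773, 372),
    ("2026-06-19", 2782, 134),
    ("2026-06-20", 1725, 71),
    ("2026-06-21", 1333, 28),
    ("2026-06-22", 3142, 349),
]

total_calls = sum(calls for _, calls, _ in daily)
total_errors = sum(errors for _, _, errors in daily)
overall_rate = total_errors / total_calls

# Illustrative rule: flag any day whose error rate is above the window's overall rate.
flagged = []
for day, calls, errors in daily:
    rate = errors / calls
    if rate > overall_rate:
        flagged.append(day)
    print(f"{day}  error rate {rate:.1%}")

print(f"overall {overall_rate:.1%}; above overall: {flagged}")
```

With these figures the flagged days are 2026-06-17, 2026-06-18, and 2026-06-22, which would be the natural starting points for the upstream endpoint and per-request sections.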

Caller Usage And Savings

This excerpt answers: which caller cohorts are driving spend, savings, token volume, and latency?

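| Caller cohort | Project | Client | Calls | Errors | Input tokens | Output tokens | Router cost | Reference cost | Savings | Avg latency (ms) | Max latency (ms) | Avg downstream write output (tok/s) |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Team A | Project A | Client A | 1,634 | 65 | 187,520,679 | 802,818 | $64.50 | $961.69 | $897.19 | 13,442 | 121,148 | 509,579.56 |
| Team B | Project B | Client B | 1,014 | 21 | 175,094,108 | 292,520 | $63.80 | $884.25 | $820.44 | 7,849 | 120,462 | 291,263.34 |
| Team C | Project C | Client C | 1,595 | 124 | 50,312,249 | 900,358 | $41.48 | $278.57 | $237.09 | 16,988 | 303,098 | 696,380.87 |
| Team D | Project D | Client D | 470 | 51 | 42,551,897 | 222,225 | $12.96 | $219.43 | $206.47 | 10,643 | 121,009 | 528,595.47 |
| Team E | Project E | Client E | 498 | 27 | 36,619,786 | 146,357 | $86.28 | $187.49 | $101.21 | 5,363 | 61,406 | 78,517.37 |
| Team F | Project F | Client F | 784 | 64 | 32,061,835 | 330,830 | $5.30 | $170.23 | $164.93 | 14,971 | 295,382 | 460,312.09 |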

The caller view supports chargeback, quota reviews, and support triage. For example, a team with moderate cost but high average latency may need a different model-group policy, a higher timeout, or a provider mix with stronger streaming behavior.
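For chargeback or quota reviews, it can help to rank cohorts by the fraction of reference cost they save rather than by absolute savings. The sketch below computes that ratio from the Router cost and Reference cost columns above. The `savings_rate` helper is a hypothetical name introduced here, and the ranking is one possible reading of the table, not an output of the report tool.

```python
# (cohort, router_cost, reference_cost) copied from the caller table above.
cohorts = [
    ("Team A", 64.50, 961.69),
    ("Team B", 63.80, 884.25),
    ("Team C", 41.48, 278.57),
    ("Team D", 12.96, 219.43),
    ("Team E", 86.28, 187.49),
    ("Team F", 5.30, 170.23),
]


def savings_rate(router_cost: float, reference_cost: float) -> float:
    """Fraction of the reference cost that was saved (hypothetical helper)."""
    return (reference_cost - router_cost) / reference_cost


# Lowest savings rate first: these cohorts are closest to reference pricing.
ranked = sorted(cohorts, key=lambda row: savings_rate(row[1], row[2]))
for name, router_cost, ref_cost in ranked:
    print(f"{name}: {savings_rate(router_cost, ref_cost):.1%} of reference cost saved")
```

In this excerpt Team E ranks lowest at roughly 54%, even though its average latency is the best of the six, which illustrates why cost and latency are worth reading together.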

Upstream Endpoint Performance

This excerpt answers: which anonymized upstream endpoint/API combinations are slow, error-prone, or fallback-heavy?

| Endpoint | API shape | Calls / Errors / Attempts / Fallbacks / Total tokens | Router cost | Avg upstream (ms) | Max upstream (ms) | Avg upstream output (tok/s) | Avg latency (ms) |
| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Endpoint A | API A | 511116,742 | $0.01 | 42,969 | 132,688 | 46.75 | 42,972 |
| Endpoint B | API B | 561577004733,287,316 | $1.31 | 36,077 | 303,067 | 22.66 | 36,208 |
| Endpoint C | API C | 2216232319,435,221 | $4.04 | 25,410 | 119,132 | 76.69 | 25,908 |
| Endpoint D | API D | 100100212,619 | $0.09 | 22,855 | 122,641 | 38.75 | 22,935 |
| Endpoint E | API E | 52094302,201,353 | $0.24 | 21,015 | 86,371 | 122.02 | 21,019 |
| Endpoint F | API F | 26233751111,870,726 | $1.98 | 20,581 | 205,667 | 55.28 | 19,419 |
| Endpoint G | API G | 3703601,938,639 | $2.65 | 20,572 | 116,179 | 20.43 | 20,441 |
| Endpoint H | API H | 120120288,474 | $0.10 | 19,272 | 64,730 | 34.41 | 19,459 |

The endpoint view is the fastest place to identify upstreams that compromise user experience. High average upstream duration, low token throughput, elevated attempts, or fallback pressure can justify lowering a target weight, changing timeout policy, isolating a provider to lower-priority groups, or opening a provider incident.
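The triage described above can be sketched as a simple threshold check over the average upstream duration and upstream output throughput columns. Both thresholds below are hypothetical values chosen only to illustrate the idea; a real deployment would pick limits that match its own latency targets.

```python
# (endpoint, avg_upstream_ms, avg_upstream_output_tok_per_s) from the endpoint table above.
endpoints = [
    ("Endpoint A", 42969, 46.75),
    ("Endpoint B", 36077, 22.66),
    ("Endpoint C", 25410, 76.69),
    ("Endpoint D", 22855, 38.75),
    ("Endpoint E", 21015, 122.02),
    ("Endpoint F", 20581, 55.28),
    ("Endpoint G", 20572, 20.43),
    ("Endpoint H", 19272, 34.41),
]

# Hypothetical thresholds, for illustration only.
MAX_AVG_UPSTREAM_MS = 30_000
MIN_OUTPUT_TOK_PER_S = 25.0


def triage(avg_ms: float, tok_per_s: float) -> list[str]:
    """Return the reasons an endpoint deserves a closer look, if any."""
    reasons = []
    if avg_ms > MAX_AVG_UPSTREAM_MS:
        reasons.append("slow upstream")
    if tok_per_s < MIN_OUTPUT_TOK_PER_S:
        reasons.append("low throughput")
    return reasons


flags = {}
for name, avg_ms, tok_per_s in endpoints:
    reasons = triage(avg_ms, tok_per_s)
    if reasons:
        flags[name] = reasons

for name, reasons in flags.items():
    print(f"{name}: {', '.join(reasons)}")
```

Under these assumed limits, Endpoint A is flagged as slow, Endpoint G for low throughput, and Endpoint B for both. Attempts and fallback counts could be added as further signals in the same way.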

Report Generation Commands

Generate a usage report for a recent window:

router-usage-report \
  --driver postgres \
  --dsn "$ROUTER_USAGE_DB_DSN" \
  --since 7d \
  --out usage-7d.md

Generate a report for one caller cohort or rollout:

router-usage-report \
  --driver postgres \
  --dsn "$ROUTER_USAGE_DB_DSN" \
  --caller-user <owner-user> \
  --caller-project <project> \
  --caller-environment <environment> \
  --out usage-project.md

Generated markdown reports include usage by internal key, caller, project, environment, caller IP, model group, provider/model, status, cache behavior, hour, day, downstream user performance, upstream endpoint performance, and per-request throughput. Savings tables are produced by comparing the generated report totals with a separately documented reference baseline.