prxy.monster claims should be tied to commands, traces, and receipts. This page publishes the local module benchmark suite and the exact caveat that matters: these are synthetic fixtures using in-memory storage and stub embeddings, not a promise about every production workload.
Fresh local run: 2026-05-04 on darwin-arm64 / Node v22.22.0 via pnpm --filter @prxy/benchmarks bench. Run prxy bench --remote against your own endpoint for live workload numbers.
MCP Optimizer: 33.4% average tool-token reduction across five synthetic 120-tool MCP scenarios. Scenario range: 13.8% to 53.0%.
Semantic Cache: 98.5% hit rate in a repeated-seed synthetic workload of 1,000 queries over 10 seeds. Real hit rate depends on repetition and cache policy.
IPC: 98.7% token reduction in the 90% context-utilization fixture. IPC does not trigger below the configured utilization threshold.
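
The thresholded behavior of IPC can be pictured with a small gate function. This is a minimal sketch, not the module's implementation: `ContextState`, `shouldCompress`, the 0.7 threshold, and the 200k-token window are all hypothetical names and values chosen for illustration. The published results only imply that the configured threshold in the fixture lies above 50% and at or below 75% utilization.

```typescript
// Illustrative sketch only: these names are hypothetical, not the prxy API.
interface ContextState {
  usedTokens: number;   // tokens currently held in the context
  windowTokens: number; // total context window size
}

// Assumed value for illustration; the real threshold is whatever is configured.
const ASSUMED_THRESHOLD = 0.7;

// Fraction of the context window in use, from 0 to 1.
function utilization(ctx: ContextState): number {
  if (ctx.windowTokens <= 0) throw new RangeError("windowTokens must be positive");
  return ctx.usedTokens / ctx.windowTokens;
}

// Compression is gated: below the threshold the context passes through untouched.
// Treating the threshold as inclusive (>=) is an assumption of this sketch.
function shouldCompress(ctx: ContextState, threshold: number = ASSUMED_THRESHOLD): boolean {
  return utilization(ctx) >= threshold;
}

// Walk through 50%, 75%, and 90% utilization on a hypothetical 200k window.
const windowTokens = 200_000;
for (const pct of [50, 75, 90]) {
  const ctx: ContextState = { usedTokens: (windowTokens * pct) / 100, windowTokens };
  console.log(`${pct}% utilization -> compress: ${shouldCompress(ctx)}`);
}
```

With a gate like this, a workload that rarely fills its context window would see no IPC savings at all, which is why the utilization level of a fixture matters when reading its result.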
```sh
# local deterministic suite from the monorepo
pnpm --filter @prxy/benchmarks bench

# installed CLI, live authenticated endpoint
prxy bench --remote -n 100
```
Local mode proves module mechanics in a deterministic fixture. Remote mode produces the numbers that matter for your own project because it exercises your prompts, tool catalog, cache policy, model mix, and network path.
| Module | Measured fixture | Latest result | How to read it |
|---|---|---|---|
| MCP Optimizer | 120 synthetic tools, five task scenarios | 33.4% average token reduction, p95 1.32ms | Shows request-scoped pruning mechanics. Bigger and noisier MCP catalogs can change the result. |
| Exact Cache | 200 iterations; static 1.8s simulated provider call | 0.004ms average cache-hit lookup | Shows local cache lookup overhead. It does not call a real provider. |
| Semantic Cache | 1,000 repeated-seed synthetic queries | 98.5% hit rate, p95 0.03ms lookup | Shows behavior on repeated questions. Treat this as a fixture result, not a production average. |
| IPC | 50%, 75%, 90% context-utilization fixtures | 0% at 50%; 98.4% at 75%; 98.7% at 90% | Shows thresholded compression: no compression until the utilization threshold is crossed. |
| Prompt Optimizer | 100 prefix-stability iterations | 100% prefix stability with the optimizer on, 1% with it off | Shows stable prompt prefix placement for provider prompt-cache friendliness. |
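
To make the table's metrics easier to check against your own runs, here is a hedged sketch of the arithmetic behind them: percent token reduction, mean, a p95 latency, and a hit-rate ceiling. The helper names are hypothetical and are not taken from the benchmark suite. The nearest-rank percentile method is an assumption, since the suite's exact percentile method is not stated here. The hit-rate ceiling further assumes a cold cache and that queries from different seeds never match each other.

```typescript
// Hypothetical helpers for illustration; not the benchmark suite's actual code.

// Percent of tokens removed: (before - after) / before * 100.
function reductionPct(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

// Arithmetic mean of a non-empty list.
function mean(values: number[]): number {
  if (values.length === 0) throw new RangeError("no samples");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile (assumed method): the smallest sample such that
// at least p percent of all samples are at or below it.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.min(sorted.length, Math.max(1, Math.ceil((p / 100) * sorted.length)));
  const value = sorted[rank - 1];
  if (value === undefined) throw new RangeError("no samples");
  return value;
}

// Upper bound on hit rate from a cold cache: every distinct query misses once.
function coldCacheHitCeilingPct(totalQueries: number, distinctQueries: number): number {
  if (totalQueries <= 0) throw new RangeError("totalQueries must be positive");
  return ((totalQueries - distinctQueries) / totalQueries) * 100;
}

// Made-up sample data, not results from the suite.
const perScenario = [reductionPct(12_000, 9_000), reductionPct(12_000, 6_000)];
console.log(`average reduction: ${mean(perScenario).toFixed(1)}%`);
console.log(`p95 latency: ${percentile([0.8, 1.1, 0.9, 1.3, 1.0], 95)} ms`);
console.log(`hit-rate ceiling: ${coldCacheHitCeilingPct(1_000, 10).toFixed(1)}%`);
```

Under those assumptions, 1,000 queries over 10 seeds cannot exceed a 99.0% hit rate, so the reported 98.5% (985 hits) sits close to the ceiling for that fixture. A workload with less repetition has a lower ceiling before cache policy even enters the picture.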