Reproducible proof

Benchmarks you can rerun.

prxy.monster claims should be tied to commands, traces, and receipts. This page publishes the local module benchmark suite and the exact caveat that matters: these are synthetic fixtures using in-memory storage and stub embeddings, not a promise about every production workload.

Fresh local run: 2026-05-04 on darwin-arm64 / Node v22.22.0 via pnpm --filter @prxy/benchmarks bench. Run prxy bench --remote against your own endpoint for live workload numbers.

Latest Local Run

MCP optimizer
33.4%

Average tool-token reduction across five synthetic 120-tool MCP scenarios. Scenario range: 13.8% to 53.0%.
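
The headline figure is an average of per-scenario reductions. A minimal sketch of that arithmetic follows, assuming each scenario reports tool-token counts before and after pruning and that the mean is unweighted. The type, the function names, and the sample counts are hypothetical and are not taken from the benchmark suite.

```typescript
// Hypothetical shape: tool-definition tokens sent before and after pruning.
interface ScenarioResult {
  scenario: string;
  tokensBefore: number;
  tokensAfter: number;
}

// Percentage of tool tokens removed in a single scenario.
function reductionPct(r: ScenarioResult): number {
  if (r.tokensBefore <= 0) return 0;
  return ((r.tokensBefore - r.tokensAfter) * 100) / r.tokensBefore;
}

// Unweighted mean across scenarios (an assumption; the suite may weight differently).
function averageReductionPct(results: ScenarioResult[]): number {
  if (results.length === 0) return 0;
  const total = results.reduce((sum, r) => sum + reductionPct(r), 0);
  return total / results.length;
}

// Illustrative numbers only, not measured results.
const sample: ScenarioResult[] = [
  { scenario: "scenario-a", tokensBefore: 10000, tokensAfter: 8000 },
  { scenario: "scenario-b", tokensBefore: 10000, tokensAfter: 5000 },
];
const sampleAverage = averageReductionPct(sample); // mean of 20% and 50%
```

Reporting the range alongside the mean matters because a single average can hide scenarios where pruning removes very little.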

Semantic cache
98.5%

Hit rate in a repeated-seed synthetic workload: 1,000 queries over 10 seeds. Real hit rate depends on repetition and cache policy.
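
To see why hit rate tracks repetition, here is a toy model in which queries that share a seed count as equivalent. It uses an exact-key set as a stand-in for embedding similarity, which is an assumption made only for illustration. It is not the suite's semantic cache and is not intended to reproduce the 98.5% figure.

```typescript
// Toy stand-in for a semantic cache: the first query for a seed is a miss
// that warms the cache; every later query for the same seed is a hit.
function simulateHitRate(totalQueries: number, distinctSeeds: number): number {
  if (totalQueries <= 0 || distinctSeeds <= 0) return 0;
  const seen = new Set<number>();
  let hits = 0;
  for (let i = 0; i < totalQueries; i++) {
    const seed = i % distinctSeeds; // round-robin over the seeds
    if (seen.has(seed)) {
      hits++;
    } else {
      seen.add(seed);
    }
  }
  return (hits * 100) / totalQueries;
}

const repetitive = simulateHitRate(100, 5); // 95 hits out of 100
const diverse = simulateHitRate(100, 50);   // 50 hits out of 100
```

The same cache scores very differently once the workload contains more distinct questions, which is why a repeated-seed fixture is an upper-end illustration rather than a forecast.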

IPC compression
98.7%

Token reduction in the 90% context-utilization fixture. IPC does not trigger below the configured utilization threshold.
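
The threshold gate can be sketched as follows. The threshold value, the summarize-older-messages strategy, the fixed summary cost, and all names here are hypothetical. The fixtures on this page only show no compression at 50% utilization and compression at 75%.

```typescript
interface ChatMessage {
  role: string;
  content: string;
  tokens: number;
}

// Hypothetical value; the real threshold is configurable.
const ASSUMED_THRESHOLD = 0.7;

function maybeCompress(
  messages: ChatMessage[],
  contextWindow: number,
  threshold: number = ASSUMED_THRESHOLD,
  keepRecent: number = 2,
): ChatMessage[] {
  if (contextWindow <= 0) return messages;
  const used = messages.reduce((sum, m) => sum + m.tokens, 0);
  const utilization = used / contextWindow;

  // Below the threshold the request passes through untouched.
  if (utilization < threshold || messages.length <= keepRecent) return messages;

  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);

  // Placeholder summary; a real compressor would generate this text.
  const summary: ChatMessage = {
    role: "system",
    content: `[summary of ${older.length} earlier messages]`,
    tokens: 50, // hypothetical fixed cost
  };
  return [summary, ...recent];
}
```

Because the gate is all-or-nothing, reduction jumps from zero to a large value once utilization crosses the threshold, instead of growing gradually with context size.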

Run It

# local deterministic suite from the monorepo
pnpm --filter @prxy/benchmarks bench

# installed CLI, live authenticated endpoint
prxy bench --remote -n 100

Local mode proves module mechanics in a deterministic fixture. Remote mode produces the numbers that matter for your own project, because it exercises your prompts, tool catalog, cache policy, model mix, and network path.

Scenario Details

| Module | Measured fixture | Latest result | How to read it |
| --- | --- | --- | --- |
| MCP Optimizer | 120 synthetic tools, five task scenarios | 33.4% average token reduction, p95 1.32ms | Shows request-scoped pruning mechanics. Bigger and noisier MCP catalogs can change the result. |
| Exact Cache | 200 iterations; static 1.8s simulated provider call | 0.004ms average cache-hit lookup | Shows local cache lookup overhead. It does not call a real provider. |
| Semantic Cache | 1,000 repeated-seed synthetic queries | 98.5% hit rate, p95 0.03ms lookup | Shows behavior on repeated questions. Treat this as a fixture result, not a production average. |
| IPC | 50%, 75%, 90% context-utilization fixtures | 0% at 50%; 98.4% at 75%; 98.7% at 90% | Shows thresholded compression: no compression until the utilization threshold is crossed. |
| Prompt Optimizer | 100 prefix-stability iterations | 100% prefix stability with the optimizer on, 1% with it off | Shows stable prompt prefix placement for provider prompt-cache friendliness. |
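
Prefix stability can be illustrated with a small sketch. It assumes stability means the share of iterations whose leading characters match the first iteration, and that the stable layout puts static content first in a deterministic order with per-request content last. The types, layouts, and metric are hypothetical and may differ from what the suite measures.

```typescript
interface PromptParts {
  system: string;
  tools: string[];  // arrival order may vary between requests
  volatile: string; // per-request content such as ids or the user turn
}

// Stable layout: static content first, tools sorted, volatile content last.
function stableLayout(p: PromptParts): string {
  const tools = [...p.tools].sort(); // copy, then sort for a deterministic order
  return [p.system, ...tools, p.volatile].join("\n");
}

// Unstable layout: volatile content leads, so the prefix changes every request.
function unstableLayout(p: PromptParts): string {
  return [p.volatile, p.system, ...p.tools].join("\n");
}

// Share of prompts whose first `prefixLength` characters match the first prompt.
function prefixStabilityPct(prompts: string[], prefixLength: number): number {
  if (prompts.length === 0) return 0;
  const reference = prompts[0].slice(0, prefixLength);
  const stable = prompts.filter((p) => p.slice(0, prefixLength) === reference).length;
  return (stable * 100) / prompts.length;
}

// 100 iterations: same static content, alternating tool order, unique request id.
const requests: PromptParts[] = Array.from({ length: 100 }, (_, i) => ({
  system: "You are a helpful assistant.",
  tools: i % 2 === 0 ? ["search", "calc"] : ["calc", "search"],
  volatile: `request-id: ${i}`,
}));

const prefixLength = "You are a helpful assistant.\ncalc\nsearch".length;
const stabilityOn = prefixStabilityPct(requests.map(stableLayout), prefixLength);
const stabilityOff = prefixStabilityPct(requests.map(unstableLayout), prefixLength);
```

In this sketch the stable layout scores 100% and the unstable layout scores 1%, because only the first iteration matches itself. Provider prompt caches generally key on an identical leading span, so content that changes per request belongs after the static material.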