Benchmarks · prxy.monster

Latest Local Run

MCP optimizer

33.4%

Average tool-token reduction across five synthetic 120-tool MCP scenarios. Scenario range: 13.8% to 53.0%.
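As a rough illustration of how a figure like this can be derived, the sketch below computes a per-scenario reduction percentage and then the mean and range across scenarios. It assumes the headline number is an unweighted mean of per-scenario percentages, which the page does not confirm. The `ScenarioResult` shape, the function names, and the token counts are all hypothetical.

```typescript
// Hypothetical shape for one scenario's result; not the benchmark suite's real type.
interface ScenarioResult {
  name: string;
  tokensBefore: number; // tool-definition tokens sent without pruning
  tokensAfter: number; // tool-definition tokens sent after pruning
}

// Percentage of tool tokens removed in a single scenario.
function reductionPct(r: ScenarioResult): number {
  return (100 * (r.tokensBefore - r.tokensAfter)) / r.tokensBefore;
}

// Unweighted mean plus min/max across scenarios (assumed aggregation).
function summarize(results: ScenarioResult[]): { mean: number; min: number; max: number } {
  const pcts = results.map(reductionPct);
  const mean = pcts.reduce((a, b) => a + b, 0) / pcts.length;
  return { mean, min: Math.min(...pcts), max: Math.max(...pcts) };
}

// Made-up token counts for illustration only; these are NOT the benchmark's data.
const exampleScenarios: ScenarioResult[] = [
  { name: "scenario-a", tokensBefore: 10000, tokensAfter: 8000 },
  { name: "scenario-b", tokensBefore: 10000, tokensAfter: 5000 },
];

const reductionSummary = summarize(exampleScenarios);
console.log(reductionSummary);
```

An unweighted mean treats every scenario equally. A token-weighted average would instead favor scenarios with larger tool catalogs, so the two can differ when scenario sizes vary.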

Semantic cache

98.5%

Hit rate in a repeated-seed synthetic workload: 1,000 queries over 10 seeds. Real hit rates depend on how often queries repeat and on cache policy.
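To put the number in context, the sketch below simulates the simplest possible version of this workload: queries cycle through a fixed set of seeds, the first query for each seed misses, and every repeat hits. This is a simplified model and not the benchmark's implementation; the function name and the round-robin ordering are assumptions. Under that model, 1,000 queries over 10 seeds cannot exceed a 99.0% hit rate, so the measured 98.5% sits slightly below the ceiling. The page does not state what accounts for the difference.

```typescript
// Simplified model: a cache that hits whenever the seed has been seen before.
// Round-robin query order is an assumption made for illustration.
function repeatedSeedHitRate(totalQueries: number, seedCount: number): number {
  const seen = new Set<number>();
  let hits = 0;
  for (let i = 0; i < totalQueries; i++) {
    const seed = i % seedCount;
    if (seen.has(seed)) {
      hits++; // repeat of a known seed
    } else {
      seen.add(seed); // first occurrence always misses
    }
  }
  return (100 * hits) / totalQueries;
}

const hitRateCeiling = repeatedSeedHitRate(1000, 10);
console.log(`ceiling hit rate: ${hitRateCeiling}%`);
```

The same function shows why the figure is workload-specific: with as many seeds as queries, every query is a first occurrence and the hit rate is zero.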

IPC compression

98.7%

Token reduction in the 90% context-utilization fixture. IPC does not trigger below the configured utilization threshold.
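The thresholded behavior can be pictured as a simple gate on context utilization. The sketch below is a minimal illustration and not the module's code: `ContextState`, `shouldCompress`, and the 0.7 threshold are hypothetical. The fixture results on this page only imply that the configured threshold lies above 50% and at or below 75%.

```typescript
// Hypothetical view of how full the model's context window is.
interface ContextState {
  usedTokens: number;
  windowTokens: number;
}

// Assumed value for illustration; the real threshold is configurable and not stated here.
const ASSUMED_THRESHOLD = 0.7;

// Compression is considered only once utilization reaches the threshold.
function shouldCompress(state: ContextState, threshold: number = ASSUMED_THRESHOLD): boolean {
  const utilization = state.usedTokens / state.windowTokens;
  return utilization >= threshold;
}

// Mirror the three fixture utilization levels on a made-up 100,000-token window.
for (const pct of [50, 75, 90]) {
  const state: ContextState = { usedTokens: pct * 1000, windowTokens: 100000 };
  console.log(`${pct}% utilization -> compress: ${shouldCompress(state)}`);
}
```

A gate like this keeps short conversations untouched, which is consistent with the 0% reduction reported for the 50% fixture.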

Run It

# local deterministic suite from the monorepo
pnpm --filter @prxy/benchmarks bench

# installed CLI, live authenticated endpoint
prxy bench --remote -n 100

Local mode verifies module mechanics against deterministic fixtures. Remote mode is the number that matters for your own project because it exercises your prompts, tool catalog, cache policy, model mix, and network path.

Scenario Details

Module	Measured fixture	Latest result	How to read it
MCP Optimizer	120 synthetic tools, five task scenarios	33.4% average token reduction, p95 1.32ms	Shows request-scoped pruning mechanics. Bigger and noisier MCP catalogs can change the result.
Exact Cache	200 iterations; static 1.8s simulated provider call	0.004ms average cache-hit lookup	Shows local cache lookup overhead. It does not call a real provider.
Semantic Cache	1,000 repeated-seed synthetic queries	98.5% hit rate, p95 0.03ms lookup	Shows behavior on repeated questions. Treat this as a fixture result, not a production average.
IPC	50%, 75%, 90% context-utilization fixtures	0% at 50%; 98.4% at 75%; 98.7% at 90%	Shows thresholded compression: no compression until the utilization threshold is crossed.
Prompt Optimizer	100 prefix-stability iterations	100% prefix stability with the optimizer on, 1% with it off	Shows stable prompt prefix placement for provider prompt-cache friendliness.
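One way to picture the prefix-stability measurement in the last row is to build the same prompt repeatedly and count how often its leading characters match the first iteration. The sketch below is a toy model under stated assumptions and not the Prompt Optimizer's implementation: the system text, the `request-id` field, and both function names are hypothetical. In this model the stable layout puts per-request content after the fixed instructions, while the unstable layout puts it first.

```typescript
// Hypothetical fixed instructions shared by every request.
const SYSTEM_TEXT = "You are a helpful assistant. Follow the tool policy.";

// Build a prompt with per-request content either after or before the fixed text.
function buildPrompt(iteration: number, stablePrefix: boolean): string {
  const volatile = `request-id: ${iteration}`; // changes on every request
  return stablePrefix ? `${SYSTEM_TEXT}\n${volatile}` : `${volatile}\n${SYSTEM_TEXT}`;
}

// Percentage of iterations whose leading characters equal those of iteration 0.
function prefixStabilityPct(
  iterations: number,
  stablePrefix: boolean,
  prefixLen: number = SYSTEM_TEXT.length,
): number {
  const baseline = buildPrompt(0, stablePrefix).slice(0, prefixLen);
  let stable = 0;
  for (let i = 0; i < iterations; i++) {
    if (buildPrompt(i, stablePrefix).slice(0, prefixLen) === baseline) stable++;
  }
  return (100 * stable) / iterations;
}

const stabilityOn = prefixStabilityPct(100, true);
const stabilityOff = prefixStabilityPct(100, false);
console.log({ stabilityOn, stabilityOff });
```

In this toy model the unstable layout still scores 1 of 100 because the first iteration trivially matches itself. Whether the fixture's 1% arises the same way is not stated on this page.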