Agent Model Router & Cost Optimizer
ReformCode routes specialist-agent work through a deterministic model router before the agent starts. The router chooses the best executable model for architecture, implementation, review, testing, security, and fix-forward work using quality, latency, cost, context size, provider health, and historical acceptance outcomes.
Why It Exists
Model choice is a product-quality decision, not a preference dropdown. A cheap model can be excellent for a small review but wrong for a risky implementation. A powerful model can be wasteful for deterministic security checks. The router makes that tradeoff explicit and auditable.
Routing Signals
- Task fit: Each candidate declares whether it supports architect, implementer, reviewer, tester, security, or fix-forward work.
- Execution mode: Tool-using agents are constrained to models with the current agent tool-loop adapter. This prevents routing to a model that looks good on paper but cannot safely execute the workspace loop.
- Quality score: Candidate quality is scored per task, with higher weight for complex implementation and fix-forward tasks.
- Latency: Median latency and optional latency budgets influence fast-path decisions.
- Cost: Estimated input/output token cost is converted into credit estimates and used by economy/balanced routing.
- Context size: Large prompts and workspaces route toward long-context models when the task does not require the current tool-loop adapter.
- Provider health: Unhealthy providers are demoted rather than removed, so an escape route remains even when every configured provider is unhealthy.
- Historical outcomes: Telemetry can lift models that have produced accepted outcomes for similar agent roles.
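The signals above can be combined into a single deterministic ranking. The following is a minimal sketch, not the ReformCode implementation: the `Candidate` fields, the weight table, the `quality` strategy name, the penalty constants, and the choice to treat context overflow as a hard filter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    # All field names are hypothetical; they mirror the signals listed above.
    model_id: str
    provider: str
    roles: FrozenSet[str]              # task fit
    agent_tool_loop: bool              # execution mode
    quality: float                     # assumed 0..1 baseline
    median_latency_ms: float
    credits_per_1k_tokens: float
    context_window: int
    healthy: bool = True
    acceptance_rate: Optional[float] = None  # historical outcomes, if known


# Assumed weights per routing strategy (illustrative values only).
WEIGHTS = {
    "economy": {"quality": 0.3, "latency": 0.2, "cost": 0.5},
    "balanced": {"quality": 0.5, "latency": 0.2, "cost": 0.3},
    "quality": {"quality": 0.8, "latency": 0.1, "cost": 0.1},
}
UNHEALTHY_PENALTY = 1.0  # demotes, but never removes, an unhealthy provider
HISTORY_LIFT = 0.2       # scales the bonus for accepted past outcomes


def score(c: Candidate, role: str, strategy: str,
          needs_tool_loop: bool, prompt_tokens: int) -> Optional[float]:
    """Return a score, or None when the candidate is ineligible."""
    if role not in c.roles:
        return None
    if needs_tool_loop and not c.agent_tool_loop:
        return None
    if prompt_tokens > c.context_window:
        return None  # assumption: overflow is a hard filter in this sketch
    w = WEIGHTS[strategy]
    latency_score = 1.0 / (1.0 + c.median_latency_ms / 1000.0)
    cost_score = 1.0 / (1.0 + c.credits_per_1k_tokens)
    total = (w["quality"] * c.quality
             + w["latency"] * latency_score
             + w["cost"] * cost_score)
    if c.acceptance_rate is not None:
        total += HISTORY_LIFT * (c.acceptance_rate - 0.5)
    if not c.healthy:
        total -= UNHEALTHY_PENALTY
    return total


def rank(candidates: List[Candidate], role: str, strategy: str,
         needs_tool_loop: bool, prompt_tokens: int) -> List[str]:
    """Rank eligible model IDs, best first, with a stable tie-break."""
    scored = []
    for c in candidates:
        s = score(c, role, strategy, needs_tool_loop, prompt_tokens)
        if s is not None:
            scored.append((s, c.model_id))
    # Sorting on (-score, model_id) keeps the result deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [model_id for _, model_id in scored]
```

In this sketch the first ID returned by `rank` would be the selected model and the rest would serve as fallbacks. Because health is applied as a penalty instead of a filter, the list stays non-empty when every eligible provider is unhealthy.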
Runtime Behavior
Every orchestrated agent pipeline now computes model-routing decisions before execution. Agent results carry:
- selected model ID
- selected provider
- routing strategy
- estimated credit cost
- fallback model IDs
- selection rationale
- warnings such as provider health, budget pressure, or context overflow
Security work uses the deterministic audit engine instead of spending model credits. Tool-using agent roles currently route to executable Anthropic tool-loop models; analysis-only routing can rank Gemini and OpenAI candidates for long-context and historical-outcome scenarios.
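As an illustration of the routing metadata carried on agent results, the record below mirrors the fields listed above. The class name, field names, and JSON serialization are assumptions for this sketch; the actual result schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class RoutingDecision:
    # Hypothetical field names, one per item in the list above.
    selected_model_id: str
    selected_provider: str
    routing_strategy: str
    estimated_credit_cost: float
    fallback_model_ids: List[str] = field(default_factory=list)
    selection_rationale: str = ""
    warnings: List[str] = field(default_factory=list)

    def to_audit_json(self) -> str:
        """Serialize with sorted keys so audit records diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values are placeholders, not real model IDs or prices.
decision = RoutingDecision(
    selected_model_id="example-tool-loop-model",
    selected_provider="anthropic",
    routing_strategy="balanced",
    estimated_credit_cost=1.5,
    fallback_model_ids=["example-fallback-model"],
    selection_rationale="only candidate with the tool-loop adapter",
    warnings=["budget pressure"],
)
audit_line = decision.to_audit_json()
```

Keeping the rationale and warnings next to the selected model is what would let a reviewer reconstruct why a given agent ran on a given model.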
Continuous Evaluation
The continuous evaluation CI model-router suite now checks real routing behavior instead of placeholder provider ordering:
- complex tool-loop implementation chooses an executable high-quality model
- large-context architecture work chooses a long-window provider
- historical acceptance outcomes can lift a cheaper accepted reviewer model
- all-unhealthy provider states still preserve a fallback escape route
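The four checks above can be expressed as data-driven fixtures. This is a rough sketch of such a harness under stated assumptions: the request keys, model IDs, and `stub_router` are hypothetical stand-ins, and a real suite would call the actual router in place of the stub.

```python
from typing import Callable, Dict, List, NamedTuple

Request = Dict[str, object]
Router = Callable[[Request], List[str]]  # returns ranked model IDs


class Fixture(NamedTuple):
    name: str
    request: Request
    check: Callable[[List[str]], bool]


# Hypothetical groupings of placeholder model IDs.
TOOL_LOOP_MODELS = {"tool-large"}
LONG_WINDOW_MODELS = {"long-context"}

FIXTURES = [
    Fixture("complex tool-loop implementation",
            {"role": "implementer", "needs_tool_loop": True},
            lambda ranked: bool(ranked) and ranked[0] in TOOL_LOOP_MODELS),
    Fixture("large-context architecture",
            {"role": "architect", "prompt_tokens": 500_000},
            lambda ranked: bool(ranked) and ranked[0] in LONG_WINDOW_MODELS),
    Fixture("accepted cheaper reviewer is lifted",
            {"role": "reviewer", "history": {"small-fast": 0.95}},
            lambda ranked: bool(ranked) and ranked[0] == "small-fast"),
    Fixture("all providers unhealthy keeps an escape route",
            {"role": "reviewer", "all_unhealthy": True},
            lambda ranked: len(ranked) >= 1),
]


def run_suite(router: Router, fixtures: List[Fixture]) -> List[str]:
    """Return the names of fixtures whose check fails."""
    return [fx.name for fx in fixtures if not fx.check(router(fx.request))]


def stub_router(request: Request) -> List[str]:
    """Stand-in router so the sketch runs end to end."""
    if request.get("needs_tool_loop"):
        return ["tool-large"]
    if int(request.get("prompt_tokens", 0)) > 200_000:
        return ["long-context"]
    history = request.get("history") or {}
    if history:
        best = max(sorted(history), key=lambda m: history[m])
        return [best, "tool-large"]
    return ["tool-large"]


failures = run_suite(stub_router, FIXTURES)
```

Each fixture asserts a property of the ranking and not an exact provider order, so weights can change without rewriting the fixtures, provided the behavior still holds.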
Operator Notes
- Keep `MODEL_ROUTER_CANDIDATES` aligned with real provider adapters.
- Do not mark a model as `agent_tool_loop` capable until the runner can execute its tool-call protocol safely.
- Add evaluation fixtures before changing weights or candidate quality baselines.
- Treat cost savings as a win only when acceptance and review quality stay healthy.
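The first two notes could be enforced with a startup or CI check along these lines. This sketch assumes `MODEL_ROUTER_CANDIDATES` holds a JSON list of objects with `model_id`, `provider`, and `agent_tool_loop` keys; that format, the adapter sets, and the function name are assumptions, not documented behavior.

```python
import json
from typing import List

# Assumed adapter inventory, based on the providers named in this document.
KNOWN_ADAPTERS = {"anthropic", "gemini", "openai"}
# Assumed: providers whose tool-call protocol the runner can execute today.
TOOL_LOOP_ADAPTERS = {"anthropic"}


def validate_candidates(raw: str) -> List[str]:
    """Return human-readable problems found in a candidate list."""
    problems = []
    for entry in json.loads(raw):
        model_id = entry.get("model_id", "<missing model_id>")
        provider = entry.get("provider")
        if provider not in KNOWN_ADAPTERS:
            problems.append(f"{model_id}: no adapter for provider {provider!r}")
        elif entry.get("agent_tool_loop") and provider not in TOOL_LOOP_ADAPTERS:
            problems.append(
                f"{model_id}: marked agent_tool_loop, but the runner has no "
                f"tool-loop adapter for {provider!r}"
            )
    return problems


# Placeholder configuration; the model IDs are not real.
example = json.dumps([
    {"model_id": "tool-large", "provider": "anthropic", "agent_tool_loop": True},
    {"model_id": "long-context", "provider": "gemini", "agent_tool_loop": True},
    {"model_id": "mystery", "provider": "unknown-vendor"},
])
problems = validate_candidates(example)
```

Failing the build on a non-empty `problems` list would stop a misconfigured candidate from reaching the router.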