Agent Model Router & Cost Optimizer

ReformCode routes specialist-agent work through a deterministic model router before the agent starts. The router chooses the best executable model for architecture, implementation, review, testing, security, and fix-forward work using quality, latency, cost, context size, provider health, and historical acceptance outcomes.

Why It Exists

Model choice is a product-quality decision, not a preference dropdown. A cheap model can be excellent for a small review but wrong for a risky implementation. A powerful model can be wasteful for deterministic security checks. The router makes that tradeoff explicit and auditable.

Routing Signals

Task fit: Each candidate declares whether it supports architect, implementer, reviewer, tester, security, or fix-forward work.
Execution mode: Tool-using agents are constrained to models with the current agent tool-loop adapter. This prevents routing to a model that looks good on paper but cannot safely execute the workspace loop.
Quality score: Candidate quality is scored per task, with higher weight for complex implementation and fix-forward tasks.
Latency: Median latency and optional latency budgets influence fast-path decisions.
Cost: Estimated input/output token cost is converted into credit estimates and used by economy/balanced routing.
Context size: Large prompts and workspaces route toward long-context models when the task does not require the current tool-loop adapter.
Provider health: Unhealthy providers are demoted rather than excluded, so an escape route remains even if every configured provider is unhealthy.
Historical outcomes: Telemetry can lift models that have produced accepted outcomes for similar agent roles.
Agent governance memory: Accepted/rejected edits, validation failures, fix-forward outcomes, Git proof, deploy certification, white-label decisions, and recipe handoffs can bias risky tasks toward quality and acceptance confidence.
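
The way these signals combine can be sketched as a filter followed by a weighted score. This is a minimal illustration, not the router's actual implementation: the Candidate fields, the WEIGHTS values, UNHEALTHY_PENALTY, and the rank function are all hypothetical names chosen for the example. It also simplifies two things: quality is a single number rather than a per-task score, and context size is treated as a hard filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # All fields are hypothetical; they mirror the signals listed above.
    model_id: str
    roles: frozenset            # roles the candidate declares support for
    agent_tool_loop: bool       # True if the runner can drive its tool loop
    quality: float              # 0..1, higher is better
    latency_ms: float           # median latency
    credits: float              # estimated credit cost for this task
    context_tokens: int         # maximum context window
    healthy: bool = True
    acceptance_rate: float = 0.5  # historical accepted-outcome rate, 0..1


# Hypothetical strategy weights; each row sums to 1.0.
WEIGHTS = {
    "economy": {"quality": 0.3, "latency": 0.1, "cost": 0.5, "history": 0.1},
    "balanced": {"quality": 0.5, "latency": 0.15, "cost": 0.2, "history": 0.15},
}
UNHEALTHY_PENALTY = 0.5  # demotes, but never removes, an unhealthy provider


def _ratio(value, ceiling):
    return value / ceiling if ceiling > 0 else 0.0


def score(c, strategy, max_latency_ms, max_credits):
    w = WEIGHTS[strategy]
    total = (
        w["quality"] * c.quality
        + w["latency"] * (1.0 - _ratio(c.latency_ms, max_latency_ms))
        + w["cost"] * (1.0 - _ratio(c.credits, max_credits))
        + w["history"] * c.acceptance_rate
    )
    return total if c.healthy else total - UNHEALTHY_PENALTY


def rank(candidates, role, needs_tool_loop, prompt_tokens, strategy="balanced"):
    """Return eligible candidates, best first. Ties break on model_id."""
    eligible = [
        c for c in candidates
        if role in c.roles
        and (c.agent_tool_loop or not needs_tool_loop)
        and c.context_tokens >= prompt_tokens
    ]
    if not eligible:
        return []
    max_latency = max(c.latency_ms for c in eligible)
    max_credits = max(c.credits for c in eligible)
    return sorted(
        eligible,
        key=lambda c: (-score(c, strategy, max_latency, max_credits), c.model_id),
    )
```

Because health is applied as a penalty rather than a filter, the ranking stays non-empty when every eligible provider is unhealthy. The first entry would be the selected model and the rest the fallbacks.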

Runtime Behavior

Every orchestrated agent pipeline computes its model-routing decisions before execution. Each agent result carries:

selected model ID
selected provider
routing strategy
estimated credit cost
fallback model IDs
selection rationale
warnings such as provider health, budget pressure, or context overflow
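
One way to picture the routing data attached to an agent result is an immutable record that serializes to JSON for audit logs. The class name, field names, and to_audit_json method below are assumptions made for illustration; the real result schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RoutingDecision:
    # Hypothetical field names, one per item in the list above.
    selected_model_id: str
    selected_provider: str
    routing_strategy: str
    estimated_credit_cost: float
    fallback_model_ids: tuple = ()
    selection_rationale: str = ""
    warnings: tuple = ()

    def to_audit_json(self) -> str:
        # sort_keys keeps the output stable, which makes audit diffs readable.
        return json.dumps(asdict(self), sort_keys=True)


decision = RoutingDecision(
    selected_model_id="model-x",        # placeholder IDs, not real models
    selected_provider="provider-y",
    routing_strategy="balanced",
    estimated_credit_cost=1.5,
    fallback_model_ids=("model-z",),
    selection_rationale="highest score among tool-loop capable candidates",
    warnings=("budget pressure",),
)
print(decision.to_audit_json())
```

Freezing the record means a decision cannot be altered after routing, which fits the goal of keeping the tradeoff auditable.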

Security work uses the deterministic audit engine instead of spending model credits. Tool-using agent roles currently route to executable Anthropic tool-loop models; analysis-only routing can rank Gemini and OpenAI candidates for long-context and historical-outcome scenarios.

When an attached governance-memory report is medium or high risk, the router explicitly records that signal in the routing rationale. High-risk memory adds a warning and shifts weight away from cheap/fast choices toward higher-quality models and accepted-outcome confidence.
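
The weight shift described above could look roughly like the following. It is a sketch under stated assumptions: the weight keys, the RISK_SHIFT fractions, and the apply_memory_risk function are hypothetical, and the actual router may adjust scores differently.

```python
# Hypothetical fraction of cost/latency weight that moves toward
# quality and historical acceptance at each risk level.
RISK_SHIFT = {"low": 0.0, "medium": 0.25, "high": 0.5}


def apply_memory_risk(weights, risk):
    """Return (adjusted_weights, rationale_notes, warnings) for a risk level."""
    fraction = RISK_SHIFT[risk]
    moved = fraction * (weights["cost"] + weights["latency"])
    adjusted = {
        "quality": weights["quality"] + moved / 2,
        "history": weights["history"] + moved / 2,
        "cost": weights["cost"] * (1 - fraction),
        "latency": weights["latency"] * (1 - fraction),
    }
    rationale, warnings = [], []
    if risk in ("medium", "high"):
        # Medium and high risk are both recorded in the rationale.
        rationale.append(f"governance memory risk: {risk}")
    if risk == "high":
        # Only high risk raises a warning.
        warnings.append("high-risk governance memory: favoring quality over cost and latency")
    return adjusted, rationale, warnings


base = {"quality": 0.5, "latency": 0.15, "cost": 0.2, "history": 0.15}
adjusted, rationale, warnings = apply_memory_risk(base, "high")
```

The total weight is unchanged by the shift, so scores remain comparable across risk levels; only the balance between cheap/fast and quality/acceptance moves.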

Continuous Evaluation

The model-router suite in continuous evaluation CI checks real routing behavior rather than placeholder provider ordering:

complex tool-loop implementation chooses an executable high-quality model
large-context architecture work chooses a long-window provider
historical acceptance outcomes can lift a cheaper accepted reviewer model
all-unhealthy provider states still preserve a fallback escape route
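
Checks like these can be expressed as data-driven fixtures that run against any router callable. The sketch below is illustrative only: toy_router is a deliberately simple stand-in, not the real router, and the fixture keys, model IDs, and run_fixtures helper are hypothetical.

```python
def toy_router(candidates):
    """Stand-in router: healthy first, then acceptance rate, then lower cost."""
    ranked = sorted(
        candidates,
        key=lambda c: (not c["healthy"], -c["acceptance_rate"], c["credits"], c["model_id"]),
    )
    return {
        "selected": ranked[0]["model_id"],
        "fallbacks": [c["model_id"] for c in ranked[1:]],
    }


FIXTURES = [
    {
        "name": "history lifts a cheaper accepted reviewer",
        "candidates": [
            {"model_id": "premium-reviewer", "healthy": True, "acceptance_rate": 0.70, "credits": 8.0},
            {"model_id": "cheap-reviewer", "healthy": True, "acceptance_rate": 0.92, "credits": 1.0},
        ],
        "expect_selected": "cheap-reviewer",
        "min_fallbacks": 1,
    },
    {
        "name": "all-unhealthy providers still yield a selection",
        "candidates": [
            {"model_id": "model-a", "healthy": False, "acceptance_rate": 0.5, "credits": 2.0},
            {"model_id": "model-b", "healthy": False, "acceptance_rate": 0.6, "credits": 3.0},
        ],
        "expect_selected": "model-b",
        "min_fallbacks": 1,
    },
]


def run_fixtures(router, fixtures):
    """Return a list of human-readable failures; empty means all passed."""
    failures = []
    for fx in fixtures:
        decision = router(fx["candidates"])
        if decision["selected"] != fx["expect_selected"]:
            failures.append(f"{fx['name']}: selected {decision['selected']}")
        if len(decision["fallbacks"]) < fx.get("min_fallbacks", 0):
            failures.append(f"{fx['name']}: too few fallbacks")
    return failures
```

Keeping expectations in data rather than in test code makes it easier to add a fixture before changing a weight or a quality baseline.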

Operator Notes

Keep MODEL_ROUTER_CANDIDATES aligned with real provider adapters.
Do not mark a model as agent_tool_loop capable until the runner can execute its tool-call protocol safely.
Add evaluation fixtures before changing weights or candidate quality baselines.
Treat cost savings as a win only when acceptance and review quality stay healthy.
Keep governance-memory inputs privacy-safe; never pass raw source, prompts, secrets, or local paths into the router memory contract.
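
A simple way to enforce the privacy note is an allow-list that only lets aggregate counters through. The key names and the to_memory_contract function here are hypothetical; the real memory contract is not specified in this document.

```python
# Hypothetical allow-list: only non-negative integer counters are permitted.
ALLOWED_KEYS = (
    "accepted_edits",
    "rejected_edits",
    "validation_failures",
    "fix_forward_successes",
)


def to_memory_contract(raw):
    """Copy allow-listed counters from raw; silently drop everything else."""
    contract = {}
    for key in ALLOWED_KEYS:
        value = raw.get(key, 0)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer")
        contract[key] = value
    return contract


raw_report = {
    "accepted_edits": 12,
    "rejected_edits": 3,
    "prompt": "full prompt text",       # never reaches the router
    "workspace_path": "/some/local/dir",  # never reaches the router
}
safe = to_memory_contract(raw_report)
```

An allow-list fails closed: a new field added upstream stays out of the router until someone deliberately permits it.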