Agent Matrix & Recommender

The agent matrix is a read-only capability view that joins the agent registry with your local usage statistics. It powers the recommendation badges you see when starting a feature and feeds the automated refresh cycle that keeps pricing and quality scores current.

What the matrix contains

For every (agent, model) pair registered in templates/agents/<id>.json, the matrix tracks:

| Field | Source | Description |
| --- | --- | --- |
| pricing | Registry | Public API cost in $/M tokens (inputPerM, outputPerM) |
| notes.<op> | Registry | Human-readable capability note for each operation |
| score.<op> | Registry | Qualitative score 1–5 for each operation |
| lastRefreshAt | Registry | When this entry was last updated by a refresh agent |
| stats | Local telemetry | Features run, total cost, sessions for this model |

The four operations are: draft, spec_review, implement, review. Scores and notes can differ per operation — a model that excels at implementation may score lower for spec writing.
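As a rough illustration of the join described above, the sketch below merges registry model options with local statistics into one row per (agent, model) pair. It is not Aigon's implementation: the `buildMatrix` function, the `agent.id` property, the `"agentId:modelValue"` lookup key, and the stats property names (`features`, `totalCost`, `sessions`) are all assumptions made for the example.

```javascript
// Hypothetical sketch of the registry + telemetry join. Every name that is
// not a registry field from the table above is an assumption.
function buildMatrix(agents, statsByKey) {
  const rows = [];
  for (const agent of agents) {
    for (const model of agent.cli.modelOptions) {
      const key = `${agent.id}:${model.value}`; // assumed lookup key
      rows.push({
        agent: agent.id,
        model: model.value,
        pricing: model.pricing ?? null,
        notes: model.notes ?? {},
        score: model.score ?? {},
        lastRefreshAt: model.lastRefreshAt ?? null,
        // Cells without telemetry get explicit zeros, never invented values.
        stats: statsByKey[key] ?? { features: 0, totalCost: 0, sessions: 0 },
      });
    }
  }
  return rows;
}

// Example input with made-up agent id and statistics.
const exampleAgents = [
  {
    id: "example-agent",
    cli: {
      modelOptions: [
        { value: "claude-opus-4-7", score: { implement: 5 } },
        { value: "some-model" },
      ],
    },
  },
];
const exampleStats = {
  "example-agent:claude-opus-4-7": { features: 2, totalCost: 1.8, sessions: 3 },
};
console.log(buildMatrix(exampleAgents, exampleStats));
```

The function only reads its inputs, which mirrors the read-only nature of the matrix view.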

Recommendation badges in the start modal

When you open the start modal for a feature, Aigon calls /api/recommendation/:type/:id to rank available agents for the relevant operation. The top candidates are shown alongside the agent/model dropdowns:

  • ✨ Best value — highest score-to-cost ratio
  • ⚡ Fastest — lowest latency / cost for this operation
  • 🎯 Highest quality — highest qualitative score for this operation

Badges are suggestions only. You still pick the agent and model; the recommender never overrides your choice.

Cells with no benchmark data (zero sessions) fall back to qualitative score alone and show confidence: low in the ranking rationale. The recommender never invents numbers to fill empty cells.
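The three badges can be pictured as three independent "best of" selections over the candidate list. The sketch below is one possible reading, not the recommender's actual logic: the candidate shape (`score`, `costPerOp`, `sessions`) and the `pickBadges` helper are hypothetical, and cost per operation stands in for "fastest" because the guide does not say how latency is measured.

```javascript
// Hypothetical candidate shape: { model, score, costPerOp, sessions }.
// Returns the first item with the largest fn(item), or null for an empty list.
function maxBy(items, fn) {
  return items.reduce(
    (best, item) => (best === null || fn(item) > fn(best) ? item : best),
    null
  );
}

function pickBadges(candidates) {
  // Only cells with benchmark data can compete on cost-based badges.
  const measured = candidates.filter((c) => c.sessions > 0 && c.costPerOp > 0);
  return {
    bestValue: maxBy(measured, (c) => c.score / c.costPerOp),
    fastest: maxBy(measured, (c) => -c.costPerOp), // assumption: cost as a proxy
    highestQuality: maxBy(candidates, (c) => c.score),
  };
}

// Made-up candidates for illustration.
const exampleCandidates = [
  { model: "a", score: 5, costPerOp: 0.5, sessions: 4 },
  { model: "b", score: 4, costPerOp: 0.2, sessions: 6 },
  { model: "c", score: 3, costPerOp: null, sessions: 0 },
  { model: "d", score: 2, costPerOp: 0.15, sessions: 2 },
];
const exampleBadges = pickBadges(exampleCandidates);
console.log(exampleBadges.bestValue.model, exampleBadges.fastest.model, exampleBadges.highestQuality.model);
```

In this sketch a cell with zero sessions can still win the quality badge, but it never receives a cost-based badge, since there is no measured cost to compare.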

How ranking works

The score for each (agent, model, operation) triplet is:

score = qualitative_score (1–5) − cost_penalty (normalised $/op)

qualitative_score comes from score.<op> in the agent registry. cost_penalty is derived from actual benchmark session cost in your local telemetry (stats-aggregate.perTriplet). Quarantined models are excluded from ranking by default.
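A minimal sketch of the formula above follows. The guide does not specify how the cost penalty is normalised, so this example assumes division by the highest measured cost among eligible candidates, which yields a penalty between 0 and 1. The cell shape, the `rankCandidates` name, the `includeQuarantined` option, and the `"normal"` confidence label are hypothetical; only `confidence: low` appears in the guide.

```javascript
// Hypothetical cell shape: { model, score: { <op>: 1..5 }, costPerOp, sessions, quarantined }.
function rankCandidates(cells, op, { includeQuarantined = false } = {}) {
  const eligible = cells.filter((c) => includeQuarantined || !c.quarantined);
  const measuredCosts = eligible
    .filter((c) => c.sessions > 0)
    .map((c) => c.costPerOp);
  const maxCost = measuredCosts.length ? Math.max(...measuredCosts) : 0;

  return eligible
    .map((c) => {
      const hasData = c.sessions > 0;
      // Assumed normalisation: cost relative to the most expensive measured cell.
      const costPenalty = hasData && maxCost > 0 ? c.costPerOp / maxCost : 0;
      return {
        model: c.model,
        score: c.score[op] - costPenalty,
        // Zero sessions: qualitative score alone, flagged as low confidence.
        confidence: hasData ? "normal" : "low",
      };
    })
    .sort((a, b) => b.score - a.score);
}

// Made-up cells for illustration.
const exampleCells = [
  { model: "x", score: { implement: 5 }, costPerOp: 2.0, sessions: 3 },
  { model: "y", score: { implement: 4 }, costPerOp: 0.5, sessions: 5 },
  { model: "z", score: { implement: 3 }, costPerOp: null, sessions: 0 },
  { model: "q", score: { implement: 5 }, costPerOp: 0.1, sessions: 2, quarantined: { reason: "example" } },
];
console.log(rankCandidates(exampleCells, "implement"));
```

One consequence of this particular normalisation is that unmeasured cells pay no penalty at all, which is why surfacing the low-confidence flag alongside the score matters.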

Keeping the matrix current

Three built-in recurring templates handle ongoing maintenance:

Weekly benchmark (weekly-agent-matrix-benchmark)

Runs once per ISO week. For each (agent, model) cell that has stale or missing benchmark data, it launches a fresh Brewboard seed reset and runs the canonical implement fixture. Results flow through feature-close into stats.json, which stats-aggregate.js rolls up automatically — no manual step required.
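The guide does not define "stale", so the sketch below assumes one simple rule: a cell is due for a benchmark when it has never been benchmarked or when its last benchmark falls in a different ISO week from the current one. The `lastBenchmarkAt` field and both function names are hypothetical.

```javascript
// Returns an ISO 8601 week key such as "2026-W17", computed in UTC.
function isoWeekKey(date) {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const weekday = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  // The ISO week-numbering year is the year of that week's Thursday.
  d.setUTCDate(d.getUTCDate() + 4 - weekday);
  const yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  const week = Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
  return `${d.getUTCFullYear()}-W${String(week).padStart(2, "0")}`;
}

// Assumed staleness rule: no benchmark yet, or last benchmark in another ISO week.
function needsBenchmark(cell, now) {
  if (!cell.lastBenchmarkAt) return true;
  return isoWeekKey(new Date(cell.lastBenchmarkAt)) !== isoWeekKey(now);
}

const exampleNow = new Date("2026-04-20T10:00:00.000Z");
console.log(isoWeekKey(exampleNow));
console.log(needsBenchmark({ lastBenchmarkAt: "2026-04-19T23:00:00.000Z" }, exampleNow));
```

Keying on the ISO week rather than "seven days ago" means a run on Sunday and another on the following Monday both count, since they belong to different weeks.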

Weekly pricing refresh (weekly-agent-matrix-pricing-refresh)

Runs once per ISO week. Scans vendor pricing pages and release notes, diffs against the current registry, and produces:

  • A patch file at .aigon/matrix-refresh/<YYYY-MM-DD>/proposed.json
  • One aigon feedback-create per change kind (pricing-update, new-model, deprecation, quarantine-candidate)

The refresh agent never mutates the registry directly. You review the feedback item and then apply it:

aigon matrix-apply <feedback-id>
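To make the diff step more concrete, here is a sketch of how registry pricing might be compared with freshly observed vendor pricing and turned into typed changes. The shape of proposed.json is not documented here, so the change objects, the `diffPricing` and `proposedPath` helpers, the choice to treat a model missing from the vendor data as a deprecation, and the use of a UTC date in the path are all assumptions. The quarantine-candidate kind is not modelled.

```javascript
// Builds the dated patch path; using the UTC date is an assumption.
function proposedPath(date) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `.aigon/matrix-refresh/${day}/proposed.json`;
}

// Pure function: produces proposed changes and never mutates the registry data.
function diffPricing(registryPricing, observedPricing) {
  const changes = [];
  for (const [model, observed] of Object.entries(observedPricing)) {
    const current = registryPricing[model];
    if (!current) {
      changes.push({ kind: "new-model", model, pricing: observed });
    } else if (
      current.inputPerM !== observed.inputPerM ||
      current.outputPerM !== observed.outputPerM
    ) {
      changes.push({ kind: "pricing-update", model, from: current, to: observed });
    }
  }
  for (const model of Object.keys(registryPricing)) {
    // Assumption: a model absent from the vendor data is a deprecation candidate.
    if (!(model in observedPricing)) changes.push({ kind: "deprecation", model });
  }
  return changes;
}

// Made-up prices for illustration.
const exampleRegistry = {
  "claude-opus-4-7": { inputPerM: 15.0, outputPerM: 75.0 },
  "old-model": { inputPerM: 1.0, outputPerM: 2.0 },
};
const exampleObserved = {
  "claude-opus-4-7": { inputPerM: 12.0, outputPerM: 75.0 },
  "new-model": { inputPerM: 3.0, outputPerM: 9.0 },
};
console.log(proposedPath(new Date("2026-04-20T10:00:00.000Z")));
console.log(diffPricing(exampleRegistry, exampleObserved));
```

Grouping the resulting changes by `kind` would then give one feedback item per change kind, as described in the list above.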

Quarterly qualitative refresh (quarterly-agent-matrix-qualitative-refresh)

Runs once per ISO quarter. Scans SWE-bench Verified, the Aider polyglot leaderboard, LMArena, and community sources to propose updates to notes.<op> and score.<op>. It uses the same feedback-item and matrix-apply flow as the pricing refresh. The quarterly cadence keeps scores from churning in response to every model update announcement.

Applying a proposed change

When a refresh agent files a feedback item, review it with aigon feedback-triage <id> and then apply:

# Preview without writing
aigon matrix-apply <feedback-id> --dry-run

# Apply to templates/agents/<id>.json
aigon matrix-apply <feedback-id>

# Restart to pick up the change
aigon server restart

matrix-apply only patches the fields in the proposal. Unrelated fields are untouched. See the matrix-apply reference for full details.
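The "patch only what the proposal names" behaviour can be pictured as a recursive merge. The sketch below is an illustration of that idea, not the matrix-apply source: `applyPatch` and `isPlainObject` are hypothetical helpers, and the proposal is assumed to be a partial object with the same nesting as the registry entry.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Returns a patched copy; keys absent from the patch keep their current values.
function applyPatch(entry, patch) {
  const result = { ...entry };
  for (const [key, value] of Object.entries(patch)) {
    const existing = result[key];
    result[key] =
      isPlainObject(value) && isPlainObject(existing)
        ? applyPatch(existing, value) // merge nested objects field by field
        : value; // scalars and arrays are replaced outright
  }
  return result;
}

const exampleEntry = {
  value: "claude-opus-4-7",
  label: "Opus 4.7",
  pricing: { inputPerM: 15.0, outputPerM: 75.0 },
  score: { draft: 4, spec_review: 5, implement: 5, review: 5 },
};
// A made-up proposal that touches a single nested field.
const examplePatch = { pricing: { inputPerM: 12.0 } };
console.log(applyPatch(exampleEntry, examplePatch));
```

Returning a copy rather than editing in place also makes a dry-run preview straightforward in this sketch: compute the patched entry, show the difference, and write nothing.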

Agent registry fields reference

In templates/agents/<id>.json, each entry in cli.modelOptions[] can carry:

{
  "value": "claude-opus-4-7",
  "label": "Opus 4.7",
  "pricing": { "inputPerM": 15.0, "outputPerM": 75.0 },
  "notes": {
    "implement": "Strong on multi-file reasoning and architecture decisions.",
    "review": "Thorough but slow — best for high-stakes reviews."
  },
  "score": { "draft": 4, "spec_review": 5, "implement": 5, "review": 5 },
  "lastRefreshAt": "2026-04-20T10:00:00.000Z"
}
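Before hand-editing an entry like the one above, a small sanity check can catch obvious mistakes. The validator below is a hypothetical helper, not part of Aigon, and its rules are assumptions drawn from this guide: scores are treated as integers from 1 to 5, operations are limited to the four listed earlier, prices must be non-negative numbers, and lastRefreshAt must parse as a date.

```javascript
const OPERATIONS = ["draft", "spec_review", "implement", "review"];

// Returns a list of human-readable problems; an empty list means no issues found.
function validateModelOption(option) {
  const errors = [];
  if (typeof option.value !== "string" || option.value === "") {
    errors.push("value must be a non-empty string");
  }
  for (const [op, score] of Object.entries(option.score ?? {})) {
    if (!OPERATIONS.includes(op)) {
      errors.push(`unknown operation in score: ${op}`);
    } else if (!Number.isInteger(score) || score < 1 || score > 5) {
      // Assumption: scores are whole numbers in the 1 to 5 range.
      errors.push(`score.${op} must be an integer from 1 to 5`);
    }
  }
  for (const field of ["inputPerM", "outputPerM"]) {
    const price = option.pricing?.[field];
    if (price !== undefined && !(typeof price === "number" && price >= 0)) {
      errors.push(`pricing.${field} must be a non-negative number`);
    }
  }
  if (option.lastRefreshAt !== undefined && Number.isNaN(Date.parse(option.lastRefreshAt))) {
    errors.push("lastRefreshAt must be a parseable timestamp");
  }
  return errors;
}

console.log(
  validateModelOption({
    value: "claude-opus-4-7",
    pricing: { inputPerM: 15.0, outputPerM: 75.0 },
    score: { draft: 4, spec_review: 5, implement: 5, review: 5 },
    lastRefreshAt: "2026-04-20T10:00:00.000Z",
  })
);
```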

Models with quality or safety issues can be marked quarantined rather than deleted:

{
  "value": "some-model",
  "quarantined": {
    "reason": "Hallucinated test output in F358",
    "since": "2026-03-15"
  }
}

Quarantined models are excluded from recommendations and shown with a warning in the Settings tab matrix view.
