Council of Advisors — Test Cases & Eval

Should-trigger test cases

  1. “I’m trying to decide between applying for a $50K grant with a 2-week deadline that’s only 60% aligned to our mission, or waiting for a better opportunity next quarter. Convene the council.”
  2. “We’ve been offered a partnership with an edtech company that wants to co-brand our AI literacy curriculum. They’d fund our pilot but their product collects student usage data. I can’t decide. Council?”
  3. “The council — I need perspective. We’re thinking about pivoting from school-based programs to a train-the-trainer model to scale faster. My gut says yes but I want this stress-tested.”
  4. “Get me multiple perspectives on whether we should publicly criticize a competitor nonprofit’s AI literacy program that we think is teaching kids to trust AI uncritically.”
  5. “I keep coming back to the same two options for our fundraising strategy and I don’t love either of them. What would the council say?”
  6. “I feel like I’m missing something obvious about why our Discord community isn’t growing despite consistent posting. Call a council session.”

Should-NOT-trigger test cases

  1. “Draft an acknowledgment letter for the $500 donation from Sarah Chen.”
  2. “What’s the deadline for the MacArthur Foundation’s next grant cycle?”
  3. “Post the AI literacy tip to LinkedIn.”
  4. “How many donors lapsed this month?”
  5. “Fix the error in the social delivery script.”
  6. “Schedule the board meeting reminder for next week.”
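A minimal sketch of how the 12 cases above could be wired into an automated triggering eval. In production the convene/don't-convene decision is made by the agent itself; `should_convene()` below is a keyword stand-in, and the case strings are abbreviated paraphrases of the full prompts, so the harness shape is the point, not the classifier.

```python
# Hypothetical eval harness: should_convene() is an illustrative keyword
# stand-in for the real (agent-side) trigger decision; case strings are
# abbreviated versions of the 12 test prompts above.
SHOULD_TRIGGER = [
    "Grant deadline decision, only 60% mission-aligned. Convene the council.",
    "Edtech partnership trade-off, can't decide. Council?",
    "The council: stress-test our pivot to train-the-trainer.",
    "Get me multiple perspectives on criticizing a competitor's program.",
    "What would the council say about our fundraising options?",
    "Discord community isn't growing. Call a council session.",
]
SHOULD_NOT_TRIGGER = [
    "Draft an acknowledgment letter for the $500 donation.",
    "What's the deadline for the MacArthur Foundation's next grant cycle?",
    "Post the AI literacy tip to LinkedIn.",
    "How many donors lapsed this month?",
    "Fix the error in the social delivery script.",
    "Schedule the board meeting reminder for next week.",
]

def should_convene(request: str) -> bool:
    """Illustrative stand-in for the real trigger classifier."""
    text = request.lower()
    return "council" in text or "perspective" in text

def run_trigger_eval() -> dict:
    """Score on/off behavior across all 12 cases; 12 passes expected."""
    score = {"pass": 0, "fail": 0}
    for req in SHOULD_TRIGGER:
        score["pass" if should_convene(req) else "fail"] += 1
    for req in SHOULD_NOT_TRIGGER:
        score["pass" if not should_convene(req) else "fail"] += 1
    return score
```

Swapping `should_convene()` for a call to the live agent turns this into the real triggering check in the evaluation checklist below.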

Evaluation checklist

  • Triggering: correct on/off behavior for all 12 tests.
  • Queue integration: all 6 advisor seats are enqueued immediately via fast registration rather than being run directly in parallel.
  • Polling behavior: council polls callback files until each seat reaches complete|timeout|error.
  • Cleanup behavior: consumed callback files are removed after ingest.
  • Runtime envelope: the full six-seat sequence typically completes in roughly 3–4 minutes (acceptable) without saturating the queue or hitting timeouts.
  • Distinctness: six advisors produce genuinely distinct perspectives, not minor variants.
  • Magnus quality: surfaces a real implementation gap, not mere restatement.
  • Dante quality: presents a substantive adversarial case, not only clarifying questions.
  • Flow integrity: all 5 steps executed in order.
  • Cross-pollination: max one round; only when sharp conflict exists.
  • Synthesis quality: explicit decision, what shifted, and what was set aside.
  • Disagreement clarity: Burt explicitly states where and why he disagrees with at least one advisor when applicable.
  • Constraint compliance: no council for routine/deterministic/time-critical emergency paths.
  • Persistence: record saved to ./data/council/YYYY-MM-DD-[topic-slug].md and retrievable.
  • Readability: the complete session output should be scannable in five minutes or less.
  • Delivery: the full record goes directly to Burt; the 100-word synthesis is routed per the channel matrix.
  • Operationality: recommended next actions are executable within 24 hours.