Case study

AegisFlow: multi-agent operational decision simulation

A June 2026 project that models operational decision workflows with specialist agents, critic review, deterministic state mutation, and audit trails.

Context

AegisFlow explores a harder applied-AI problem than a single assistant response: coordinating multiple role-specific agents in operational workflows where state, safety, and auditability matter.

Problem

Operational decision simulations need role separation, conflict resolution, and traceable state changes. A useful system has to show how a recommendation was proposed, reviewed, resolved, and committed.

My role

I built the FastAPI backend, Next.js dashboard, orchestration and evaluation packages, shared schemas, deterministic state engine, model modes, Docker setup, and initial tests.

Constraints

The system had to remain usable without live model credentials, keep state transitions deterministic, and represent several domains without claiming real-world operational or clinical validation.

Architecture

The repo uses a FastAPI backend for session lifecycle and orchestration routing, a Next.js dashboard, shared Pydantic models, an orchestration package, an evaluation package, and a deterministic state engine.

Workflow graph

Each simulation cycle runs through context preparation, parallel specialist proposals, critic review, consensus resolution, state mutation, and commit logging. The implementation records execution steps and appends events to the session trail.
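As an illustration only, the cycle above can be sketched with asyncio and standard-library dataclasses. Every name here (Proposal, Session, specialist, critic, resolve, mutate, run_cycle) and the toy restock rule are assumptions made for the sketch, not the repository's actual code.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proposal:
    agent: str
    action: str
    confidence: float  # hypothetical 0..1 score, filled in by the critic


@dataclass
class Session:
    state: dict
    trail: list = field(default_factory=list)  # append-only event log

    def log(self, step: str, detail) -> None:
        self.trail.append({"step": step, "detail": detail})


async def specialist(name: str, context: dict) -> Proposal:
    # Stand-in for a model call; a real agent would await a model client here.
    await asyncio.sleep(0)
    action = "restock" if context["stock"] < context["threshold"] else "hold"
    return Proposal(agent=name, action=action, confidence=0.0)


def critic(p: Proposal) -> Proposal:
    # Toy review rule: "hold" is treated as lower risk than "restock".
    return Proposal(p.agent, p.action, 0.9 if p.action == "hold" else 0.7)


def resolve(proposals: list) -> Proposal:
    # Consensus here is simply "highest confidence wins"; ties go to the first.
    return max(proposals, key=lambda p: p.confidence)


def mutate(state: dict, command: Proposal) -> dict:
    # Deterministic rule table: same state and command give the same result.
    delta = {"restock": 10, "hold": 0}[command.action]
    return {**state, "stock": state["stock"] + delta}


async def run_cycle(session: Session, agents: list) -> None:
    context = dict(session.state)
    session.log("context", context)
    # Specialists run concurrently; gather preserves the order of `agents`.
    proposals = await asyncio.gather(*(specialist(a, context) for a in agents))
    session.log("proposals", [p.action for p in proposals])
    reviewed = [critic(p) for p in proposals]
    session.log("critic", [p.confidence for p in reviewed])
    command = resolve(reviewed)
    session.log("consensus", command.action)
    session.state = mutate(session.state, command)
    session.log("commit", dict(session.state))
```

Calling asyncio.run(run_cycle(session, ["logistics", "finance"])) on a Session leaves one trail entry per stage, so a reviewer can read how the committed state was reached.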

Safety and fallback behavior

The critic layer assigns confidence and safety flags before the resolved command is applied. Without a Gemini key, the system runs in sandbox mode and returns sandbox responses rather than pretending live model calls succeeded.
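A minimal sketch of this behavior, under stated assumptions: the environment variable name GEMINI_API_KEY, the ModelMode enum, the blocked-term list, and the scoring numbers are all hypothetical and chosen only to make the idea concrete. The live client is deliberately left out.

```python
import os
from dataclasses import dataclass
from enum import Enum


class ModelMode(str, Enum):
    LIVE = "live"
    SANDBOX = "sandbox"


@dataclass(frozen=True)
class ModelReply:
    mode: ModelMode
    text: str


@dataclass(frozen=True)
class Review:
    confidence: float
    safety_flags: tuple


def select_mode(env=None) -> ModelMode:
    # Assumed variable name; the mode is decided once, up front, from config.
    env = os.environ if env is None else env
    return ModelMode.LIVE if env.get("GEMINI_API_KEY") else ModelMode.SANDBOX


def propose(prompt: str, mode: ModelMode) -> ModelReply:
    if mode is ModelMode.SANDBOX:
        # The reply carries its mode, so the trail never claims it was live.
        return ModelReply(mode, f"[sandbox] no-op recommendation for: {prompt}")
    raise NotImplementedError("live client omitted from this sketch")


def review(reply: ModelReply, blocked_terms=("shutdown", "override")) -> Review:
    # Flag any blocked term, then lower confidence by a fixed step per flag.
    flags = tuple(t for t in blocked_terms if t in reply.text.lower())
    base = 0.5 if reply.mode is ModelMode.SANDBOX else 0.75
    return Review(confidence=max(0.0, base - 0.25 * len(flags)), safety_flags=flags)
```

Because the mode travels with every reply, downstream code can record or display it instead of inferring it from missing credentials.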

Important decisions

I separated probabilistic proposals from deterministic state mutation, used shared typed schemas across service boundaries, and made sandbox execution an explicit mode rather than a silent fallback.
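To make the first of these decisions concrete, here is a hedged sketch of a deterministic commit step. State, Command, apply_command, replay, and the two rules are invented for illustration; the real state engine's schema and rules are not shown in this case study.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    backlog: int
    staff: int


@dataclass(frozen=True)
class Command:
    kind: str
    amount: int


def apply_command(state: State, cmd: Command) -> State:
    """Pure function: no randomness, clock, or I/O, so results are repeatable."""
    if cmd.kind == "add_staff":
        return replace(state, staff=state.staff + cmd.amount)
    if cmd.kind == "clear_backlog":
        # Hypothetical capacity rule: each staff member clears at most 2 items.
        cleared = min(state.backlog, cmd.amount, state.staff * 2)
        return replace(state, backlog=state.backlog - cleared)
    # Unknown commands are rejected rather than interpreted.
    raise ValueError(f"unsupported command: {cmd.kind}")


def replay(initial: State, commands: list) -> State:
    # Re-applying a recorded command list reproduces the final state exactly.
    state = initial
    for cmd in commands:
        state = apply_command(state, cmd)
    return state
```

Agents can propose anything, but only commands the rule layer recognizes change state, which is what makes a recorded session replayable.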

Tradeoffs

A rule-based commit layer limits agent autonomy, but it makes simulations reproducible and reviewable. Supporting three domains demonstrates reuse while increasing the testing surface for domain-specific rules.

Testing evidence

The repository includes unit coverage for state transitions and confidence scoring, including improved, worsened, and safety-flagged decision paths.
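The three paths named above could be exercised with tests shaped roughly like the following. This is a sketch, not the repository's test suite: Outcome, score_confidence, and the score values are hypothetical stand-ins for the real scoring code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    risk_before: int
    risk_after: int
    safety_flagged: bool


def score_confidence(outcome: Outcome) -> float:
    # Hypothetical scoring rule: a safety flag overrides everything else.
    if outcome.safety_flagged:
        return 0.0
    if outcome.risk_after < outcome.risk_before:
        return 0.75
    if outcome.risk_after > outcome.risk_before:
        return 0.25
    return 0.5


def test_improved_path():
    assert score_confidence(Outcome(5, 3, False)) == 0.75


def test_worsened_path():
    assert score_confidence(Outcome(3, 5, False)) == 0.25


def test_safety_flagged_path():
    # Even an apparent improvement scores zero once a safety flag is set.
    assert score_confidence(Outcome(5, 3, True)) == 0.0
```

Written this way, the functions are discoverable by pytest and need no model credentials to run.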

Outcome

AegisFlow is currently the strongest evidence for my positioning: full-stack applied AI work where orchestration, explicit state changes, safety review, and audit trails are treated as product requirements.

Current limitations

The project is a production-oriented simulation, not a validated operational system. Route-level API coverage, stable portfolio screenshots, and a clearer live-versus-sandbox comparison remain unfinished.

What I would improve next

The next steps are validation work, portfolio screenshots, and deeper tests:

Add screenshots from the live dashboard into the portfolio once stable sample states are selected.
Add route-level API tests around session lifecycle and orchestration endpoints.
Expose a clearer comparison between live Gemini execution and sandbox mode behavior in the UI.