Case study

AegisFlow: multi-agent operational decision simulation

A June 2026 project that models operational decision workflows with specialist agents, critic review, deterministic state mutation, and audit trails.

01

Context

AegisFlow explores a harder applied-AI problem than a single assistant response: coordinating multiple role-specific agents in operational workflows where state, safety, and auditability matter.

02

Problem

Operational decision simulations need role separation, conflict resolution, and traceable state changes. A useful system has to show how a recommendation was proposed, reviewed, resolved, and committed.

03

Architecture

The repo uses a FastAPI backend for session lifecycle and orchestration routing, a Next.js dashboard, shared Pydantic models, an orchestration package, an evaluation package, and a deterministic state engine.

04

Workflow graph

Each simulation cycle runs through context preparation, parallel specialist proposals, critic review, consensus resolution, state mutation, and commit logging. The implementation records execution steps and appends events to the session trail.
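As a rough illustration of that cycle, and not the repository's actual code, the six stages can be sketched in plain Python. The `Proposal` and `Session` types, the single `stability` metric, the critic rule, and the event names are all hypothetical stand-ins chosen for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Proposal:
    agent: str    # hypothetical specialist name
    command: str  # hypothetical command label
    delta: int    # assumed effect on a single "stability" metric


@dataclass
class Session:
    state: dict = field(default_factory=lambda: {"stability": 50})
    trail: list = field(default_factory=list)  # append-only event list

    def log(self, step: str, **detail) -> None:
        self.trail.append({"step": step, **detail})


Specialist = Callable[[dict], Proposal]


def run_cycle(session: Session, specialists: List[Specialist]) -> Optional[Proposal]:
    # 1. Context preparation: snapshot the state the agents will see.
    context = dict(session.state)
    session.log("context_prepared", context=context)

    # 2. Parallel specialist proposals; pool.map keeps input order,
    #    so the result list is deterministic.
    with ThreadPoolExecutor() as pool:
        proposals = list(pool.map(lambda s: s(context), specialists))
    session.log("proposals_collected", count=len(proposals))

    # 3. Critic review: an assumed rule that rejects large swings.
    approved = [p for p in proposals if abs(p.delta) <= 10]
    session.log("critic_reviewed", approved=[p.agent for p in approved])

    # 4. Consensus resolution: highest delta wins, ties broken by agent name.
    winner = max(approved, key=lambda p: (p.delta, p.agent)) if approved else None
    session.log("consensus_resolved", winner=winner.agent if winner else None)

    # 5. State mutation: applied only for an approved command.
    if winner is not None:
        session.state["stability"] += winner.delta
    session.log("state_mutated", state=dict(session.state))

    # 6. Commit logging: record what was finally applied.
    session.log("committed", command=winner.command if winner else None)
    return winner
```

Calling `run_cycle(Session(), [...])` with a few lambda specialists returns the winning proposal and leaves one event per stage in `session.trail`, which is the property the audit trail depends on.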

05

Safety and fallback behavior

The critic layer assigns a confidence score and safety flags before the resolved command is applied. Without a Gemini API key, the system falls back to sandbox responses instead of pretending that live model calls succeeded.
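A minimal sketch of both behaviors, under stated assumptions: the `GEMINI_API_KEY` variable name, the `risk` input, the 0.7 threshold, and the `Review` and `ModelReply` types are hypothetical and are not taken from the repository. The point is only that the reply carries its own `source`, so a sandbox answer can never be mistaken for a live one.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Review:
    confidence: float
    safety_flags: Tuple[str, ...]


@dataclass(frozen=True)
class ModelReply:
    text: str
    source: str  # "live" or "sandbox"


def review_command(command: str, risk: float) -> Review:
    """Hypothetical critic: confidence falls as risk rises, and
    commands at or above an assumed 0.7 risk threshold get flagged."""
    confidence = round(max(0.0, 1.0 - risk), 2)
    flags = ("high_risk",) if risk >= 0.7 else ()
    return Review(confidence=confidence, safety_flags=flags)


def ask_model(prompt: str, env: Optional[Mapping[str, str]] = None) -> ModelReply:
    """Return a sandbox reply when no key is configured.
    GEMINI_API_KEY is an assumed variable name."""
    env = os.environ if env is None else env
    if not env.get("GEMINI_API_KEY"):
        return ModelReply(text=f"[sandbox] canned reply for: {prompt}", source="sandbox")
    # The live call is deliberately left out of this sketch.
    raise NotImplementedError("live model call omitted from this sketch")
```

Passing `env` explicitly keeps the fallback path easy to exercise in tests without touching the real process environment.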

06

Testing evidence

The repository includes unit tests for state transitions and confidence scoring, covering the improved, worsened, and safety-flagged decision paths.
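The shape of such tests might look like the following pytest-style sketch. `apply_command`, `classify_outcome`, and the `stability` field are hypothetical names invented here, not the repository's actual functions or test suite.

```python
def apply_command(state: dict, delta: int) -> dict:
    """Deterministic transition: returns a new state, leaves the input
    untouched, and clamps the assumed metric to the range 0..100."""
    new_state = dict(state)
    new_state["stability"] = max(0, min(100, state["stability"] + delta))
    return new_state


def classify_outcome(before: dict, after: dict, safety_flags: tuple) -> str:
    """Label a decision path; a safety flag takes priority over the metric."""
    if safety_flags:
        return "safety_flagged"
    if after["stability"] > before["stability"]:
        return "improved"
    if after["stability"] < before["stability"]:
        return "worsened"
    return "unchanged"


def test_improved_path():
    before = {"stability": 50}
    after = apply_command(before, 5)
    assert after == {"stability": 55}
    assert before == {"stability": 50}  # input state was not mutated
    assert classify_outcome(before, after, ()) == "improved"


def test_worsened_path():
    before = {"stability": 50}
    after = apply_command(before, -20)
    assert classify_outcome(before, after, ()) == "worsened"


def test_safety_flagged_path():
    before = {"stability": 50}
    after = apply_command(before, 5)
    # Even a numerically better state is reported as flagged.
    assert classify_outcome(before, after, ("high_risk",)) == "safety_flagged"
```

Keeping the transition a pure function is what makes each path checkable with a single assertion on the returned state.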

07

Result

AegisFlow is the strongest current evidence of my positioning: full-stack applied AI work where orchestration, explicit state changes, safety review, and audit trails are treated as product requirements.

What I would improve next

Next steps focus on validation, screenshots, and deeper tests:

  • Add screenshots from the live dashboard into the portfolio once stable sample states are selected.
  • Add route-level API tests around session lifecycle and orchestration endpoints.
  • Expose a clearer comparison between live Gemini execution and sandbox fallback behavior in the UI.