Document Intake Agent
Production LangGraph agent with a deterministic test suite: the testing discipline AI agents need
Problem
AI agents are non-deterministic, which makes them hard to test — yet untested agents can't be trusted in production. The hard part of agent work is proving the workflow behaves correctly under controlled conditions.
Solution
A multi-node LangGraph graph (ingest → extract → validate/route → summarize) with a conditional retry loop, fronted by a FastAPI demo. The model is swapped for a scripted fake in tests — making the whole suite deterministic and free — while a real Claude model powers the live, PIN-gated demo.
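The retry loop is easiest to see with the framework stripped away. The following is a minimal plain-Python sketch of the same control flow, not the project's LangGraph code: the node functions, the `route` function standing in for a conditional edge, the `REQUIRED_FIELDS` schema, and the `MAX_ATTEMPTS` budget are all assumptions made for illustration.

```python
import json
from typing import Callable

MAX_ATTEMPTS = 3  # assumed retry budget; the real limit is not stated
REQUIRED_FIELDS = ("vendor", "total")  # hypothetical schema

State = dict  # shared state passed from node to node


def ingest(state: State) -> State:
    # Normalize the raw document text and start the attempt counter.
    return {**state, "text": state["raw"].strip(), "attempts": 0}


def extract(state: State, model: Callable[[str], str]) -> State:
    # Ask the injected model for JSON fields and count the attempt.
    reply = model(f"Extract {REQUIRED_FIELDS} as JSON from: {state['text']}")
    return {**state, "reply": reply, "attempts": state["attempts"] + 1}


def validate(state: State) -> State:
    # Valid only if the reply parses as a JSON object with every field.
    try:
        fields = json.loads(state["reply"])
        ok = isinstance(fields, dict) and all(k in fields for k in REQUIRED_FIELDS)
    except json.JSONDecodeError:
        fields, ok = None, False
    return {**state, "fields": fields if ok else None, "valid": ok}


def route(state: State) -> str:
    # Conditional edge: summarize on success, retry while budget remains.
    if state["valid"]:
        return "summarize"
    return "extract" if state["attempts"] < MAX_ATTEMPTS else "give_up"


def summarize(state: State) -> State:
    f = state["fields"]
    return {**state, "summary": f"{f['vendor']}: {f['total']}"}


def run(raw: str, model: Callable[[str], str]) -> State:
    state = ingest({"raw": raw})
    while True:
        state = validate(extract(state, model))
        step = route(state)
        if step == "summarize":
            return summarize(state)
        if step == "give_up":
            return {**state, "summary": None}


# Scripted stand-in model: the first reply is malformed, the second is valid.
replies = iter(["not json", '{"vendor": "Acme", "total": 42.5}'])
result = run("  Invoice from Acme, total 42.50  ", lambda prompt: next(replies))
print(result["attempts"], result["summary"])  # prints "2 Acme: 42.5"
```

Because the model is just a callable passed into `run`, the same loop can be exercised with a scripted reply sequence, which is what makes the retry path reproducible.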
Key Features
- Multimodal input — reads photos, scans (even handwriting), and PDFs via Claude vision
- LangGraph graph with a conditional retry edge (re-extract until schema-valid)
- Node-level pytest unit tests with mocked model state
- Graph-level integration tests with controlled inputs
- LLM mocking via a scripted fake model for deterministic, zero-cost tests
- Pydantic structured-output validation
- Regression tests that name the failure mode each one guards
- Playwright E2E test against the live FastAPI UI
- PIN-gated live Claude mode with a daily cost cap
- Containerized and deployed to GCP Cloud Run
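As an illustration of the PIN gate and daily cost cap, here is a small sketch using only the standard library. It is not the project's FastAPI code: the `DailyCap` class, the `choose_mode` helper, the dollar-based budget, and the fallback to a scripted mode when the gate fails are all assumptions.

```python
import hmac
from datetime import date
from typing import Optional


class DailyCap:
    """In-memory spend tracker that resets when the calendar day changes."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.day: Optional[date] = None
        self.spent_usd = 0.0

    def try_spend(self, cost_usd: float, today: Optional[date] = None) -> bool:
        today = today or date.today()
        if today != self.day:  # first call of a new day: reset the counter
            self.day, self.spent_usd = today, 0.0
        if self.spent_usd + cost_usd > self.limit_usd:
            return False  # over budget: refuse without recording the spend
        self.spent_usd += cost_usd
        return True


def pin_ok(submitted: str, expected: str) -> bool:
    # Constant-time comparison avoids leaking the PIN through timing.
    return hmac.compare_digest(submitted.encode(), expected.encode())


def choose_mode(submitted_pin: str, expected_pin: str, cap: DailyCap,
                est_cost_usd: float, today: Optional[date] = None) -> str:
    # Live mode needs both a correct PIN and remaining budget (assumed policy).
    if pin_ok(submitted_pin, expected_pin) and cap.try_spend(est_cost_usd, today):
        return "live"
    return "scripted"
```

Passing `today` explicitly keeps the day rollover testable without patching the clock. A counter held in process memory is lost on restart, so this approach only holds when a single process serves all requests.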
Tech Stack
Architecture
LangGraph StateGraph with a dependency-injected LLM interface: production passes a real Claude client, tests pass a scripted FakeLLM, and the graph is unchanged. FastAPI front door with PIN-gated live mode and an in-memory daily cap. Deployed on Cloud Run (single instance) with secrets in Secret Manager. 96% test coverage; tiered suite (unit/graph by default, Playwright + real-model integration opt-in).
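The dependency-injection idea can be sketched roughly as follows. The `LLM` protocol, its `complete` method, the `summarize_node` function, and the test name are hypothetical; the project's actual interface is not shown in this write-up, and a real Claude client would need a thin adapter exposing the same method.

```python
from typing import Protocol


class LLM(Protocol):
    """Hypothetical interface: anything with complete() can drive a node."""

    def complete(self, prompt: str) -> str: ...


class FakeLLM:
    """Scripted model: returns canned replies in order and records prompts."""

    def __init__(self, replies: list[str]):
        self._replies = list(replies)
        self.prompts: list[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self._replies.pop(0)


def summarize_node(state: dict, llm: LLM) -> dict:
    # The node depends only on the interface, never on a concrete client.
    summary = llm.complete(f"Summarize in one sentence: {state['text']}")
    return {**state, "summary": summary.strip()}


def test_summary_whitespace_is_stripped():
    # Guards against: stray model whitespace leaking into the stored summary.
    llm = FakeLLM(["  Invoice from Acme for $42.50.\n"])
    state = summarize_node({"text": "raw invoice text"}, llm)
    assert state["summary"] == "Invoice from Acme for $42.50."
    assert len(llm.prompts) == 1  # exactly one model call, no hidden retries


# pytest would collect the test by name; calling it directly also works.
test_summary_whitespace_is_stripped()
```

Recording prompts on the fake lets a test assert on how many model calls a node made as well as on what it returned.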
Metrics
My Role
Sole developer. Scoped the project to mirror a real AI/agent-testing role, built the graph and the full test apparatus with AI assistance, and deployed it live with cost guardrails.