Staff Software Engineer — Distributed Systems · Backend Platforms · Agentic AI
15+ years architecting mission-critical infrastructure for regulated financial markets — now building production-grade agentic AI systems with the same engineering discipline.
Staff Software Engineer with 15+ years owning distributed systems and backend platforms for regulated financial markets. At the Municipal Securities Rulemaking Board, I architected and modernized the infrastructure behind regulatory reporting for a $4T municipal securities market — systems supporting $55M+ in annual revenue and the integrity of 70K+ daily trade reports.
Today I focus on the next layer of the stack: production-grade agentic AI. I bring the same discipline — distributed systems rigor, evaluation-driven reliability, observability — to LLM platforms and multi-agent systems, the part of the stack where most prototypes break.
Mission-critical work for U.S. municipal securities infrastructure (not public — summarized here for context):
- Identity architecture across 15+ legacy .NET and modernized systems — designed a hybrid identity bridge that preserved backward compatibility, eliminated $250K+ in projected refactoring, and held sub-second authentication latency for market-data services.
- Cloud performance — established Lambda initialization and execution standards that improved serverless function performance 50–70% across distributed workflows.
- Data platform modernization — re-architected the MDM entity resolution platform's processing and queue strategy for a 40% throughput increase, unblocking downstream ML workflows.
- Legacy modernization — led the RTRS transaction-reporting system from a Java monolith to a scalable .NET architecture, with multi-layer validation safeguarding 70K+ daily trades / $1.1T+ in annual par value.
- Quality architecture — replaced mock-based testing with a PostgreSQL + Alembic integration-testing framework for production data pipelines.
Open-source systems from my independent AI engineering practice — production patterns, not demos.
Autonomous multi-agent system on Google ADK for repository analysis — built to compress developer onboarding and architectural discovery. In client work, this class of system accelerated architectural discovery by ~70%, cutting onboarding from days to hours.
The engineering problem: LLMs hallucinate on multi-file repository analysis because naive context stuffing overwhelms the window and loses execution flow. Solved with a hierarchical agent design — a root orchestrator delegates to specialized sub-agents (Architecture Agent, File Summarizer Agent), each with bounded responsibility and tool access.
Staff-level decisions:
- `FakeGithubclient` injection for deterministic, CI-stable integration tests; live API calls don't belong in CI
- Threshold-based evaluations via `test_config.json`: agent output scored against configurable minimums, not eyeballed
- Automated deployment pipeline to Vertex AI via GitHub Actions
- Context compaction strategy to prevent agent drift across multi-turn analysis
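The first two decisions above can be sketched together in a few lines of pytest-style Python. This is a minimal illustration, not the project's code: `FakeRepoClient`, `summarize_repo`, `score`, and the `min_score` key are hypothetical names, and the config is inlined as a JSON string standing in for `test_config.json`.

```python
import json


class FakeRepoClient:
    """Hypothetical in-memory stand-in for a GitHub API client."""

    def __init__(self, files: dict[str, str]):
        self._files = files

    def list_files(self, repo: str) -> list[str]:
        return sorted(self._files)

    def read_file(self, repo: str, path: str) -> str:
        return self._files[path]


def summarize_repo(client, repo: str) -> dict:
    """Toy analysis step; a real agent would call an LLM with these inputs."""
    paths = client.list_files(repo)
    return {
        "file_count": len(paths),
        "python_files": [p for p in paths if p.endswith(".py")],
    }


def score(summary: dict, expected: dict) -> float:
    """Fraction of the expected Python files that the summary reported."""
    want = set(expected["python_files"])
    if not want:
        return 1.0
    return len(set(summary["python_files"]) & want) / len(want)


# Stand-in for the contents of test_config.json (the key name is an assumption).
TEST_CONFIG = json.loads('{"min_score": 0.8}')


def test_summary_meets_threshold():
    # The fake is injected where a live client would be, so no network is used.
    client = FakeRepoClient({"app/main.py": "print('hi')", "README.md": "# demo"})
    summary = summarize_repo(client, "org/repo")
    expected = {"python_files": ["app/main.py"]}
    assert score(summary, expected) >= TEST_CONFIG["min_score"]
```

Because the client is passed in rather than constructed inside `summarize_repo`, the same function can run against a real API in production and against fixed fixtures in CI.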
Stack: Python · Google ADK · Vertex AI · pytest · GitHub Actions
RAG system for querying internal ADRs, runbooks, and structured metadata — institutional knowledge that usually lives in Confluence and dies there.
The engineering problem: Enterprise docs are too large for direct injection and too heterogeneous for naive chunking. Solved with semantic chunking (LangChain SemanticChunker) and a hybrid retrieval path combining semantic vector search with exact-match SQL filtering for structured metadata.
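As a rough illustration of the hybrid path, the sketch below filters candidates by exact-match metadata in SQLite and then ranks the survivors by cosine similarity. It is a simplification under stated assumptions: the table schema, the `hybrid_search` helper, and the toy two-dimensional vectors are hypothetical, and plain Python stands in for ChromaDB and the embedding model.

```python
import math
import sqlite3


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_metadata_db(rows: list[tuple[str, str, str]]) -> sqlite3.Connection:
    """Create an in-memory table of structured document metadata."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (doc_id TEXT PRIMARY KEY, doc_type TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO docs VALUES (?, ?, ?)", rows)
    return conn


def hybrid_search(conn, vectors, query_vec, doc_type, top_k=2):
    """Exact-match SQL filter first, then semantic ranking of what remains."""
    cursor = conn.execute("SELECT doc_id FROM docs WHERE doc_type = ?", (doc_type,))
    allowed = [row[0] for row in cursor]
    ranked = sorted(allowed, key=lambda d: cosine(vectors[d], query_vec), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    conn = build_metadata_db([
        ("adr-001", "adr", "accepted"),
        ("adr-002", "adr", "superseded"),
        ("rb-001", "runbook", "active"),
    ])
    vectors = {"adr-001": [1.0, 0.0], "adr-002": [0.6, 0.8], "rb-001": [1.0, 0.1]}
    print(hybrid_search(conn, vectors, [1.0, 0.0], "adr"))
```

Filtering before ranking is only one way to combine the two signals; an agent-routed design can instead choose per query whether to hit the SQL path, the vector path, or both.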
Staff-level decisions:
- MD5 document hashing for idempotent vector insertion — reruns don't accumulate duplicates that degrade retrieval quality
- Hybrid retrieval: structured metadata joins from SQLite alongside semantic similarity from ChromaDB, routed by a LangGraph agent
- LangSmith tracing and golden-set evaluations with minimum score thresholds enforced in CI
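The idempotent-insertion idea can be shown with a content hash used as the record ID. This is a minimal sketch: `InMemoryVectorStore`, `ingest`, and the `embed` callable are hypothetical stand-ins, not ChromaDB's API.

```python
import hashlib
from typing import Callable


def content_id(text: str) -> str:
    """Derive a stable ID from chunk content; MD5 is used for identity, not security."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class InMemoryVectorStore:
    """Hypothetical store keyed by ID; upserting an existing ID overwrites it."""

    def __init__(self):
        self.records: dict[str, tuple[str, list[float]]] = {}

    def upsert(self, record_id: str, text: str, embedding: list[float]) -> None:
        self.records[record_id] = (text, embedding)


def ingest(store, chunks: list[str], embed: Callable[[str], list[float]]) -> int:
    """Upsert every chunk under its content hash and return the store size."""
    for chunk in chunks:
        store.upsert(content_id(chunk), chunk, embed(chunk))
    return len(store.records)
```

Running `ingest` twice over the same chunks leaves the store the same size, because identical content always maps to the same ID.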
Stack: Python · LangChain · LangGraph · OpenAI · ChromaDB · SQLite · LangSmith
Sage AI (private project — sample & screenshots)
Full-stack production RAG pipeline with source-grounded generation — React frontend, FastAPI backend, vector-store retrieval.
Engineering highlight: Sentence Window Parsing for ingestion. Each sentence is embedded on its own, while its surrounding sentences are stored alongside it and handed to the model at generation time, so retrieval stays precise without starving the answer of context.
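A plain-Python sketch of the sentence-window idea follows. It does not use LlamaIndex; the regex splitter, the `build_window_nodes` helper, and the `embed_text` and `window` field names are assumptions chosen for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_window_nodes(text: str, window_size: int = 1) -> list[dict]:
    """One node per sentence: embed the sentence alone, keep neighbours as context."""
    sentences = split_sentences(text)
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        nodes.append({
            "embed_text": sentence,                    # what gets embedded
            "window": " ".join(sentences[lo:hi]),      # what the LLM would see
        })
    return nodes
```

At query time, similarity is computed against `embed_text`, and the matching node's `window` is what gets passed into the prompt.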
Stack: React · FastAPI · LlamaIndex · Pinecone · OpenAI GPT-4o-mini
Bringing distributed-systems discipline to non-deterministic AI:
- Evaluation first, not last. Threshold-based golden-set evaluations (LangSmith, pytest) defined before the first agent is wired up — not bolted on after.
- Deterministic testing in non-deterministic systems. Fake client injection and tool-response mocking make LLM agent tests reproducible in CI without live API calls.
- Idempotency at the data layer. MD5 document hashing for vector upserts keeps pipeline reruns from accumulating duplicate vectors that skew retrieval and feed hallucinations.
- Context window as a constraint. Explicit memory compaction strategies prevent agent drift across long-horizon tasks.
- Architecture over syntax. Structural code boundaries that keep systems maintainable as LLM APIs evolve.
- Python Decorators as Architectural Boundaries: Decorators as structural middleware for cross-cutting concerns, demonstrated with a self-healing `@retry` pattern designed for the failure modes of LLM and network APIs.
- Developing curriculum on production multi-agent system design: evaluation-driven development, context management, and testing patterns for non-deterministic systems.
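For readers unfamiliar with the retry-decorator pattern mentioned above, here is a minimal sketch using exponential backoff with jitter. It is an assumption about what such a decorator might look like, not the published implementation; the parameter names and the injectable `sleep` are illustrative choices.

```python
import functools
import random
import time


def retry(max_attempts=3, base_delay=0.5,
          retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Retry the wrapped call on transient errors, backing off between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the original error
                    delay = base_delay * 2 ** (attempt - 1)
                    # Jitter spreads out retries from concurrent callers.
                    sleep(delay + random.uniform(0, delay / 2))
        return wrapper
    return decorator
```

Passing `sleep` as a parameter keeps the decorator testable: a test can inject a recorder such as `delays.append` and assert on the backoff schedule without actually waiting.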
| Domain | Technologies |
|---|---|
| Distributed Systems & Backend | Event-driven architecture, API design, data pipelines, platform modernization |
| Cloud Infrastructure | AWS (Lambda, API Gateway, SNS/SQS, EventBridge, Bedrock, CDK/SAM) · GCP (Vertex AI, Cloud Run) |
| Agentic AI | Google ADK, LangGraph, LangChain, LlamaIndex, multi-agent orchestration |
| LLM Platforms & RAG | OpenAI, Vertex AI, AWS Bedrock, semantic + hybrid retrieval |
| Evaluation & Observability | LangSmith, pytest, threshold-based golden-set evals, integration testing |
| Languages & Data | Python, C#/.NET, TypeScript/React, SQL (PostgreSQL, Oracle), DynamoDB, Pinecone, ChromaDB |
"Precision over prediction. Architecture over syntax."

