
Vandana Jain

Staff Software Engineer — Distributed Systems · Backend Platforms · Agentic AI

15+ years architecting mission-critical infrastructure for regulated financial markets — now building production-grade agentic AI systems with the same engineering discipline.

About

Staff Software Engineer with 15+ years owning distributed systems and backend platforms for regulated financial markets. At the Municipal Securities Rulemaking Board, I architected and modernized the infrastructure behind regulatory reporting for a $4T municipal securities market — systems supporting $55M+ in annual revenue and the integrity of 70K+ daily trade reports.

Today I focus on the next layer of the stack: production-grade agentic AI. I bring the same discipline — distributed systems rigor, evaluation-driven reliability, observability — to LLM platforms and multi-agent systems, the part of the stack where most prototypes break.

Distributed Systems & Platform Engineering

Mission-critical work for U.S. municipal securities infrastructure (not public — summarized here for context):

Identity architecture across 15+ legacy .NET and modernized systems — designed a hybrid identity bridge that preserved backward compatibility, eliminated $250K+ in projected refactoring, and held sub-second authentication latency for market-data services.
Cloud performance — established Lambda initialization and execution standards that improved serverless function performance 50–70% across distributed workflows.
Data platform modernization — re-architected the MDM entity resolution platform's processing and queue strategy for a 40% throughput increase, unblocking downstream ML workflows.
Legacy modernization — led the RTRS transaction-reporting system from a Java monolith to a scalable .NET architecture, with multi-layer validation safeguarding 70K+ daily trades / $1.1T+ in annual par value.
Quality architecture — replaced mock-based testing with a PostgreSQL + Alembic integration-testing framework for production data pipelines.

Agentic AI Portfolio

Open-source systems from my independent AI engineering practice — production patterns, not demos.

Repo Navigator AI | Architecture Walkthrough

Autonomous multi-agent system on Google ADK for repository analysis — built to compress developer onboarding and architectural discovery. In client work, this class of system accelerated architectural discovery by ~70%, cutting onboarding from days to hours.

The engineering problem: LLMs hallucinate on multi-file repository analysis because naive context stuffing overwhelms the window and loses execution flow. Solved with a hierarchical agent design — a root orchestrator delegates to specialized sub-agents (Architecture Agent, File Summarizer Agent), each with bounded responsibility and tool access.
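The delegation shape described above can be sketched in plain Python. This is not the Google ADK API: the `SubAgent` and `RootOrchestrator` classes, the keyword routing, and the stub handlers are hypothetical stand-ins, and a real system would let an LLM choose the sub-agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SubAgent:
    """A specialist with a bounded responsibility and an explicit tool allow-list."""
    name: str
    allowed_tools: List[str]
    handler: Callable[[str], str]

    def run(self, task: str) -> str:
        return self.handler(task)


@dataclass
class RootOrchestrator:
    """Routes each task to one sub-agent instead of stuffing every file into one prompt."""
    routes: Dict[str, SubAgent] = field(default_factory=dict)

    def register(self, keyword: str, agent: SubAgent) -> None:
        self.routes[keyword] = agent

    def delegate(self, task: str) -> str:
        # Hypothetical keyword routing; an LLM-driven router would replace this.
        for keyword, agent in self.routes.items():
            if keyword in task.lower():
                return f"[{agent.name}] {agent.run(task)}"
        raise ValueError(f"no sub-agent registered for task: {task!r}")


# Stub handlers stand in for real model calls.
architecture_agent = SubAgent(
    name="architecture_agent",
    allowed_tools=["list_directory"],
    handler=lambda task: "summarized module layout",
)
file_summarizer_agent = SubAgent(
    name="file_summarizer_agent",
    allowed_tools=["read_file"],
    handler=lambda task: "summarized one file",
)

root = RootOrchestrator()
root.register("architecture", architecture_agent)
root.register("summarize", file_summarizer_agent)

result = root.delegate("Summarize src/app.py")
```

Each sub-agent sees only its own task and its own tools, which is what keeps any single prompt small.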

Staff-level decisions:

FakeGithub client injection for deterministic, CI-stable integration tests — live API calls don't belong in CI
Threshold-based evaluations via test_config.json: agent output scored against configurable minimums, not eyeballed
Automated deployment pipeline to Vertex AI via GitHub Actions
Context compaction strategy to prevent agent drift across multi-turn analysis
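The first two decisions above can be illustrated together. This is a minimal sketch, not the repository's code: the `GithubClient` protocol, the `summarize_file` placeholder, the keyword metric, and the `min_keyword_score` key are all assumptions made for the example.

```python
import json
from typing import Dict, List, Protocol


class GithubClient(Protocol):
    def get_file(self, path: str) -> str: ...


class FakeGithub:
    """In-memory stand-in for a live GitHub client; returns canned file contents."""

    def __init__(self, files: Dict[str, str]) -> None:
        self._files = files

    def get_file(self, path: str) -> str:
        return self._files[path]


def summarize_file(client: GithubClient, path: str) -> str:
    # Placeholder for the agent call; here it only reports the file's first line.
    first_line = client.get_file(path).splitlines()[0]
    return f"{path}: {first_line}"


def keyword_score(output: str, expected_keywords: List[str]) -> float:
    """Fraction of expected keywords found in the output (a toy metric)."""
    hits = sum(1 for kw in expected_keywords if kw.lower() in output.lower())
    return hits / len(expected_keywords)


# Stand-in for the contents of test_config.json; the key name is an assumption.
config = json.loads('{"min_keyword_score": 0.5}')

fake = FakeGithub({"README.md": "# Repo Navigator\nAnalyzes repositories."})
output = summarize_file(fake, "README.md")
score = keyword_score(output, ["readme", "navigator", "vertex"])
passed = score >= config["min_keyword_score"]
```

Because the client is injected, the same `summarize_file` runs against the fake in CI and against a real client elsewhere, and the pass/fail decision comes from configuration rather than from a reviewer's judgment.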

Stack: Python · Google ADK · Vertex AI · pytest · GitHub Actions

Enterprise Architecture Copilot

RAG system for querying internal ADRs, runbooks, and structured metadata — institutional knowledge that usually lives in Confluence and dies there.

The engineering problem: Enterprise docs are too large for direct injection and too heterogeneous for naive chunking. Solved with semantic chunking (LangChain SemanticChunker) and a hybrid retrieval path combining semantic vector search with exact-match SQL filtering for structured metadata.
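One way to combine the two retrieval paths is to filter on structured metadata first and rank the survivors by similarity. The sketch below assumes that ordering, uses the standard-library `sqlite3` module for the exact-match side, and replaces the vector store with a hand-written dictionary of made-up embeddings. The table layout, document IDs, and `hybrid_search` function are hypothetical.

```python
import math
import sqlite3
from typing import Dict, List, Tuple

# Toy embeddings standing in for a vector store; values are made up for illustration.
EMBEDDINGS: Dict[str, List[float]] = {
    "adr-001": [0.9, 0.1, 0.0],
    "adr-002": [0.8, 0.2, 0.1],
    "runbook-007": [0.1, 0.9, 0.2],
}


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def hybrid_search(
    conn: sqlite3.Connection, query_vec: List[float], status: str, top_k: int = 2
) -> List[Tuple[str, float]]:
    # Exact-match filter on structured metadata first...
    rows = conn.execute(
        "SELECT doc_id FROM docs WHERE status = ?", (status,)
    ).fetchall()
    candidates = [row[0] for row in rows]
    # ...then rank only the surviving candidates by semantic similarity.
    scored = [(doc_id, cosine(query_vec, EMBEDDINGS[doc_id])) for doc_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [("adr-001", "superseded"), ("adr-002", "accepted"), ("runbook-007", "accepted")],
)

results = hybrid_search(conn, [1.0, 0.0, 0.0], status="accepted")
```

Here `adr-001` is the closest vector to the query but is excluded because its status is `superseded`, which is the kind of constraint pure similarity search cannot express.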

Staff-level decisions:

MD5 document hashing for idempotent vector insertion — reruns don't accumulate duplicates that degrade retrieval quality
Hybrid retrieval: structured metadata joins from SQLite alongside semantic similarity from ChromaDB, routed by a LangGraph agent
LangSmith tracing and golden-set evaluations with minimum score thresholds enforced in CI
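The idempotent-insertion idea above reduces to deriving each record's ID from its content. A minimal sketch, assuming a store that upserts by ID; `InMemoryVectorStore` is a hypothetical stand-in, not the ChromaDB client, and embeddings are omitted.

```python
import hashlib
from typing import Dict, List


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector store that upserts by ID."""

    def __init__(self) -> None:
        self.records: Dict[str, str] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Writing the same ID twice overwrites instead of appending.
        self.records[doc_id] = text


def content_id(text: str) -> str:
    # MD5 is used as a content fingerprint here, not for security.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def ingest(store: InMemoryVectorStore, chunks: List[str]) -> int:
    for chunk in chunks:
        store.upsert(content_id(chunk), chunk)
    return len(store.records)


store = InMemoryVectorStore()
chunks = ["ADR-001: use SQS for fan-out.", "Runbook: rotate keys quarterly."]
first_count = ingest(store, chunks)
second_count = ingest(store, chunks)  # rerun of the same pipeline
```

The second run leaves the record count unchanged because identical text always hashes to the same ID.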

Stack: Python · LangChain · LangGraph · OpenAI · ChromaDB · SQLite · LangSmith

Sage AI (private project — sample & screenshots)

Full-stack production RAG pipeline with source-grounded generation — React frontend, FastAPI backend, vector-store retrieval.

Engineering highlight: Sentence Window Parsing for ingestion. Each sentence is embedded on its own, while its surrounding sentences are stored as metadata and returned at retrieval time, balancing retrieval precision against context quality.
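The sentence-window idea can be shown without any framework. This sketch is plain Python rather than LlamaIndex's own node parser; the `SentenceNode` class, the regex splitter, and the `window_size` parameter are assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SentenceNode:
    text: str    # the single sentence that would be embedded
    window: str  # the sentence plus its neighbours, passed to the LLM after retrieval


def build_sentence_windows(document: str, window_size: int = 1) -> List[SentenceNode]:
    # Naive split after ., ! or ? followed by whitespace; a real parser would be more careful.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append(SentenceNode(text=sentence, window=" ".join(sentences[start:end])))
    return nodes


doc = "Alpha is first. Beta is second. Gamma is third. Delta is fourth."
nodes = build_sentence_windows(doc, window_size=1)
```

Matching happens against `text`, so the embedding stays tight around one sentence, while generation receives `window`, so the model still sees the surrounding context.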

Stack: React · FastAPI · LlamaIndex · Pinecone · OpenAI GPT-4o-mini

Engineering Approach

Bringing distributed-systems discipline to non-deterministic AI:

Evaluation first, not last. Threshold-based golden-set evaluations (LangSmith, pytest) defined before the first agent is wired up — not bolted on after.
Deterministic testing in non-deterministic systems. Fake client injection and tool-response mocking make LLM agent tests reproducible in CI without live API calls.
Idempotency at the data layer. MD5 document hashing for vector upserts keeps pipeline reruns from inserting duplicate chunks, which would otherwise crowd retrieval results and degrade answers.
Context window as a constraint. Explicit memory compaction strategies prevent agent drift across long-horizon tasks.
Architecture over syntax. Structural code boundaries keep systems maintainable as LLM APIs evolve.
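As one illustration of the context-window point above, a compaction step can fold older turns into a summary once the history exceeds a budget. This is a sketch under stated assumptions: the budget is measured in characters rather than tokens, and `naive_summarize` is a hypothetical stand-in for an LLM summarization call.

```python
from typing import Callable, List


def compact_history(
    messages: List[str],
    max_chars: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Fold older turns into one summary message once the history exceeds max_chars."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    total = sum(len(m) for m in messages)
    if total <= max_chars or len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [f"SUMMARY: {summarize(older)}"] + recent


def naive_summarize(messages: List[str]) -> str:
    # Stand-in for an LLM call: keep the first 20 characters of each turn.
    return " | ".join(m[:20] for m in messages)


history = [
    "user: analyze the repository layout",
    "agent: found src/, tests/, and docs/ directories",
    "user: now summarize src/app.py",
    "agent: app.py wires the FastAPI routes",
]
compacted = compact_history(history, max_chars=80, keep_recent=2, summarize=naive_summarize)
```

The most recent turns are kept verbatim so the agent retains the exact wording of the current task, and only older turns are compressed.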

Technical Thought Leadership

Python Decorators as Architectural Boundaries: Decorators as structural middleware for cross-cutting concerns — demonstrated with a self-healing @retry pattern designed for the failure modes of LLM and network APIs.
Developing curriculum on production multi-agent system design: evaluation-driven development, context management, and testing patterns for non-deterministic systems.
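A decorator of the kind described in the first item might look like the following. This is a generic retry with exponential backoff, not the article's implementation: the parameter names, the default exception types, and the simulated `flaky_llm_call` are assumptions, and `max_attempts` is assumed to be at least 1.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry(
    max_attempts: int = 3,
    base_delay: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> Callable:
    """Retry the wrapped call on transient errors, doubling the delay each attempt."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the original error
                    time.sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = {"count": 0}


@retry(max_attempts=3, base_delay=0.0)
def flaky_llm_call(prompt: str) -> str:
    # Simulated upstream that times out twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated upstream timeout")
    return f"response to {prompt!r}"


answer = flaky_llm_call("ping")
```

The retry policy lives in one place, so the business function stays free of error-handling loops and the policy can change without touching call sites.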

Core Technical Scope

Domain	Technologies
Distributed Systems & Backend	Event-driven architecture, API design, data pipelines, platform modernization
Cloud Infrastructure	AWS (Lambda, API Gateway, SNS/SQS, EventBridge, Bedrock, CDK/SAM) · GCP (Vertex AI, Cloud Run)
Agentic AI	Google ADK, LangGraph, LangChain, LlamaIndex, multi-agent orchestration
LLM Platforms & RAG	OpenAI, Vertex AI, AWS Bedrock, semantic + hybrid retrieval
Evaluation & Observability	LangSmith, pytest, threshold-based golden-set evals, integration testing
Languages & Data	Python, C#/.NET, TypeScript/React, SQL (PostgreSQL, Oracle), DynamoDB, Pinecone, ChromaDB

"Precision over prediction. Architecture over syntax."
