Scout is a structured, AI-powered framework for running end-to-end experience evaluations. Point it at any product experience — a web portal, VS Code extension, CLI tool, AI agent, or API — and it delivers an actionable report with prioritized recommendations backed by real evidence.
Scout orchestrates a team of specialized AI agents. Three of them each collect one signal about your experience, and a fourth turns those signals into actionable next steps:
| Signal | Agent | What It Does |
|---|---|---|
| 🔍 Walkthrough | Experience Walker | Walks through your experience step-by-step, scoring against user-defined quality primitives |
| 💬 Research | Researcher | Gathers user sentiment from GitHub issues, forums, and community channels, plus deep competitive intelligence |
| 📊 Data | Data Analyst | Analyzes telemetry and funnel drop-offs (requires Kusto/ADX access — see note below) |
| 📋 Actionable Next Steps | Report Writer | Delivers a prioritized list of improvements based on the findings — each with the problem, a concrete fix, the impact it drives, and what to start on first |
These signals are synthesized by the Report Writer into a comprehensive evaluation report. The most valuable output is the prioritized next steps — a ranked list of recommendations (P0 → P3) where each one explains the problem, suggests a specific fix, and quantifies the impact so teams know exactly what to work on and why.
📊 Data signal availability: The Data Analyst requires access to a Kusto/Azure Data Explorer cluster with telemetry data. Without it, Scout runs as a 2-signal framework (walkthrough + research) and delivers a Data Strategy Memo suggesting what to measure. The 2-signal report is still a complete, valuable evaluation — no fake data or empty sections.
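The fallback described above can be sketched as a small selection function. This is only an illustration of the rule, not Scout's implementation: the function name `plan_signals` and the dictionary keys are hypothetical.

```python
def plan_signals(has_kusto_access: bool) -> dict:
    """Hypothetical helper: pick the signals to run and the data deliverable."""
    # Walkthrough and research never depend on telemetry access.
    signals = ["walkthrough", "research"]
    if has_kusto_access:
        signals.append("data")
        data_section = "Funnel Analysis"
    else:
        # No telemetry: the report carries a memo on what to measure,
        # rather than fabricated numbers or an empty section.
        data_section = "Data Strategy Memo"
    return {"signals": signals, "data_section": data_section}
```

Calling `plan_signals(False)` yields a 2-signal plan whose data section is the Data Strategy Memo.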
- Primitives Scorecard — every quality dimension rated 1–5 with evidence
- Funnel Analysis — where users drop off and why (requires Kusto/ADX access; gracefully degrades to a Data Strategy Memo)
- Competitive Intelligence — how competitors approach the same problem, innovation signals, market direction
- Cross-Signal Correlations — connections between walkthrough findings, user sentiment, and data (depth depends on available telemetry access; report adapts when unavailable)
- Prioritized Recommendations — P0 through P3, each with problem + fix + impact
- GitHub Issues — optionally file every recommendation as a tracked issue
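To make the shape of a prioritized recommendation concrete, here is a minimal sketch of one possible representation. The `Recommendation` class and `rank` function are assumptions made for illustration; Scout's report is markdown, and the README does not define a data model.

```python
from dataclasses import dataclass

# Lower number = more urgent, matching the P0 -> P3 ordering in the report.
PRIORITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


@dataclass
class Recommendation:
    """Hypothetical record: one report recommendation."""
    priority: str  # "P0" through "P3"
    problem: str   # what is wrong, backed by evidence
    fix: str       # the concrete change suggested
    impact: str    # what the fix is expected to improve


def rank(recommendations):
    """Return recommendations ordered from P0 to P3 (stable within a priority)."""
    return sorted(recommendations, key=lambda rec: PRIORITY_ORDER[rec.priority])
```

Because `sorted` is stable, recommendations that share a priority keep the order in which they were written.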
| Type | Examples | Primary Tools |
|---|---|---|
| Portal / Web UI | Azure Portal, dashboards, admin consoles | Playwright MCP |
| VS Code Extension | Extensions in VS Code desktop or web | Playwright MCP (web), user-assisted walkthrough (desktop) |
| CLI | az, gh, npm, developer CLIs | Terminal |
| Skill / Agent | AI agents, Copilot skills | Waza, Terminal |
| API / SDK | REST APIs, client SDKs | Terminal, HTTP tools |
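The table above is essentially a lookup from experience type to primary tools. A minimal sketch of that lookup follows; the short keys such as `"portal"` and `"api-sdk"` are invented for this example and are not identifiers Scout is known to use.

```python
# Tool names are taken from the table above; the dictionary keys are hypothetical.
PRIMARY_TOOLS = {
    "portal": ["Playwright MCP"],
    "vscode-extension": ["Playwright MCP (web)", "user-assisted walkthrough (desktop)"],
    "cli": ["Terminal"],
    "skill-agent": ["Waza", "Terminal"],
    "api-sdk": ["Terminal", "HTTP tools"],
}


def tools_for(experience_type: str) -> list:
    """Return the primary tools for an experience type, or raise ValueError."""
    try:
        return PRIMARY_TOOLS[experience_type]
    except KeyError:
        known = ", ".join(sorted(PRIMARY_TOOLS))
        raise ValueError(f"unknown experience type {experience_type!r}; expected one of: {known}")
```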
Scout ships with tools pre-configured and recommends additional ones based on your evaluation:
| Tool | What It Does | How It's Configured |
|---|---|---|
| Playwright MCP | Browser automation — walks web UIs, takes screenshots, interacts with elements | npm dependency + MCP server config |
| Terminal | Command execution for CLI evaluations and data collection | Built into VS Code |
| Tool | What It Does | How to Get It |
|---|---|---|
| Kusto Workbench | KQL queries against Azure Data Explorer for telemetry analysis | VS Code Extension (auto-recommended) |
| GitHub CLI | Issue filing (Phase 5), repo search, user feedback collection | System install: winget install GitHub.cli / brew install gh |
| Waza | Skill/Agent invocation and evaluation | Go binary (manual) |
| Tool | What It Does |
|---|---|
| fetch_webpage | Fetches web pages for research — forums, docs, competitor sites |
| Semantic Search | Searches the codebase for relevant patterns |
- Node.js >= 18.0.0 — required for Playwright MCP
- VS Code with GitHub Copilot — the agents run in Copilot Agent Mode
- GitHub CLI — optional, needed for issue filing (Phase 5)
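If you want to verify the command-line prerequisites before opening VS Code, a small script along these lines may help. It is a sketch, not part of Scout: the function names are hypothetical, and it only checks that `node` and `gh` are on your PATH and that `node --version` reports 18.0.0 or later. It cannot check your Copilot subscription or Agent Mode.

```python
import re
import shutil
import subprocess

MIN_NODE = (18, 0, 0)  # from the prerequisite "Node.js >= 18.0.0"


def parse_node_version(text: str) -> tuple:
    """Parse `node --version` output such as 'v18.17.0' into (18, 17, 0)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_supported(version_text: str) -> bool:
    """True when the reported version meets the minimum."""
    return parse_node_version(version_text) >= MIN_NODE


def check_prerequisites() -> dict:
    """Report which command-line prerequisites look usable on this machine."""
    status = {"node": False, "gh": shutil.which("gh") is not None}
    node_path = shutil.which("node")
    if node_path:
        result = subprocess.run(
            [node_path, "--version"], capture_output=True, text=True, check=False
        )
        try:
            status["node"] = node_is_supported(result.stdout)
        except ValueError:
            status["node"] = False
    return status
```

Run `print(check_prerequisites())` to see the result. A `False` for `gh` only matters if you plan to file issues in Phase 5.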
⚠️ Copilot Agent Mode is required. Scout's agents run entirely in GitHub Copilot's Agent Mode. Make sure you have an active Copilot subscription and that Agent Mode is enabled in your VS Code settings.
Workspace Trust: When you first open the Scout repo, VS Code may ask you to trust the workspace. Click "Yes, I trust the authors" — this is required for the auto-install task and MCP server to function.
🔐 Authentication for portals: For experiences that require login (Azure Portal, admin consoles, etc.), Scout will open the browser and ask you to log in once manually. After that, the agent takes over and navigates autonomously. Enterprise SSO/MFA is handled by you — the agent never touches credentials.
git clone https://github.com/haileyhuber8/scout.git
cd scout
npm install

This installs Playwright MCP and other dependencies. VS Code will also prompt you to install recommended extensions.
Open the repo in VS Code, then open GitHub Copilot Chat and switch to Agent Mode using the mode dropdown at the top of the chat panel. The evaluation agent will automatically activate.
Just start talking:
"I want to evaluate the onboarding experience for my VS Code extension"
The Architect agent will guide you through:
- Defining what to test — experience type, scope, audience, evaluation question
- Discovering primitives — the quality dimensions to score against
- Reviewing the plan — full evaluation plan before launch
- Tool setup — verifying all needed tools are ready
- Running the evaluation — walkthrough + research + data analysis
- Delivering the report — synthesized findings with prioritized recommendations
- Filing issues — optionally create GitHub issues for each recommendation
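For the optional issue-filing step, the sketch below shows how one recommendation could be turned into a `gh issue create` command. The title and body layout are assumptions made for illustration; the README does not specify how Scout formats the issues it files. Only the `--title` and `--body` flags of the GitHub CLI are used.

```python
def build_issue_command(priority: str, problem: str, fix: str, impact: str) -> list:
    """Build (but do not run) a GitHub CLI command for one recommendation."""
    # Hypothetical layout: priority tag in the title, three sections in the body.
    title = f"[{priority}] {problem}"
    body = (
        f"## Problem\n{problem}\n\n"
        f"## Suggested fix\n{fix}\n\n"
        f"## Impact\n{impact}\n"
    )
    return ["gh", "issue", "create", "--title", title, "--body", body]
```

To actually file the issue, pass the returned list to `subprocess.run(cmd, check=True)` from inside the target repository after authenticating with `gh auth login`.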
When the Architect asks how you'd like to set up:
- 🚀 Quick Start (~2 min) — smart defaults, review everything in one shot
- 🔧 Full Customization (~10-15 min) — configure each setting individually
Most users start with Quick Start and customize from there.
Check out examples/azure-portal/ for a completed evaluation — including the primitives spec and final report — to see what Scout produces.
      You → Architect (what to test) → Tool Installer (setup check)
                                    ↓
          ┌─────────────────────────┼─────────────────────────┐
          ↓                         ↓                         ↓
  Experience Walker            Researcher               Data Analyst
(walks the experience)      (user sentiment +           (telemetry +
                           competitive intel)         funnel analysis)
          └─────────────────────────┼─────────────────────────┘
                                    ↓
                       Architect validates signals
                                    ↓
                        Report Writer synthesizes
                                    ↓
                       Prioritized report + issues
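The fan-out and fan-in in this flow can be sketched as ordinary control flow. Scout's agents are Copilot Agent Mode instructions, not Python functions, so everything here is a stand-in: the collector functions, the `plan` dictionary, and its `telemetry_available` key are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def walk_experience(plan):
    # Stand-in for the Experience Walker.
    return {"signal": "walkthrough", "findings": []}


def research(plan):
    # Stand-in for the Researcher.
    return {"signal": "research", "findings": []}


def analyze_data(plan):
    # Stand-in for the Data Analyst; yields nothing without telemetry.
    if not plan.get("telemetry_available"):
        return None
    return {"signal": "data", "findings": []}


def run_evaluation(plan):
    """Fan out to the three collectors, validate, then synthesize."""
    collectors = [walk_experience, research, analyze_data]
    with ThreadPoolExecutor(max_workers=len(collectors)) as pool:
        # pool.map returns results in the same order as `collectors`.
        results = list(pool.map(lambda collect: collect(plan), collectors))
    # Validation step: keep only the signals that actually produced output.
    signals = [result for result in results if result is not None]
    # Synthesis step: a real report would be written here.
    return {"signals_used": [s["signal"] for s in signals], "recommendations": []}
```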
Each evaluation creates a project folder with all artifacts:
my-experience/
├── PRIMITIVES.md # What you're testing and why
├── README.md # Evaluation plan
├── project.json # Configuration
├── experience/
│ ├── walkthroughs/ # Step-by-step walkthrough notes
│ └── screenshots/ # Visual evidence
├── research/
│ ├── sentiment.md # User feedback analysis
│ ├── competitor-analysis/
│ └── user-feedback/
├── data/
│ ├── telemetry/ # Raw data queries and results
│ ├── baselines/ # Comparison baselines
│ └── funnel-analysis/ # Funnel drop-off analysis
├── analysis/ # Cross-cutting analysis
├── output/
│ └── report.md # THE REPORT — the primary deliverable
└── assets/ # Supporting files
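The layout above can be reproduced with a few lines of `pathlib`. This is a sketch for readers who want to inspect or pre-create the structure. Scout ships its own `_template/` for new evaluations, and this script is not how Scout creates projects; the `scaffold` function is hypothetical.

```python
from pathlib import Path

# Directories and files copied from the tree above.
PROJECT_DIRS = [
    "experience/walkthroughs",
    "experience/screenshots",
    "research/competitor-analysis",
    "research/user-feedback",
    "data/telemetry",
    "data/baselines",
    "data/funnel-analysis",
    "analysis",
    "output",
    "assets",
]
PROJECT_FILES = [
    "PRIMITIVES.md",
    "README.md",
    "project.json",
    "research/sentiment.md",
    "output/report.md",
]


def scaffold(root) -> Path:
    """Create an empty evaluation folder; existing files are left untouched."""
    root = Path(root)
    for directory in PROJECT_DIRS:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for filename in PROJECT_FILES:
        (root / filename).touch(exist_ok=True)
    return root
```

For example, `scaffold("my-experience")` creates the tree in the current directory.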
Reports adapt to your audience and available data:
| Audience | Depth | What's Included |
|---|---|---|
| Leadership | Executive (~150 lines) | Summary, scorecard, P0 recs, competitive headline |
| PM / Designer | Standard (~500 lines) | All sections, moderate detail |
| Engineering | Deep Dive (~900 lines) | Full analysis, all appendices, data queries |
Reports gracefully degrade when signals aren't available — a 2-signal (walkthrough + research) report is still a full, valuable report. No fake data, no empty sections.
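The approximate lengths in the audience table can serve as a rough sanity check on a generated report. The sketch below compares a report's line count with the target for its audience. The audience keys and the 25% tolerance are assumptions chosen for this example, not values Scout defines.

```python
# Depth names and line targets come from the table above; keys are hypothetical.
REPORT_DEPTHS = {
    "leadership": ("Executive", 150),
    "pm_designer": ("Standard", 500),
    "engineering": ("Deep Dive", 900),
}


def check_length(report_text: str, audience: str, tolerance: float = 0.25) -> dict:
    """Compare a report's line count against its audience's approximate target."""
    depth, target = REPORT_DEPTHS[audience]
    actual = len(report_text.splitlines())
    return {
        "depth": depth,
        "target_lines": target,
        "actual_lines": actual,
        # The targets are approximate, so allow a band around them.
        "within_tolerance": abs(actual - target) <= target * tolerance,
    }
```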
scout/
├── .github/
│ ├── agents/
│ │ └── evaluation.agent.md # The evaluation framework spec (agent instructions)
│ └── copilot-instructions.md # Fallback guidance for non-Agent-Mode users
├── .vscode/
│ ├── mcp.json # Pre-configured MCP servers
│ ├── tasks.json # Auto-install on folder open
│ └── extensions.json # Recommended VS Code extensions
├── _template/ # Template for new evaluations
│ ├── project.json
│ ├── PRIMITIVES.md
│ ├── README.md
│ └── (directory structure)
├── examples/ # Reference evaluations
│ └── azure-portal/ # Completed example (primitives + report)
├── _meta/
│ ├── impact.json # Framework usage tracking
│ └── config.json # Framework configuration
├── package.json # Dependencies (Playwright MCP)
└── README.md # This file
Scout records evaluation metadata in _meta/impact.json after each run — evaluations completed, findings count, and issues filed. This is lightweight tracking to help you measure Scout's value over time.
Note: Impact tracking is recorded automatically, but the dashboard view (_meta/IMPACT.md) is not yet generated automatically. Check _meta/impact.json directly for raw metrics.
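Until the dashboard is generated for you, a short script can pull the headline numbers out of the raw file. The schema of `_meta/impact.json` is not documented here, so the key names below (`evaluations_completed`, `findings_count`, `issues_filed`) are guesses based on the description above, and the file is assumed to hold a single JSON object. Open the file and adjust the keys to match what you find.

```python
import json
from pathlib import Path

# ASSUMPTION: these key names are illustrative, not Scout's documented schema.
METRIC_KEYS = ("evaluations_completed", "findings_count", "issues_filed")


def summarize_impact(path="_meta/impact.json") -> dict:
    """Read the impact file and return the headline metrics (0 when a key is absent)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: data.get(key, 0) for key in METRIC_KEYS}
```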
This is a private repository. To suggest changes, open an issue or submit a pull request.
Private — not for redistribution.