Scout — E2E Experience Evaluation Framework

Scout is a structured, AI-powered framework for running end-to-end experience evaluations. Point it at any product experience — a web portal, VS Code extension, CLI tool, AI agent, or API — and it delivers an actionable report with prioritized recommendations backed by real evidence.

What Scout Does

Scout orchestrates a team of specialized AI agents. Three of them evaluate your experience from different angles, and a fourth turns their findings into next steps:

Signal	Agent	What It Does
🔍 Walkthrough	Experience Walker	Walks through your experience step-by-step, scoring against user-defined quality primitives
💬 Research	Researcher	Gathers user sentiment from GitHub issues, forums, and community channels, plus deep competitive intelligence
📊 Data	Data Analyst	Analyzes telemetry and funnel drop-offs (requires Kusto/ADX access — see note below)
📋 Actionable Next Steps	Report Writer	Delivers a prioritized list of improvements based on the findings — each with the problem, a concrete fix, the impact it drives, and what to start on first

These signals are synthesized by the Report Writer into a comprehensive evaluation report. The most valuable output is the prioritized next steps — a ranked list of recommendations (P0 → P3) where each one explains the problem, suggests a specific fix, and quantifies the impact so teams know exactly what to work on and why.

📊 Data signal availability: The Data Analyst requires access to a Kusto/Azure Data Explorer cluster with telemetry data. Without it, Scout runs as a 2-signal framework (walkthrough + research) and delivers a Data Strategy Memo suggesting what to measure. The 2-signal report is still a complete, valuable evaluation — no fake data or empty sections.
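The fallback described above can be pictured as a small decision function. This is only a sketch of the behavior, not Scout's actual code: `plan_signals` and the shape of its return value are hypothetical names chosen for illustration.

```python
def plan_signals(has_kusto_access: bool) -> dict:
    """Return which signals run and which data deliverable to expect.

    Hypothetical helper: mirrors the README's description of the
    3-signal and 2-signal modes, nothing more.
    """
    # Walkthrough and research always run.
    signals = ["walkthrough", "research"]
    if has_kusto_access:
        # Telemetry is reachable, so the Data Analyst can run.
        signals.append("data")
        data_deliverable = "Funnel Analysis"
    else:
        # No telemetry: the report carries a memo on what to measure.
        data_deliverable = "Data Strategy Memo"
    return {"signals": signals, "data_deliverable": data_deliverable}
```

Calling `plan_signals(False)` yields the two-signal plan with the Data Strategy Memo as the data deliverable.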

What You Get

Primitives Scorecard — every quality dimension rated 1–5 with evidence
Funnel Analysis — where users drop off and why (requires Kusto/ADX access; gracefully degrades to a Data Strategy Memo)
Competitive Intelligence — how competitors approach the same problem, innovation signals, market direction
Cross-Signal Correlations — connections between walkthrough findings, user sentiment, and data (depth depends on available telemetry access; report adapts when unavailable)
Prioritized Recommendations — P0 through P3, each with problem + fix + impact
GitHub Issues — optionally file every recommendation as a tracked issue
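To make the shape of a prioritized recommendation concrete, here is a minimal sketch of one way to model and rank them. The `Recommendation` class, its field names, and `rank` are assumptions for illustration; the README does not specify how Scout stores recommendations internally.

```python
from dataclasses import dataclass

# Priority labels as listed in the README, most urgent first.
PRIORITIES = ("P0", "P1", "P2", "P3")


@dataclass
class Recommendation:
    """Hypothetical record: one recommendation with problem, fix, and impact."""
    priority: str
    problem: str
    fix: str
    impact: str

    def __post_init__(self) -> None:
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority!r}")


def rank(recs: list) -> list:
    """Sort recommendations from P0 to P3, keeping input order within a level."""
    return sorted(recs, key=lambda r: PRIORITIES.index(r.priority))
```

Because Python's `sorted` is stable, recommendations that share a priority keep the order in which they were found.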

Supported Experience Types

Type	Examples	Primary Tools
Portal / Web UI	Azure Portal, dashboards, admin consoles	Playwright MCP
VS Code Extension	Extensions in VS Code desktop or web	Playwright MCP (web), user-assisted walkthrough (desktop)
CLI	`az`, `gh`, `npm`, developer CLIs	Terminal
Skill / Agent	AI agents, Copilot skills	Waza, Terminal
API / SDK	REST APIs, client SDKs	Terminal, HTTP tools
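The table above is effectively a lookup from experience type to primary tools. A sketch of that lookup follows; the dictionary keys (`"portal"`, `"cli"`, and so on) and the `tools_for` function are hypothetical identifiers, and only the tool names come from the table.

```python
# Tool names are taken from the table; the keys are illustrative only.
PRIMARY_TOOLS = {
    "portal": ["Playwright MCP"],
    "vscode-extension": [
        "Playwright MCP (web)",
        "user-assisted walkthrough (desktop)",
    ],
    "cli": ["Terminal"],
    "skill-agent": ["Waza", "Terminal"],
    "api-sdk": ["Terminal", "HTTP tools"],
}


def tools_for(experience_type: str) -> list:
    """Return the primary tools for an experience type, or raise ValueError."""
    try:
        return PRIMARY_TOOLS[experience_type]
    except KeyError:
        known = ", ".join(sorted(PRIMARY_TOOLS))
        raise ValueError(
            f"unknown experience type {experience_type!r}; expected one of: {known}"
        ) from None
```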

Tools Included

Scout ships with tools pre-configured and recommends additional ones based on your evaluation:

Pre-Installed (ready out of the box)

Tool	What It Does	How It's Configured
Playwright MCP	Browser automation — walks web UIs, takes screenshots, interacts with elements	npm dependency + MCP server config
Terminal	Command execution for CLI evaluations and data collection	Built into VS Code

Recommended (prompted during setup)

Tool	What It Does	How to Get It
Kusto Workbench	KQL queries against Azure Data Explorer for telemetry analysis	VS Code Extension (auto-recommended)
GitHub CLI	Issue filing (Phase 5), repo search, user feedback collection	System install: `winget install GitHub.cli` / `brew install gh`
Waza	Skill/Agent invocation and evaluation	Go binary (manual)

Built-In to GitHub Copilot

Tool	What It Does
fetch_webpage	Fetches web pages for research — forums, docs, competitor sites
Semantic Search	Searches the codebase for relevant patterns

Prerequisites

Node.js >= 18.0.0 — required for Playwright MCP
VS Code with GitHub Copilot — the agents run in Copilot Agent Mode
GitHub CLI — optional, needed for issue filing (Phase 5)
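If you want to confirm the Node.js requirement before installing, a check along these lines may help. This is an assumed helper script, not part of Scout: the function names are hypothetical, and the only external behavior relied on is that `node --version` prints a string such as `v18.17.1`.

```python
import shutil
import subprocess

# Minimum version stated in the prerequisites.
MIN_NODE = (18, 0, 0)


def parse_node_version(raw: str) -> tuple:
    """Turn 'v18.17.1' (or 'v22.0.0-nightly...') into (18, 17, 1)."""
    core = raw.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def node_is_supported(raw: str) -> bool:
    """True if the reported version is at least MIN_NODE."""
    return parse_node_version(raw) >= MIN_NODE


def check_installed_node() -> bool:
    """Run `node --version` if node is on PATH and compare to MIN_NODE."""
    node = shutil.which("node")
    if node is None:
        return False  # Node.js is not installed or not on PATH.
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=True
    )
    return node_is_supported(result.stdout)
```

`check_installed_node()` returns `False` both when Node.js is missing and when it is older than 18.0.0, so either case signals that the prerequisite is not met.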

⚠️ Copilot Agent Mode is required. Scout's agents run entirely in GitHub Copilot's Agent Mode. Make sure you have an active Copilot subscription and that Agent Mode is enabled in your VS Code settings.

Workspace Trust: When you first open the Scout repo, VS Code may ask you to trust the workspace. Click "Yes, I trust the authors" — this is required for the auto-install task and MCP server to function.

🔐 Authentication for portals: For experiences that require login (Azure Portal, admin consoles, etc.), Scout will open the browser and ask you to log in once manually. After that, the agent takes over and navigates autonomously. Enterprise SSO/MFA is handled by you — the agent never touches credentials.

Getting Started

1. Clone the repo

git clone https://github.com/haileyhuber8/scout.git
cd scout

2. Install dependencies

npm install

This installs Playwright MCP and other dependencies. VS Code will also prompt you to install the recommended extensions.

3. Start an evaluation

Open the repo in VS Code, then open GitHub Copilot Chat and switch to Agent Mode using the mode dropdown at the top of the chat panel. The evaluation agent will automatically activate.

Just start talking:

"I want to evaluate the onboarding experience for my VS Code extension"

The Architect agent will guide you through:

Defining what to test — experience type, scope, audience, evaluation question
Discovering primitives — the quality dimensions to score against
Reviewing the plan — full evaluation plan before launch
Tool setup — verifying all needed tools are ready
Running the evaluation — walkthrough + research + data analysis
Delivering the report — synthesized findings with prioritized recommendations
Filing issues — optionally create GitHub issues for each recommendation
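For the final step, filing a recommendation as an issue with the GitHub CLI could look roughly like the sketch below. How Scout actually formats and files issues is not documented here, so the title and body layout and the `build_issue_command` name are assumptions; the `gh issue create` flags used (`--title`, `--body`, `--repo`) are standard GitHub CLI options.

```python
from typing import Optional


def build_issue_command(
    priority: str,
    problem: str,
    fix: str,
    impact: str,
    repo: Optional[str] = None,
) -> list:
    """Build (but do not run) a `gh issue create` command for one recommendation."""
    # Hypothetical formatting: priority prefix in the title, three-part body.
    title = f"[{priority}] {problem}"
    body = (
        f"Problem: {problem}\n\n"
        f"Suggested fix: {fix}\n\n"
        f"Expected impact: {impact}"
    )
    cmd = ["gh", "issue", "create", "--title", title, "--body", body]
    if repo:
        # OWNER/REPO; when omitted, gh uses the repository of the current directory.
        cmd += ["--repo", repo]
    return cmd

# To actually file the issue (requires an authenticated gh):
#   import subprocess
#   subprocess.run(build_issue_command(...), check=True)
```

Building the argument list separately from running it makes it easy to review what would be filed before anything is created.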

Quick Start

When the Architect asks how you'd like to set up the evaluation, pick one of two modes:

🚀 Quick Start (~2 min) — smart defaults, review everything in one shot
🔧 Full Customization (~10-15 min) — configure each setting individually

Most users start with Quick Start and customize from there.

See an Example

Check out examples/azure-portal/ for a completed evaluation — including the primitives spec and final report — to see what Scout produces.

How It Works

You → Architect (what to test) → Tool Installer (setup check)
                                        ↓
              ┌─────────────────────────┼─────────────────────────┐
              ↓                         ↓                         ↓
     Experience Walker           Researcher              Data Analyst
     (walks the experience)   (user sentiment +       (telemetry +
                               competitive intel)      funnel analysis)
              └─────────────────────────┼─────────────────────────┘
                                        ↓
                              Architect validates signals
                                        ↓
                              Report Writer synthesizes
                                        ↓
                              Prioritized report + issues

Each evaluation creates a project folder with all artifacts:

my-experience/
├── PRIMITIVES.md          # What you're testing and why
├── README.md              # Evaluation plan
├── project.json           # Configuration
├── experience/
│   ├── walkthroughs/      # Step-by-step walkthrough notes
│   └── screenshots/       # Visual evidence
├── research/
│   ├── sentiment.md       # User feedback analysis
│   ├── competitor-analysis/
│   └── user-feedback/
├── data/
│   ├── telemetry/         # Raw data queries and results
│   ├── baselines/         # Comparison baselines
│   └── funnel-analysis/   # Funnel drop-off analysis
├── analysis/              # Cross-cutting analysis
├── output/
│   └── report.md          # THE REPORT — the primary deliverable
└── assets/                # Supporting files
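Scout creates this folder for you, but the layout is simple enough to reproduce by hand, for example to inspect it or to check that nothing is missing. The following is a sketch that recreates the tree shown above as empty directories and files; `scaffold`, `DIRS`, and `FILES` are hypothetical names, and the real template content under `_template/` is not reproduced.

```python
from pathlib import Path

# Directories and files exactly as listed in the tree above.
DIRS = [
    "experience/walkthroughs",
    "experience/screenshots",
    "research/competitor-analysis",
    "research/user-feedback",
    "data/telemetry",
    "data/baselines",
    "data/funnel-analysis",
    "analysis",
    "output",
    "assets",
]
FILES = [
    "PRIMITIVES.md",
    "README.md",
    "project.json",
    "research/sentiment.md",
    "output/report.md",
]


def scaffold(root: Path) -> None:
    """Create the empty project layout under `root` (safe to re-run)."""
    for rel in DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in FILES:
        # touch() leaves existing files untouched apart from their mtime.
        (root / rel).touch(exist_ok=True)
```

For example, `scaffold(Path("my-experience"))` creates the skeleton in the current directory.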

Evaluation Reports

Reports adapt to your audience and available data:

Audience	Depth	What's Included
Leadership	Executive (~150 lines)	Summary, scorecard, P0 recs, competitive headline
PM / Designer	Standard (~500 lines)	All sections, moderate detail
Engineering	Deep Dive (~900 lines)	Full analysis, all appendices, data queries

Reports gracefully degrade when signals aren't available — a 2-signal (walkthrough + research) report is still a full, valuable report. No fake data, no empty sections.

Project Structure

scout/
├── .github/
│   ├── agents/
│   │   └── evaluation.agent.md    # The evaluation framework spec (agent instructions)
│   └── copilot-instructions.md    # Fallback guidance for non-Agent-Mode users
├── .vscode/
│   ├── mcp.json               # Pre-configured MCP servers
│   ├── tasks.json             # Auto-install on folder open
│   └── extensions.json        # Recommended VS Code extensions
├── _template/                 # Template for new evaluations
│   ├── project.json
│   ├── PRIMITIVES.md
│   ├── README.md
│   └── (directory structure)
├── examples/                  # Reference evaluations
│   └── azure-portal/          # Completed example (primitives + report)
├── _meta/
│   ├── impact.json            # Framework usage tracking
│   └── config.json            # Framework configuration
├── package.json               # Dependencies (Playwright MCP)
└── README.md                  # This file

Impact Tracking

Scout records evaluation metadata in _meta/impact.json after each run — evaluations completed, findings count, and issues filed. This is lightweight tracking to help you measure Scout's value over time.

Note: Impact metrics are recorded automatically, but the dashboard view (_meta/IMPACT.md) is not generated yet. Read _meta/impact.json directly for the raw metrics.
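Until a dashboard exists, a few lines of Python are enough to skim the raw metrics. The schema of `_meta/impact.json` is not documented here, so this sketch assumes only that the file holds a JSON object at the top level and makes no assumption about its key names; `load_impact` and `summarize` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_impact(path: str = "_meta/impact.json") -> dict:
    """Load the raw impact metrics (assumes a top-level JSON object)."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(metrics: dict) -> list:
    """Return one 'key: value' line per top-level key, in sorted key order.

    Lists and nested objects are summarized by their length rather than
    printed in full.
    """
    lines = []
    for key, value in sorted(metrics.items()):
        if isinstance(value, (list, dict)):
            lines.append(f"{key}: {len(value)} entries")
        else:
            lines.append(f"{key}: {value}")
    return lines
```

Running `print("\n".join(summarize(load_impact())))` from the repo root lists whatever top-level metrics the file currently contains.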

Contributing

This is a private repository. To suggest changes, open an issue or submit a pull request.

License

Private — not for redistribution.
