FeatureBased Skill Generator Agent

Turn any Java repo into AI-readable instruction files — once — so GitHub Copilot and Claude answer feature questions correctly the first time, every time, without burning your premium-request budget.

Status — internal beta. Use freely on real Java repos; expect rough edges. See docs/release-readiness-checklist.md for what must land before this is recommended for unsupervised enterprise-wide rollout (notably: domain-safe source matching for duplicate class names, chunk-and-merge for very large domains, stronger end-to-end verification, and multi-repo orchestration).

The problem this solves

In a typical enterprise Java shop:

A developer has ~300 GitHub Copilot premium requests per month.
The repo has 50–200 business features spread across Controller → Service → DAO → DB.
Every time the developer asks Copilot "how does the Invoice Compare feature work?" or "add a new status to File Delivery", Copilot has no persistent context. The dev re-types it. Or Copilot guesses, gets it wrong, and the dev iterates — burning premium requests on inaccurate answers.

Across many features × many developers, this is a major productivity tax. Most of those premium calls are spent re-explaining the same domain knowledge over and over.

Skills — small, accurate, AI-readable instruction files (one SKILL.md per business feature, committed to your repo) — solve this. Once a skill exists, Copilot and Claude read it automatically and start every conversation with accurate feature context. No re-explaining. Fewer iterations. Premium requests go further.

This repo contains the agent that generates and maintains those skill files for you.

What you get

Without skills	With skills (this agent)
Copilot re-discovers your domain on every prompt	Copilot starts with the feature's full context already loaded
5–8 premium calls per feature question (back-and-forth)	1 premium call, correct answer first time
New hires take weeks to learn each feature	New hires read the SKILL.md and start contributing
Copilot hallucinates status enums, wrong endpoint paths, wrong DTO fields	Copilot cites the actual `ClassName.methodName()` for every rule
You re-explain the FileDelivery state machine to Copilot 47 times a month	You explain it once — to the agent, which writes the skill

How it works

The agent is host-agent-driven: the Python tool walks the repo, builds prompts, and parses responses — but it never makes outbound API calls. The LLM reasoning happens inside whatever AI session you already use (Claude Code, Codex, GitHub Copilot Chat, Claude Cowork), so it costs nothing beyond the subscription you already pay for.

Each LLM-dependent stage has two halves. The *-emit command writes a prompt file; you paste that prompt into your AI session and save the response; the *-ingest command then turns the saved response into the canonical artifact.
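The two halves can be pictured as a pair of file-based functions. This is a minimal sketch, not the tool's actual code: the names `emit_prompt` and `ingest_response`, the prompt wording, and the assumption that the reply carries one fenced JSON block are all illustrative.

```python
import json
import re
import tempfile
from pathlib import Path

FENCE = "`" * 3  # built at runtime so the example nests cleanly in Markdown


def emit_prompt(index: dict, prompt_path: Path) -> None:
    """Emit half: write a prompt file for a human to paste into an AI session."""
    body = (
        "Group these classes into business domains.\n"
        "Reply with one fenced JSON block.\n\n"
        + json.dumps(index, indent=2)
    )
    prompt_path.write_text(body, encoding="utf-8")


def ingest_response(response_path: Path) -> dict:
    """Ingest half: pull the first fenced JSON block out of the saved reply."""
    text = response_path.read_text(encoding="utf-8")
    pattern = re.escape(FENCE) + r"(?:json)?\s*\n(.*?)\n" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError(f"no fenced JSON block found in {response_path}")
    return json.loads(match.group(1))


def round_trip_demo() -> dict:
    """Round trip with a hand-written reply standing in for the AI session."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        emit_prompt({"classes": ["OrderController", "OrderService"]},
                    work / "plan-prompt.md")
        reply = (
            "Here is the plan.\n"
            f'{FENCE}json\n{{"domains": [{{"id": "order-management"}}]}}\n{FENCE}\n'
        )
        (work / "plan-response.md").write_text(reply, encoding="utf-8")
        return ingest_response(work / "plan-response.md")


print(round_trip_demo())
```

Nothing in the sketch opens a socket: the only hand-off between the two halves is a pair of files on disk, which is what lets the reasoning step run in any AI session.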

┌──────────────────────────────────────────────────────────────────────┐
│                        FIRST RUN (one-time)                          │
│                                                                      │
│  Stage 1: Crawl            (zero LLM turns — pure local parsing)     │
│  Stage 2: Plan             (plan-emit  →  AI session  →  plan-ingest)│
│  Stage 3: Generate         (generate-emit  →  AI session  →  ingest) │
│  Stage 4: Link             (link-emit  →  AI session  →  link-ingest)│
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
                  ┌─────────────────────────────────┐
                  │  .github/skills/                │
                  │  ├── order-management/SKILL.md  │
                  │  ├── consumer-management/...    │
                  │  └── delivery-management/...    │
                  └─────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│                  PHASE 2 — INCREMENTAL UPDATES                       │
│                                                                      │
│  On every PR merge or local change:                                  │
│    git diff → map changed files to feature → update-emit             │
│    → AI session generates updated SKILL.md → update-ingest           │
│    → bump version → commit                                           │
└──────────────────────────────────────────────────────────────────────┘

Why this shape? The Python tool is fully deterministic — file walking, parsing, source assembly, response application. The host AI agent does the reasoning. Nothing in this repo talks to the network; nothing requires an API key.

For a visual sequence diagram of the IDE-side developer experience — from typing "analyze this project" through committed SKILL.md files — see docs/agent-invocation-flow.md. For enterprise rollout guidance across VS Code, IntelliJ, Copilot, Claude, and Codex, see docs/enterprise-agent-selection-guide.md.

Enterprise agent selection

The tool itself has no model setting. The selected host AI session supplies the reasoning, so teams can run the same emit/ingest workflow from Claude, Codex, Copilot Chat, or another approved IDE assistant.

For most enterprise teams:

Workload	Recommended host session
First run on an unknown or legacy repo	Strongest available reasoning session, such as Claude Opus-class or Codex high-reasoning
First run on a clean Spring Boot service	Claude Sonnet-class, Codex, or another capable approved session
Incremental update for one reviewed feature	Sonnet-class, Codex, or Copilot Chat
Everyday feature questions after skills are committed	GitHub Copilot Chat in VS Code/IntelliJ, Claude, or Codex reading `.github/skills`

The recommended operating model is centralized: repo owners or feature leads spend the initial generation turns once, commit the generated skills, and let every developer benefit from the shared feature context during daily work.

Quick start

Prerequisites

Python 3.10+ (python3 --version)
An AI session you already use — any of: Claude Code, GitHub Copilot Chat, Codex, Claude Cowork
The Java repo you want to document, checked out locally

No API keys. No third-party Python dependencies. No outbound network calls.

Install

git clone https://github.com/bipinhcs11/Customized_Agent_For_Developer.git
cd Customized_Agent_For_Developer
# That's it.

Not sure if it'll work on your repo? Run `doctor` first

python3 -m tools.skill_generator.cli doctor /path/to/your/repo

A 30-second look at your repo before you commit to anything. Shows class count, detected framework, oversized files, and how long the full pipeline will take. No AI turns, nothing written to disk. See docs/skill-gen-doctor.md for an example.
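To give a feel for what a read-only pre-flight scan involves, here is a small sketch. It is an assumption-laden illustration rather than the `doctor` implementation: the function name `preflight`, the 24 KB threshold, and the shape of the returned report are hypothetical.

```python
import tempfile
from pathlib import Path

OVERSIZED_BYTES = 24 * 1024  # assumed threshold, for this sketch only


def preflight(repo: Path) -> dict:
    """Read-only scan: count Java files and list those above the threshold."""
    java_files = sorted(repo.rglob("*.java"))
    oversized = [
        path.relative_to(repo).as_posix()
        for path in java_files
        if path.stat().st_size > OVERSIZED_BYTES
    ]
    return {"java_files": len(java_files), "oversized": oversized}


def preflight_demo() -> dict:
    """Build a throwaway repo with one small and one large file, then scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "src").mkdir()
        (repo / "src" / "Small.java").write_text("class Small {}\n", encoding="utf-8")
        (repo / "src" / "Huge.java").write_text("// pad\n" * 5000, encoding="utf-8")
        return preflight(repo)


print(preflight_demo())
```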

Run the pipeline

Each LLM stage is two commands with a paste in between. Stage 1 (Crawl) has no LLM step — it just walks the repo.

TARGET=/path/to/your/java/repo

# Stage 1 — fast, deterministic, no AI
python3 -m tools.skill_generator.cli crawl "$TARGET" \
    --output "$TARGET/.skill-gen/.index.json"

# Stage 2 — Plan
python3 -m tools.skill_generator.cli plan-emit "$TARGET/.skill-gen/.index.json"
# Open .skill-gen/plan-prompt.md, paste it into your AI session.
# Save the response as .skill-gen/plan-response.md.
python3 -m tools.skill_generator.cli plan-ingest "$TARGET/.skill-gen/plan-response.md"

# Stage 3 — Generate (one prompt per domain)
python3 -m tools.skill_generator.cli generate-emit \
    "$TARGET/.skill-gen/.plan.json" --repo "$TARGET"
# For each .skill-gen/.generate-prompts/<domain>.md, paste into the AI session.
# Save each response as .skill-gen/.generate-responses/<domain>.md.
python3 -m tools.skill_generator.cli generate-ingest \
    "$TARGET/.skill-gen/.plan.json" --repo "$TARGET"

# Stage 4 — Link
python3 -m tools.skill_generator.cli link-emit "$TARGET/.github/skills"
# Paste link-prompt.md, save response as link-response.md.
python3 -m tools.skill_generator.cli link-ingest \
    "$TARGET/.skill-gen/link-response.md" --skills-dir "$TARGET/.github/skills"

The final SKILL.md files land in <your-repo>/.github/skills/<domain-id>/SKILL.md. Intermediate prompts and responses live under <your-repo>/.skill-gen/.

Why the emit/ingest dance?

The whole point of the agent is to stop spending premium-request budget. Calling Anthropic from a Python script would mean adding another cost line that competes with your subscription. By emitting prompt files and ingesting responses, the LLM turns happen inside your existing Claude Code / Copilot / Codex session — no separate API spend, no separate key to manage.

Phase 2 — incremental updates

After the first run commits the skills to your repo, refresh them when code changes:

python3 -m tools.skill_generator.cli update-emit --repo .
# Paste each .skill-gen/.update-prompts/<feature>.md into your AI session.
# Save each response as .skill-gen/.update-responses/<feature>.md.
python3 -m tools.skill_generator.cli update-ingest --repo . --commit

The same emit/ingest pattern; the same zero-API-call guarantee.
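The "map changed files to feature" step can be sketched as a prefix lookup. This is only an illustration: the `FEATURE_ROOTS` mapping, the package paths, and both function names are hypothetical, and the real tool may map files to features differently. The `git diff --name-only` call is standard git.

```python
import subprocess
from collections import defaultdict


def changed_files(repo: str, base: str = "HEAD~1") -> list[str]:
    """List paths changed between `base` and HEAD using plain git."""
    result = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", base, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def features_for(paths: list[str],
                 feature_roots: dict[str, list[str]]) -> dict[str, list[str]]:
    """Group changed paths under every feature whose root prefix contains them."""
    touched: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        for feature, roots in feature_roots.items():
            if any(path.startswith(root) for root in roots):
                touched[feature].append(path)
    return dict(touched)


# Hypothetical layout: feature id -> source directories that belong to it.
FEATURE_ROOTS = {
    "file-delivery": ["src/main/java/com/example/filedelivery/"],
    "invoice-compare": ["src/main/java/com/example/invoice/"],
}

EXAMPLE = features_for(
    [
        "src/main/java/com/example/filedelivery/FileDeliveryService.java",
        "README.md",
    ],
    FEATURE_ROOTS,
)
print(EXAMPLE)
```

Files that match no feature root (here `README.md`) drop out, so only features with real source changes would get an update prompt.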

What a generated SKILL.md looks like

Here's a fragment from the Data Flow section of consumer-management/SKILL.md, generated from the FTGO microservices reference application:

POST /consumers
   |
   v
ConsumerController.create(CreateConsumerRequest)
   |   request.getName() -> PersonName
   v
ConsumerService.create(name)                        @Transactional
   |
   |-- Consumer.create(name)                         <- builds aggregate + ConsumerCreated event list
   |
   |-- consumerRepository.save(rwe.result)           -> Consumer DB (JPA, MySQL)
   |
   |__ domainEventPublisher.publish(Consumer.class, id, rwe.events)
              -> Eventuate Tram outbox -> Kafka topic net.chrisrichardson...Consumer
              + emit ConsumerCreated domain event

@KafkaListener (Tram saga dispatch on channel "consumerService")
   |
   v
ConsumerServiceCommandHandlers.commandHandlers()
   |-- onMessage(ValidateOrderByConsumer.class)
           |__ ConsumerService.validateOrderForConsumer(consumerId, orderTotal)
                   |__ Consumer.validateOrderByConsumer(orderTotal)
                           <- spend rule on the aggregate; throws ConsumerVerificationFailedException

The skill captures transaction boundaries (@Transactional), DB destinations (-> Consumer DB), Kafka topic names, exception flow, and side effects (+ emit ConsumerCreated); where a feature uses them, it records async semantics in the same notation (@Async, .get() <- blocks). When Copilot reads this, it knows enough to safely modify validateOrderForConsumer() without breaking the saga reply contract.

See verification-output/ftgo-skills/ in this repo for two complete SKILL.mds generated from the real FTGO codebase — one for consumer-management (19 classes) and one for accounting-authorization (27 classes), with cross-domain saga relationships linked between them.

Verified on a real microservices repo

This agent was end-to-end verified against microservices-patterns/ftgo-application — Chris Richardson's reference Spring Boot microservices app.

Metric	Value
Classes parsed	358
Lines of code analyzed	15,714
Microservice modules	12
Domains identified by Stage 2	9 (one per microservice, mapped 1:1)
Confidence (most domains)	HIGH
Host-agent turns total	~11 (1 plan + 9 generate + 1 link)
Schema conformance	12/12 frontmatter fields, 12/12 body sections, 0 Java code blocks in body
Warnings	0

Full details in verification-output/VERIFICATION_REPORT.md. The verification used an earlier API-call architecture; the prompts and outputs are unchanged — only the delivery mechanism (host agent vs. API) is different.
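A schema-conformance check of the kind summarized in the table can be approximated in a few lines. This is a sketch under stated assumptions: it treats frontmatter as a `---` delimited block of `key: value` lines and body sections as `## ` headings, and the sample field names are made up. The real 12-field schema is defined by the project, not by this example.

```python
FENCE = "`" * 3  # avoids a literal fence inside this example


def conformance(skill_text: str) -> dict:
    """Count frontmatter keys, level-2 body sections, and Java code fences."""
    lines = skill_text.splitlines()
    if not lines or lines[0] != "---":
        raise ValueError("SKILL.md must open with a '---' frontmatter line")
    try:
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("frontmatter is never closed") from None
    fields = [
        line.split(":", 1)[0]
        for line in lines[1:end]
        if line[:1].isalpha() and ":" in line
    ]
    body = lines[end + 1:]
    return {
        "frontmatter_fields": len(fields),
        "body_sections": sum(1 for line in body if line.startswith("## ")),
        "java_code_blocks": sum(1 for line in body if line.startswith(FENCE + "java")),
    }


# Tiny hypothetical skill file; a real one carries more fields and sections.
SAMPLE = "\n".join([
    "---",
    "name: consumer-management",
    "version: 1",
    "---",
    "## Data Flow",
    "POST /consumers -> ConsumerController.create(...)",
    "## Business Rules",
    "Consumer.validateOrderByConsumer(orderTotal) enforces the spend rule.",
])

print(conformance(SAMPLE))
```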

Supported Java flavors

The agent works on any flavor of Java repo, not just modern Spring Boot. It auto-detects the flavor and writes skills that describe whatever the target repo actually uses:

Flavor	Detected by
Spring Boot 2.x / 3.x	`@SpringBootApplication` + annotation-driven REST
Spring MVC	XML wiring or annotation, no `@SpringBootApplication`
Struts 1 / 2	`struts-config.xml` action mappings
Quarkus	`@Path` annotations without `@RestController`
Spring Batch	`@EnableBatchProcessing` or `<job>` elements
Quartz Scheduler	`quartz*.xml` with cron expressions
Raw servlets	`web.xml` URL patterns
Legacy hybrid	`.sql` stored procedures + `.sh` orchestration + Java
Mixed-stack	Multiple of the above in one repo

For legacy apps, the crawler also reads stored procedures (.sql), shell scripts (.sh), Flyway/Liquibase migrations, and Spring Batch job XML — so a feature that lives half in Java and half in a stored proc is documented as one cohesive skill.
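The detection markers in the table translate naturally into simple checks. The following sketch covers only a few rows and is not the crawler's real logic: the function name, the flavor labels, and the choice to work on an in-memory path-to-text map are assumptions made for illustration.

```python
import re


def detect_flavors(files: dict[str, str]) -> list[str]:
    """Return every flavor whose marker appears in a path -> text map."""
    names = [path.rsplit("/", 1)[-1] for path in files]
    java = "\n".join(text for path, text in files.items() if path.endswith(".java"))
    xml = "\n".join(text for path, text in files.items() if path.endswith(".xml"))

    found = []
    if "@SpringBootApplication" in java:
        found.append("spring-boot")
    if "struts-config.xml" in names:
        found.append("struts")
    # \b keeps @PathVariable and @PathParam from counting as @Path.
    if re.search(r"@Path\b", java) and "@RestController" not in java:
        found.append("quarkus")
    if "@EnableBatchProcessing" in java or re.search(r"<job[\s>]", xml):
        found.append("spring-batch")
    if "web.xml" in names:
        found.append("raw-servlets")
    return found


MIXED = detect_flavors({
    "src/main/java/App.java": "@SpringBootApplication\npublic class App {}",
    "src/main/webapp/WEB-INF/web.xml": "<web-app/>",
})
JAXRS = detect_flavors({"Res.java": '@Path("/x") public class Res {}'})
print(MIXED, JAXRS)
```

Returning a list rather than a single label is what makes the mixed-stack case fall out for free: a repo that trips several markers simply reports several flavors.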

Cost model

The pipeline is free to operate — every LLM turn runs inside a session you already pay for.

Stage	Host-agent turns	What happens
Crawl	0	Pure local parsing
Plan	1	One paste-and-respond cycle
Generate	1 per detected domain	Each skill is one focused turn
Link	1	One turn covers all cross-references
First run total	~12–15 turns for a 10-domain repo	Roughly linear in domain count
Phase 2 update	1–2 turns per PR	Only changed features re-generate

Compare to the alternative without skills: a developer asks 5 feature questions a day × 200 working days × 10 developers × ~3 premium calls per question due to context misses = ~30,000 premium requests/year per team, of which roughly 20,000 are retries spent on context re-discovery. With skills in place, those same 5 questions a day land correctly on the first try (about 10,000 requests/year), and the skills themselves cost zero extra subscription dollars to produce.
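The arithmetic behind the table and the comparison above is easy to reproduce. The function names are illustrative; the inputs are the figures quoted in this section, with one premium call per question assumed once skills are in place.

```python
def first_run_turns(domains: int) -> int:
    """Plan (1) + Generate (1 per domain) + Link (1); Crawl needs no turns."""
    return 1 + domains + 1


def yearly_requests(questions_per_day: int, working_days: int,
                    developers: int, calls_per_question: int) -> int:
    """Premium requests a team spends per year on feature questions."""
    return questions_per_day * working_days * developers * calls_per_question


without_skills = yearly_requests(5, 200, 10, 3)
with_skills = yearly_requests(5, 200, 10, 1)
saved = without_skills - with_skills

print(first_run_turns(9))   # 11, matching the FTGO run (1 plan + 9 generate + 1 link)
print(first_run_turns(10))  # 12, the low end of the 10-domain estimate
print(without_skills, with_skills, saved)  # 30000 10000 20000
```

The upper end of the "~12–15 turns" estimate would come from retries or re-pasted prompts, which this formula deliberately leaves out.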

Project layout

.
├── README.md                          ← This file
├── AGENT.md                           ← Full pipeline specification
├── CLAUDE.md                          ← Cowork / Claude Code project config
├── OPUS_PROMPT.md                     ← Original problem statement
├── .github/
│   └── copilot-instructions.md        ← Tells Copilot to read skills before answering
│
├── tools/
│   └── skill_generator/               ← THE AGENT (Python, stdlib only)
│       ├── cli.py                     ← CLI entry point (emit/ingest subcommands)
│       ├── crawler.py                 ← Stage 1 (zero LLM turns)
│       ├── prompts.py                 ← All prompt strings (single source of truth)
│       ├── plan.py                    ← Stage 2 (emit_prompt / ingest_response)
│       ├── generate.py                ← Stage 3 (per-domain emit / ingest)
│       ├── link.py                    ← Stage 4 (emit_prompt / ingest_response)
│       ├── update.py                  ← Phase 2 incremental updater
│       └── README.md                  ← Internal module docs
│
├── skills/                            ← Reference skills (the quality bar)
│   ├── file-delivery/SKILL.md
│   ├── invoice-compare/SKILL.md
│   ├── payment-method-determination/SKILL.md
│   └── skill-generator/
│       └── references/
│           └── data-flow-example.md   ← Pattern for the rich Data Flow section
│
├── examples/                          ← Reference Java code (illustrative only)
│   ├── file-delivery/                 ← Spring Boot controller/service/dao/sql
│   ├── invoice-compare/
│   ├── payment-method-determination/
│   └── legacy-forward-generator/      ← Historical: old code-gen templates
│
├── verification-output/               ← Proof the agent works end-to-end
│   ├── VERIFICATION_REPORT.md
│   ├── ftgo-crawl-index.json
│   ├── ftgo-plan.json
│   └── ftgo-skills/
│       ├── consumer-management-SKILL.md
│       ├── consumer-management-SKILL-v2-rich-dataflow.md
│       ├── accounting-authorization-SKILL.md
│       └── cross-domain-links.json
│
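├── docs/
│   ├── release-readiness-checklist.md
│   ├── agent-invocation-flow.md
│   ├── enterprise-agent-selection-guide.md
│   ├── skill-gen-doctor.md
│   └── design-history/                ← Design notes for contributors
│       └── CODEX_REVIEW_PROMPT.md
├── tests/
└── LICENSE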

How GitHub Copilot uses the generated skills

After your first run, your target repo has a .github/skills/<domain-id>/SKILL.md for each feature and a .github/copilot-instructions.md that tells Copilot to read those skills before answering.

Example developer interaction:

Developer: "Add a new ARCHIVED status to File Delivery"

Copilot (reading file-delivery/SKILL.md automatically): I see the current FileDeliveryStatus enum has PENDING, SCANNING, READY, DELIVERED, ACKNOWLEDGED, SCAN_FAILED, EXPIRED, DELETED. The skill notes that isDeliverable() returns true only for READY, and isTerminal() returns true for SCAN_FAILED, EXPIRED, DELETED. To add ARCHIVED:

Add to FileDeliveryStatus enum

Add to isTerminal() if archived files should be excluded from delivery flow

Add a migration: ALTER TABLE file_delivery ALTER COLUMN status SET DEFAULT 'PENDING'; (the VARCHAR allows the new value)

Update idx_file_delivery_status if filtering archived rows needs an index

That answer is correct, complete, and arrived in one prompt — because Copilot read the skill before responding. Without the skill, that same answer takes 5–8 back-and-forth prompts to assemble.

Configuration

The agent's defaults work for most repos. Override via CLI flags:

Flag	Default	What it does
`--output` / `-o`	varies by subcommand	Where to write the prompt / artifact
`--output-dir`	`<repo>/.github/skills/`	Where SKILL.mds land (generate-ingest)
`--prompts-dir`	`<repo>/.skill-gen/.generate-prompts/`	Where per-domain emit prompts land
`--responses-dir`	`<repo>/.skill-gen/.generate-responses/`	Where to look for per-domain responses
`--exclude`	(see `crawler.py`)	Additional directories to skip in crawl
`--skip-tests`	off	Exclude `*Test.java` and `/test/` paths
`--force`	off	Overwrite an existing SKILL.md on ingest
`--only DOMAIN_ID`	(all)	Restrict emit/ingest to one domain
`--commit`	off	(update-ingest) git-add + commit the refreshed SKILL.mds
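As an illustration of the `--skip-tests` rule from the table (exclude `*Test.java` and `/test/` paths), a path filter could look like the sketch below. The function name is hypothetical and the crawler's actual matching may differ in detail.

```python
from fnmatch import fnmatchcase


def is_test_path(path: str) -> bool:
    """True for files named *Test.java or located under a test/ directory."""
    normalized = path.replace("\\", "/")
    name = normalized.rsplit("/", 1)[-1]
    # Prefix a slash so a leading "test/" directory is caught as well.
    return fnmatchcase(name, "*Test.java") or "/test/" in f"/{normalized}"


for candidate in [
    "src/main/java/OrderServiceTest.java",
    "src/test/java/Helper.java",
    "src/main/java/OrderService.java",
]:
    print(candidate, is_test_path(candidate))
```

`fnmatchcase` is used instead of `fnmatch` so the match stays case-sensitive on every platform; a class such as `Latest.java` is therefore not mistaken for a test.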

What this is NOT

So nobody starts with the wrong expectation:

Not a forward code generator. "Given a feature name, write Controller + Service + DAO + DDL" is not the job. The agent reads existing code and writes instruction files about it.
Not a documentation generator for human readers. The output is AI-readable. Tables and cited rules are tuned for AI consumption, not human reading flow.
Not tied to specific business domains. The three sample skills in skills/ (File Delivery / Invoice Compare / Payment Method Determination) are illustrations of the format, not the agent's deliverable set. The agent ships for whatever features exist in whatever repo you point it at.

Roadmap

What's in v0.3 (now):

All four pipeline stages working end-to-end via emit/ingest
Phase 2 incremental updater (git-diff-based)
Crawler handles Java + XML + properties + YAML + SQL + shell
Python CLI with doctor / crawl / plan-emit / plan-ingest / generate-emit / generate-ingest / link-emit / link-ingest / update-emit / update-ingest
Zero outbound network calls; no API key required
Verified end-to-end against FTGO microservices reference (under earlier API architecture; prompts unchanged)

What's coming next:

Multi-repo orchestration — config-driven runs across 50+ enterprise repos in one pass
Chunk-and-merge for very large domains — Stage 3 currently truncates domains > 24KB of source; real chunk-merge needs implementation
Real Java AST parsing — optional javalang dependency to replace the regex parser for edge cases (Lombok, annotation processors)
Web UI for plan review — instead of editing plan.json by hand, click-to-approve domains in a browser before Stage 3 runs
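For the chunk-and-merge item, one possible shape of the chunking half is greedy packing under a byte budget. This is a speculative sketch of an unimplemented roadmap item: the function name is hypothetical, the 24 KB budget is borrowed from the truncation limit mentioned in the list, and the merge step (combining per-chunk outputs into one SKILL.md) is not shown.

```python
def chunk_sources(sources: dict[str, str], budget: int = 24 * 1024) -> list[list[str]]:
    """Greedily pack file names into chunks whose UTF-8 size fits the budget.

    A single file larger than the budget still gets a chunk of its own.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, text in sources.items():
        size = len(text.encode("utf-8"))
        if current and used + size > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        chunks.append(current)
    return chunks


CHUNKS = chunk_sources({
    "A.java": "x" * 10_000,
    "B.java": "x" * 10_000,
    "C.java": "x" * 10_000,
})
print(CHUNKS)
```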

FAQ

Does this require an Anthropic API key? No. The tool never makes outbound network calls. Every LLM turn happens inside an AI session you already use (Claude Code, GitHub Copilot Chat, Codex, Claude Cowork). The cost to operate the agent is your normal subscription — nothing extra.

Will this work on my legacy monolith with stored procedures and shell scripts? Yes — the crawler reads .sql, .sh, Flyway/Liquibase migrations, and Spring Batch job XML alongside Java. The generated SKILL.md describes whatever the target repo actually uses.

Does it generate Java code? No. The agent emits SKILL.md instruction files. Java code generation tools can consume these skills as input (and produce better code because of it), but that's downstream of this agent's job.

What if my Java is parsed badly? The crawler is regex-based, which is fast and dependency-free but has edge cases (Lombok-generated code, exotic generics). For most repos it works fine. If accuracy matters more than speed, a future version will use javalang for full AST parsing.
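The Lombok edge case is easy to see with a toy regex parser. The two patterns below are simplified stand-ins written for this example, not the crawler's real expressions: they find the class in both sources, but the getter that Lombok would generate from `@Data` never appears in the text, so a text-based parser cannot report it.

```python
import re

# Simplified patterns for illustration only.
CLASS_RE = re.compile(
    r"^\s*(?:(?:public|abstract|final)\s+)*(?:class|interface|enum)\s+(\w+)",
    re.MULTILINE,
)
METHOD_RE = re.compile(
    r"^\s*(?:public|protected|private)\s+[\w<>\[\], ?]+\s+(\w+)\s*\(",
    re.MULTILINE,
)

EXPLICIT = (
    "public class Consumer {\n"
    "    private String name;\n"
    "    public String getName() { return name; }\n"
    "}\n"
)

LOMBOK = (
    "@Data\n"
    "public class Consumer {\n"
    "    private String name;\n"
    "}\n"
)

for label, source in [("explicit", EXPLICIT), ("lombok", LOMBOK)]:
    print(label, CLASS_RE.findall(source), METHOD_RE.findall(source))
```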

How do I review the plan before Stage 3 runs? You always can: the emit/ingest split makes plan review the default. After plan-ingest writes .skill-gen/.plan.json, edit the domains[] array (remove domains you don't want, rename ids, merge domains) before running generate-emit. Stage 3 does not start until you invoke generate-emit yourself, so the review window cannot be skipped by accident.
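The same review can be scripted instead of done by hand. The sketch below assumes only what this answer states, namely a top-level `domains` array whose entries carry an `id`; the `classes` key and the function name are hypothetical placeholders for whatever else the plan file contains.

```python
import json


def review_plan(plan: dict, drop: set[str], rename: dict[str, str]) -> dict:
    """Return a copy of the plan with unwanted domains removed and ids renamed."""
    kept = []
    for domain in plan["domains"]:
        if domain["id"] in drop:
            continue
        kept.append({**domain, "id": rename.get(domain["id"], domain["id"])})
    return {**plan, "domains": kept}


# Stand-in for the parsed plan file; the real file has more detail per domain.
PLAN = {"domains": [
    {"id": "order-management", "classes": ["OrderService"]},
    {"id": "misc-utils", "classes": ["StringHelper"]},
    {"id": "consumer", "classes": ["ConsumerService"]},
]}

REVIEWED = review_plan(
    PLAN,
    drop={"misc-utils"},
    rename={"consumer": "consumer-management"},
)
print(json.dumps(REVIEWED, indent=2))
```

The function returns a new dictionary and leaves its input untouched, so the original plan can be kept alongside the reviewed one for comparison.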

What if my repo has 5000 classes? The Plan stage's prompt scales with index size. At ~5000 classes the index is ~500KB — still within Claude's context window but worth chunking. Workaround for now: run the crawler on subdirectories separately and merge plans manually. Multi-pass planning is on the roadmap.

Can I customize the SKILL.md format? The format is defined in tools/skill_generator/prompts.py. Edit STAGE_3_GENERATE_PROMPT to change what sections appear or what each one requires. The default is the artifact-3 standard from this project's design history.

Contributing

The agent's prompts are the load-bearing part. If you find the generated SKILL.mds are missing something, or you have a richer pattern from your own enterprise (like the rich Data Flow style in skills/skill-generator/references/data-flow-example.md), the highest-impact contribution is sharpening the prompts in tools/skill_generator/prompts.py.

The Python is intentionally stdlib-only and ~1500 lines total — easy to audit, modify, and extend.

For the rationale behind the design decisions, see OPUS_PROMPT.md (original problem statement) and docs/design-history/CODEX_REVIEW_PROMPT.md (cross-model design review).

License

MIT — see LICENSE at the repo root.

Acknowledgments

Design informed by Chris Richardson's microservices.io reference apps and patterns. End-to-end verification ran against ftgo-application. The SKILL.md schema and pipeline shape were prototyped across multiple Claude conversations summarized in OPUS_PROMPT.md.
