Commit f3ddf40

Chore/add pcm definition (#15)
* Auto-initialize the GitHub Wiki repo on first run and publish docs/wiki to .wiki.git, fixing prior YAML corruption.
* Rename PCM to `Provenance Composition Model (PCM)`

---------

Co-authored-by: settletop-niles <settletop-niles@users.noreply.github.com>
1 parent 3fe64d0 commit f3ddf40

1 file changed

Lines changed: 63 additions & 0 deletions

@@ -0,0 +1,63 @@
# Provenance Composition Model (PCM): Summary (v1.1.7)

This page gives a concise, user-facing summary of the Provenance Composition Model (PCM) format used by CodeRoot (schema version 1.1.7).

Short definition

- PCM (Provenance Composition Model): a text-free, lines-first event model that records edit operations as newline-delimited JSON records. PCM stores byte ranges, counts, hashes, and provenance hints, not source text.
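
Because PCM records are newline-delimited JSON, a reader can process a journal one line at a time. The following is a minimal sketch, not CodeRoot's own reader: the function name `read_pcm_records` and the sample records are hypothetical, and it assumes each non-empty line holds exactly one JSON object.

```python
import json


def read_pcm_records(lines):
    """Yield one dict per non-empty line of a newline-delimited JSON journal."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        yield record


# Hypothetical two-record journal; real records carry more fields.
sample = (
    '{"record_type": "pcm_event", "op": "insert", "origin": "human"}\n'
    "\n"
    '{"record_type": "journal"}\n'
)
records = list(read_pcm_records(sample.splitlines()))
```

Reading line by line keeps memory use flat for long journals and lets a reader report the exact line of a malformed record.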

Event kinds (high level)

- PCMEvent (record_type: `pcm_event`)
  - Common fields: `schema_version` (1.1.7), `event_id`, `timestamp`/`ts`, `file_path`, `file_id`.
  - Note: the schema marks `actor` (who performed the edit) as required for `pcm_event`; writers should provide it whenever it is available.
  - `op`: edit operation; typical values include `insert`, `replace`, `delete`, `paste`, `ai_apply`, `format`, `rename`, `move`, `tooling`.
  - `origin`: authoritative categories are `human`, `ai`, or `untracked`. (Readers may accept legacy `observed` and coerce it.)
  - `introduced` / `deleted`: size and hash metadata (prefer `lines` over legacy `loc`).
  - `before` / `after`: authoritative byte ranges (`startByte`/`endByte`) with optional advisory line/column coordinates.

- PCMCorrectionRecord (record_type: `correction`): reference to a prior event (`target_event_id`) with optional upgrade hints.

- PCMJournalRecord (record_type: `journal`): workspace/meta records used by the extension for settings and non-edit bookkeeping.
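
The three record kinds above are distinguished by `record_type`, so a reader can route on that field. This sketch is illustrative only: the helper names and sample records are hypothetical, and it assumes a correction refers to its target through a top-level `target_event_id`.

```python
from collections import Counter

# Record types named in this summary.
KNOWN_RECORD_TYPES = {"pcm_event", "correction", "journal"}


def count_record_types(records):
    """Count records per record_type, grouping anything unrecognized."""
    counts = Counter()
    for record in records:
        kind = record.get("record_type")
        counts[kind if kind in KNOWN_RECORD_TYPES else "unknown"] += 1
    return counts


def corrections_by_target(records):
    """Index correction records by the event_id they refer to."""
    index = {}
    for record in records:
        if record.get("record_type") == "correction":
            index.setdefault(record.get("target_event_id"), []).append(record)
    return index


# Hypothetical records, trimmed to the fields used here.
records = [
    {"record_type": "pcm_event", "event_id": "e-1"},
    {"record_type": "correction", "target_event_id": "e-1"},
    {"record_type": "journal"},
    {"record_type": "something_else"},
]
counts = count_record_types(records)
index = corrections_by_target(records)
```

Grouping unrecognized types under `unknown` rather than raising lets a reader keep working when a newer writer adds record kinds.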

Snapshots (what readers see)

- Snapshots are materialized per file as `.pcm.json` under `.coderoot/v1/snapshots/`.
- A minimal snapshot contains: `schema_version`, `file_path`, `file_id`, `updated_at`, `spans` (with `span_id`, `range`, `origin`, `introduced_at`, `last_modified_at`), and a `summary` (including `lines_total`, `lines_by_origin`, `chars_by_origin`, `touched`).

Key rules and guidance (brief)

1. Lines-first, text-free: Writers should emit `introduced.lines` and `chars_total` and a `hash` (e.g., `ws-sha256`) when possible; do not store full file text inside events.
2. Byte ranges are authoritative: `before.range` / `after.range` (`startByte`/`endByte`) are the authoritative anchors for byte-clamp and mapping; line/column values are advisory.
3. Backward compatibility: readers should accept historical schema versions (1.1.3 → 1.1.7) and coerce legacy fields (`loc`, `text_sha`) to the 1.1.7 equivalents where feasible.
4. Origin handling: prefer `human`/`ai`/`untracked`. If older journals include `observed`, treat it as legacy and map it to a best-fit category during migration/reporting.
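
Rules 3 and 4 can be sketched as a small upgrade step applied when reading older events. This is an assumption-laden illustration, not the migration logic from the spec: it assumes `loc` sits inside `introduced`/`deleted` and maps one-to-one onto `lines`, and it maps `observed` to `untracked` purely as a placeholder for "best fit". It leaves `text_sha` untouched because this summary does not name its 1.1.7 equivalent.

```python
import copy


def coerce_legacy(event, observed_as="untracked"):
    """Return an upgraded copy of an event; the input is left unmodified."""
    out = copy.deepcopy(event)
    for key in ("introduced", "deleted"):
        block = out.get(key)
        if isinstance(block, dict) and "loc" in block:
            legacy_count = block.pop("loc")
            # Keep an existing `lines` value if both fields are present.
            block.setdefault("lines", legacy_count)
    if out.get("origin") == "observed":
        # Assumed mapping; the appropriate category may depend on context.
        out["origin"] = observed_as
    return out


# Hypothetical legacy event, trimmed to the fields used here.
legacy = {"origin": "observed", "introduced": {"loc": 3}}
upgraded = coerce_legacy(legacy)
```

Working on a copy keeps the original journal record available for auditing after the upgrade.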

Compact example event (redacted)

```json
{
  "schema_version": "1.1.7",
  "event_id": "e-XXXX",
  "file_path": "src/example.txt",
  "op": "insert",
  "origin": "human",
  "after": { "range": { "startByte": 0, "endByte": 12 } },
  "introduced": { "lines": 2, "chars_total": 12, "hash": { "algo": "ws-sha256", "value": "..." } }
}
```
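
Since byte ranges are the authoritative anchors, a reader may want to sanity-check them before use. The helper below is a hypothetical sketch; it assumes `endByte` is exclusive, which is consistent with the example event (range 0 to 12, `chars_total` 12) but is not stated in this summary.

```python
def range_length(side):
    """Return the byte length of a `before`/`after` range, validating its bounds."""
    rng = side["range"]
    start, end = rng["startByte"], rng["endByte"]
    if not (isinstance(start, int) and isinstance(end, int)):
        raise TypeError("startByte and endByte must be integers")
    if start < 0 or end < start:
        raise ValueError(f"invalid byte range: {start}..{end}")
    return end - start  # assumes endByte is exclusive


# The `after` side of the example event above.
after = {"range": {"startByte": 0, "endByte": 12}}
length = range_length(after)
```

Byte length and `chars_total` match in this example only because the inserted text is presumably single-byte characters; they can differ for multi-byte encodings.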

Safe-use callouts

> Warning: PCM is deliberately text-free for privacy. Avoid adding raw clipboard or source text to events. Keep `privacy.clipboard_store_mode` set to `hash_only` unless you explicitly need text stored and you understand the privacy implications.

How to verify (quick)

1. Perform a small edit (1–2 lines) in a tracked file.
2. Rebuild or let the extension write journals; then run snapshot generation (developer: `npm run force:snapshots` or use extension commands).
3. Inspect `.coderoot/v1/snapshots/<rel>.pcm.json` and confirm `lines_total` and `lines_by_origin` reflect your edits.
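
Step 3 can be partly automated once the snapshot JSON is loaded (for example with `json.load`). The check below is a rough sketch under stated assumptions: `check_snapshot` is a hypothetical helper, `lines_by_origin` is assumed to be a mapping from origin name to line count, and the per-origin counts are assumed to sum to `lines_total`. The sample snapshot values are illustrative, not real output.

```python
# Top-level keys this summary lists for a minimal snapshot.
REQUIRED_KEYS = (
    "schema_version", "file_path", "file_id", "updated_at", "spans", "summary",
)


def check_snapshot(snapshot):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in snapshot]
    summary = snapshot.get("summary", {})
    total = summary.get("lines_total")
    by_origin = summary.get("lines_by_origin", {})
    # Assumed invariant: per-origin line counts add up to the total.
    if total is not None and by_origin and sum(by_origin.values()) != total:
        problems.append(
            f"lines_by_origin sums to {sum(by_origin.values())}, lines_total is {total}"
        )
    return problems


# Illustrative snapshot for a two-line human edit.
snapshot = {
    "schema_version": "1.1.7",
    "file_path": "src/example.txt",
    "file_id": "f-XXXX",
    "updated_at": "2024-01-01T00:00:00Z",
    "spans": [],
    "summary": {
        "lines_total": 2,
        "lines_by_origin": {"human": 2, "ai": 0, "untracked": 0},
    },
}
problems = check_snapshot(snapshot)
```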

Where to read more

- See `docs/spec/1.1.7/PCM_SPEC_1.1.7.md` and the migration notes in `docs/spec/1.1.7/MIGRATION_GUIDE.md` for developer-level details and examples.
