[Labelling Health] Labelling Health Report — 2026-06-03

### Summary

- **Overall status: mixed**
- The prediction pipeline is active and applying label changes, but the correction feedback loop has been silent for more than six weeks: the last correction signal dates from 2026-04-18, 46 days before this report. The `reviewed` and `changed` counts in the daily summaries did not parse into structured fields, which limits automated trend comparison.

---

### Key Metrics

| Metric | Value |
|---|---|
| Discussions reviewed, last 7 days | ~13 (inferred from summary bodies) |
| Label changes applied, last 7 days | ~9 (inferred from summary bodies) |
| Change rate, last 7 days | ~69% (9 / 13) |
| Previous 7-day window data | None available (no older summaries loaded) |
| Correction-intake (Collect Corrections) runs, last 7 days | 30 runs — 14 succeeded, 16 cancelled/failed |
| Predict Labels runs, last 7 days | 4 (all succeeded) |
| Persisted prediction snapshots | 0 |
| Open correction signals | 0 |
| Correction signals created, last 7 days | 0 |
| Correction signals created, last 30 days | 0 |

> **Note:** The `reviewed` and `changed` fields in the daily summaries did not parse into the health data's structured fields (both show `null`). Counts above are manually extracted from summary body text. Trend comparison with the previous 7-day window is not possible — no summaries from that period were included.
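
As a cross-check on the manually extracted figures, the sketch below recomputes the 7-day totals and the change rate from the three per-issue counts listed under "Recent daily summary issues" in this report. The tuple layout and the `change_rate` helper are illustrative assumptions, not part of the health pipeline.

```python
# Rows copied from this report's "Recent daily summary issues" table:
# (issue, date, reviewed, changed). The tuple layout is an assumption
# made for this sketch, not the pipeline's real data model.
summaries = [
    ("#453", "2026-06-02", 11, 7),
    ("#450", "2026-05-31", 1, 1),
    ("#445", "2026-05-27", 1, 1),
]


def change_rate(rows):
    """Return (reviewed, changed, rate); rate is None when nothing was reviewed."""
    reviewed = sum(row[2] for row in rows)
    changed = sum(row[3] for row in rows)
    rate = changed / reviewed if reviewed else None
    return reviewed, changed, rate


reviewed, changed, rate = change_rate(summaries)
print(f"reviewed={reviewed} changed={changed} rate={rate:.0%}")
# prints: reviewed=13 changed=9 rate=69%
```

Returning `None` for an empty window keeps "no data" distinct from a genuine 0% change rate.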

---

### Correction Pressure

No new correction signals have been created in the last 30 days. All 335 historical signals are closed. The newest signal (`#404`) dates from **2026-04-18** — over six weeks ago.

Historical label pressure across all 335 closed signals was concentrated in the ten labels listed in the table below:

<details><summary>Historical label distribution (all closed signals)</summary>

| Label | Signal Count |
|---|---|
| Copilot | 56 |
| Copilot in GitHub | 42 |
| GitHub Education | 38 |
| bug | 38 |
| Other Features and Feedback | 20 |
| question | 14 |
| Apps API and Webhooks | 10 |
| Profile | 10 |
| Mobile | 10 |
| Product Feedback | 9 |

</details>
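
To put the distribution in proportion, the following sketch sums the ten listed counts and computes the share held by the two Copilot labels. It is illustrative only: the counts are copied from the table above, and the shares are relative to the listed counts (247), not to the 335 signals, because the report does not say whether a signal can carry more than one label.

```python
# Counts copied from the "Historical label distribution" table in this report.
label_counts = {
    "Copilot": 56,
    "Copilot in GitHub": 42,
    "GitHub Education": 38,
    "bug": 38,
    "Other Features and Feedback": 20,
    "question": 14,
    "Apps API and Webhooks": 10,
    "Profile": 10,
    "Mobile": 10,
    "Product Feedback": 9,
}

# Total of the listed counts only; this is not the 335-signal total.
listed_total = sum(label_counts.values())

# Combined count for labels whose name starts with "Copilot".
copilot_total = sum(
    count for label, count in label_counts.items() if label.startswith("Copilot")
)

copilot_share = copilot_total / listed_total
print(f"listed={listed_total} copilot={copilot_total} share={copilot_share:.1%}")
# prints: listed=247 copilot=98 share=39.7%
```

The listed counts sum to 247, below the 335 closed signals, so the table does not account for every signal.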

Current snapshot-backed prediction pressure cannot be computed: no prediction snapshots are persisted and no truth diffs exist. This means underprediction and overprediction breakdowns are unavailable for this cycle.

The ~69% label-change rate observed in the last 7 days is notable, but it rests on a small sample: 13 reviewed discussions, 11 of them from a single daily summary (`#453`). Without correction signals or snapshot truth comparisons, it is not possible to determine whether these changes reflect genuine improvement or systematic over-labelling.

---

### Open Instruction Debt

The correction backlog is at zero — all 335 signals are closed and all 5 correction parent intake issues are closed. No new signals have been filed in over 45 days.

This could mean:
1. The labelling system is performing well and human reviewers are not finding errors worth flagging.
2. The correction collection pipeline (Collect Corrections) is not surfacing errors — only 14 of 30 runs in the last 7 days succeeded; 16 were cancelled or failed.
3. Discussions are not being reviewed by humans at a rate that generates correction volume.

An empty backlog is not a stale one, but the **absence of new signals for more than 45 days** is itself worth investigating.
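
The length of the silence can be checked with plain date arithmetic. This is a minimal sketch using the two dates stated in this report (last signal on 2026-04-18, report dated 2026-06-03); the `days_silent` helper is a hypothetical name, not an existing pipeline function.

```python
from datetime import date


def days_silent(last_signal: date, report_date: date) -> int:
    """Number of whole days between the last correction signal and the report."""
    return (report_date - last_signal).days


gap = days_silent(date(2026, 4, 18), date(2026, 6, 3))
weeks, extra_days = divmod(gap, 7)
print(f"{gap} days ({weeks} weeks, {extra_days} days)")
# prints: 46 days (6 weeks, 4 days)
```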

---

### Recommendations

1. **Investigate why daily summary `reviewed`/`changed` counts do not parse into structured fields.** Three consecutive summaries returned `null` for both fields despite having numeric data in their bodies. This blocks automated health trending and should be fixed in the summary parsing step.

2. **Investigate the Collect Corrections run failure rate.** Only 14 of 30 runs in the last 7 days completed successfully; 16 were cancelled or failed. If correction collection is unreliable, new human corrections may be silently dropped.

3. **Verify whether the absence of human correction signals is expected.** If no new correction parent issues have been opened since April 2026, confirm whether the community review process is still active. If it has paused, the health signal from correction pressure will remain artificially zero.

4. **Enable prediction snapshot persistence.** Zero snapshots are stored, which prevents snapshot-backed truth diff analysis. Without this, overprediction and underprediction pressure by label cannot be computed and `adaptation` mode has no deterministic artifact to act on.
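
For recommendation 2, the run counts from Key Metrics translate into success rates as sketched below. The `MIN_SUCCESS_RATE` threshold is a hypothetical value chosen for illustration; the report does not define an acceptable rate.

```python
# Hypothetical alert threshold; not defined anywhere in this report.
MIN_SUCCESS_RATE = 0.90


def success_rate(succeeded: int, total: int) -> float:
    """Fraction of workflow runs that succeeded."""
    if total <= 0:
        raise ValueError("total must be positive")
    return succeeded / total


# Run counts for the last 7 days, taken from the Key Metrics table.
runs = {
    "Collect Corrections": (14, 30),
    "Predict Labels": (4, 4),
}

for workflow, (succeeded, total) in runs.items():
    rate = success_rate(succeeded, total)
    status = "ok" if rate >= MIN_SUCCESS_RATE else "investigate"
    print(f"{workflow}: {rate:.1%} ({status})")
# prints:
# Collect Corrections: 46.7% (investigate)
# Predict Labels: 100.0% (ok)
```

Under this assumed threshold, fewer than half of the Collect Corrections runs completed, which is what makes silently dropped corrections plausible.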

---

<details><summary>Recent daily summary issues</summary>

| Issue | Date | Reviewed (body) | Changed (body) |
|---|---|---|---|
| `#453` | 2026-06-02 | 11 | 7 |
| `#450` | 2026-05-31 | 1 | 1 |
| `#445` | 2026-05-27 | 1 | 1 |

</details>

<details><summary>Open correction signal breakdown</summary>

No open correction signals. All 335 signals are closed. Last signal (`#404`) was closed in April 2026.

</details>

<details><summary>Recent workflow run references</summary>

| Workflow | Run | Status |
|---|---|---|
| Predict Labels | [§6](https://github.com/githubnext/aw-community-ops/actions/runs/26803918136) | success |
| Collect Corrections | [§40](https://github.com/githubnext/aw-community-ops/actions/runs/26804162279) | success |
| Review Health | [§8](https://github.com/githubnext/aw-community-ops/actions/runs/26864197359) | in_progress |

</details>

### References

- [§6](https://github.com/githubnext/aw-community-ops/actions/runs/26803918136) — Predict Labels, most recent run
- [§40](https://github.com/githubnext/aw-community-ops/actions/runs/26804162279) — Collect Corrections, most recent successful run
- [§8](https://github.com/githubnext/aw-community-ops/actions/runs/26864197359) — Review Health, current run




> Generated by [Review Health](https://github.com/githubnext/aw-community-ops/actions/runs/26864197359) · sonnet46 1.1M · [◷](https://github.com/search?q=repo%3Agithubnext%2Faw-community-ops+is%3Aissue+%22gh-aw-workflow-call-id%3A+githubnext%2Faw-community-ops%2Freview-health%22&type=issues)
> - [x] expires on Jul 3, 2026, 4:49 AM UTC
