138M ChatML training stack for Apple Silicon using MLX.
Canonical remote branch: main. Historical legacy-default state is preserved at tag archive/origin-master-2026-03-20.
The repo now treats one path as first-class:
clean Quality2K continuation -> explicitly approved pinned checkpoint -> v19 align/full/repair SFT curriculum
Active entrypoints:

- `scripts/build_pretrain_quality2k.py`
- `scripts/run_pretrain_quality2k_terminal.sh`
- `scripts/audit_dense_mainline.py`
- `scripts/review_plain_generation.py`
- `scripts/select_quality2k_checkpoint.py`
- `scripts/pin_quality2k_checkpoint.py`
- `scripts/build_sft_v19_release.py`
- `scripts/run_sft_release.py`
- `scripts/run_sft_release_v19.py`
- `scripts/run_multiturn_coherence_eval.py` (fixed multi-turn transcript suite; see the SFT Runbook)
Research branch entrypoints:

- `scripts/extend_tokenizer_with_vm_tokens.py`
- `scripts/build_vm_pilot_dataset.py`
- `scripts/init_vm_from_dense.py`
- `scripts/extend_tokenizer_with_wasm_tokens.py`
- `scripts/normalize_local_docs.py`
- `scripts/build_wasm_subset_corpus.py`
- `scripts/build_wasm80m_pretrain_corpus.py`
- `scripts/build_wasm80m_sft_corpora.py`
- `scripts/run_wasm80m_pretrain.py`
- `scripts/run_wasm80m_sft.py`
- `scripts/eval_wasm80m.py`
Historical probe-era and experimental material is retained only as archived reference. See Archive Notes.
Historical dense shims:

- `scripts/build_sft_v18_release.py`
- `scripts/run_sft_release_v18.py`
- `scripts/run_sft_release_v18_terminal.sh`

These remain compatibility shims only and are non-authoritative for release decisions.
The WASM80m scripts listed under “Research branch entrypoints” are a parallel tokenizer/model line (see docs/wasm80m_runbook.md); they are not part of finishing the dense 138M v19 chat line.
The only architecture on the release path is the dense 138M line. Experimental dense_vm and dense_wasm80m work are isolated to separate branch/config families and do not share checkpoint compatibility with the dense mainline.
- Preserved raw pretrain base: `checkpoints/pretrain_mlx_138m_chatml/mlx_step_130000.pkl`
- Active continuation config: `configs/pretrain_mlx_138m_quality2k.yaml`
- Active continuation outputs: `checkpoints/pretrain_mlx_138m_quality2k`
- Canonical SFT handoff: `checkpoints/pretrain_mlx_138m_quality2k/selected_for_sft.pkl`
- Active v19 SFT configs: `configs/sft_release_v19_align.yaml`, `configs/sft_release_v19_full.yaml`, `configs/sft_release_v19_repair.yaml`
- Canonical chat/eval starting checkpoint (repair stage, step 50): `checkpoints/sft_release_v19_repair/sft_step_50.pkl`
- Symlink pin for that artifact (used by `scripts/eval_release_candidate.py` by default): `checkpoints/sft_release_v19_repair/selected_for_future_work.pkl`. It must resolve to the same file as `sft_step_50.pkl` when the pin is current; metadata lives in `selected_for_future_work.json`.
- Eval commands, gate CLI, release bundle, and optional MLX smoke tests: docs/eval.md. Pin promotion, `raw_reply` vs `reply`, and `gate_report.json` retention: docs/sft_runbook.md (sections after Candidate Eval).
- Mainline pin metadata for approved selections includes lineage fields: `run_id`, `source_checkpoint`, `selected_step`, `gate_report_path`, `manifest_hash`, and `mainline_valid`.
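The lineage fields above can be sketched as a small metadata record. This is an illustrative sketch only: the dataclass, the loader function, and its validation behavior are assumptions, not the repo's actual pin implementation.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PinMetadata:
    # Lineage fields the README lists for an approved mainline selection.
    run_id: str
    source_checkpoint: str
    selected_step: int
    gate_report_path: str
    manifest_hash: str
    mainline_valid: bool

def load_pin(path: str) -> PinMetadata:
    """Parse pin metadata and refuse records not marked mainline-valid."""
    with open(path) as f:
        meta = PinMetadata(**json.load(f))
    if not meta.mainline_valid:
        raise ValueError(f"{meta.run_id}: pin is not mainline_valid")
    return meta
```

A consumer that reads the pin through a loader like this fails fast on a stale or unapproved selection instead of silently training from it.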
```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .
PYTHONPATH=src python scripts/setup_verification.py
```

Build the curated continuation corpus:
```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/build_pretrain_quality2k.py
```

The active 138M continuation runtime contract is:

- context: 2048 tokens
- dropout: 0.0
- compile: true
- compile_granularity: microbatch
- precision: bfloat16
- micro_batch_size: 1
- grad_accum_steps: 16
- gradient_checkpointing: false
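A run can be checked against the contract above before launch. This is a minimal sketch assuming the YAML config loads to a flat dict with these exact key names; the function name is illustrative.

```python
# Expected 138M continuation runtime contract (values from this README).
RUNTIME_CONTRACT = {
    "context": 2048,
    "dropout": 0.0,
    "compile": True,
    "compile_granularity": "microbatch",
    "precision": "bfloat16",
    "micro_batch_size": 1,
    "grad_accum_steps": 16,
    "gradient_checkpointing": False,
}

def contract_violations(config: dict) -> dict:
    """Return {key: (expected, actual)} for every contract mismatch."""
    return {
        key: (want, config.get(key))
        for key, want in RUNTIME_CONTRACT.items()
        if config.get(key) != want
    }
```

An empty result means the loaded config matches the contract; anything else names the drifting keys.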
Run the continuation from Terminal:
```bash
cd /Users/admin/Downloads/VSCode/AnarchoBot
./scripts/run_pretrain_quality2k_terminal.sh
```

Start a fresh continuation explicitly:
```bash
cd /Users/admin/Downloads/VSCode/AnarchoBot
./scripts/run_pretrain_quality2k_terminal.sh --clean-run
```

Monitor the run:
```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/metrics_window.py \
  --log-dir checkpoints/pretrain_mlx_138m_quality2k/logs \
  --config configs/pretrain_mlx_138m_quality2k.yaml
```

Validate the staged continuation checkpoints before extending the run:
```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/validate_mainline_training.py grad-coverage \
  --config configs/pretrain_mlx_138m_quality2k.yaml \
  --checkpoint checkpoints/pretrain_mlx_138m_chatml/mlx_step_130000.pkl
PYTHONPATH=src python scripts/validate_mainline_training.py checkpoint-diff \
  --config configs/pretrain_mlx_138m_quality2k.yaml \
  --start-checkpoint checkpoints/pretrain_mlx_138m_chatml/mlx_step_130000.pkl \
  --end-checkpoint checkpoints/pretrain_mlx_138m_quality2k/mlx_step_11000.pkl
```

For the completed 12000-step continuation run, the preserved candidate pool is steps 8000, 9000, 10000, 11000, and 12000; earlier checkpoints rotated out under `ckpt_keep: 5`.
Select the checkpoint with the deterministic continuation handoff rule:

```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/select_quality2k_checkpoint.py \
  --manifest examples/quality2k_selection_manifest.json \
  --print-pin-command
```

The selector uses held-out perplexity with an earliest-step tie-break, and blocks candidates only for checkpoint-diff failure, non-finite or missing perplexity, or catastrophic plain-generation regression versus the base review.
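The selection rule can be sketched as a filter plus a lexicographic minimum. The candidate field names here (`diff_ok`, `catastrophic_regression`, and so on) are illustrative assumptions, not the selector's real schema:

```python
import math

def select_checkpoint(candidates: list) -> dict:
    """Pick the lowest held-out perplexity; the earliest step breaks ties.

    Candidates are blocked for checkpoint-diff failure, non-finite or
    missing perplexity, or catastrophic plain-generation regression.
    """
    eligible = [
        c for c in candidates
        if c["diff_ok"]
        and not c["catastrophic_regression"]
        and c["perplexity"] is not None
        and math.isfinite(c["perplexity"])
    ]
    if not eligible:
        raise RuntimeError("no eligible checkpoint in the candidate pool")
    return min(eligible, key=lambda c: (c["perplexity"], c["step"]))
```

The `(perplexity, step)` key makes the rule deterministic: equal-perplexity candidates resolve to the earliest step.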
Pin the chosen continuation checkpoint only after the clean rerun validations pass:

```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/pin_quality2k_checkpoint.py \
  --checkpoint checkpoints/pretrain_mlx_138m_quality2k/mlx_step_11000.pkl \
  --mainline-valid \
  --artifact-role mainline_candidate \
  --validation-basis "base grad coverage + compile parity passed; checkpoint diff passed; held-out perplexity won preserved 8000-12000 pool; no catastrophic plain-generation regression vs base"
```

Export a Hugging Face token at runtime before rebuilding the canonical natural-chat slice:

```bash
export HF_TOKEN=...
```

Build the v19 SFT corpora:
```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/build_sft_v19_release.py --clean-output
```

The standalone builder writes `reports/sft_v19_release_build/build_summary.json`. The shared runner writes per-run build reports under `reports/sft_v19_release_builds/<run_id>/build_summary.json`.
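For inspecting the per-run reports, a small helper can locate the newest run's summary. This is a sketch under the assumption that each run gets its own directory under the reports root and that directory mtime is a reasonable recency proxy; the function name is illustrative.

```python
import json
from pathlib import Path

def latest_build_summary(root: str = "reports/sft_v19_release_builds") -> dict:
    """Return the newest run's build_summary.json, ordered by directory mtime."""
    runs = sorted(Path(root).iterdir(), key=lambda p: p.stat().st_mtime)
    if not runs:
        raise FileNotFoundError(f"no runs under {root}")
    return json.loads((runs[-1] / "build_summary.json").read_text())
```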
The latest validated v19 run reported these manifest counts:

- align: 3600 examples
- release: 22571 examples
- eval: 1600 examples
- repair: 2912 examples, from 3000 selected repair rows after shard filtering

The shared runner now validates `manifest_examples` against these bands:

- align: 3000-5000
- release: 20000-28000
- eval: >=1280
- repair: 2500-3500
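The band check above can be sketched as a table lookup; the table values come from this README, but the function name and return shape are illustrative, not the shared runner's actual code.

```python
# Acceptance bands for manifest_examples (from this README).
# eval has only a lower bound, encoded as an open upper bound.
BANDS = {
    "align": (3000, 5000),
    "release": (20000, 28000),
    "eval": (1280, None),
    "repair": (2500, 3500),
}

def check_manifest_counts(counts: dict) -> list:
    """Return the splits whose example count falls outside its band."""
    failures = []
    for split, (lo, hi) in BANDS.items():
        n = counts.get(split)
        if n is None or n < lo or (hi is not None and n > hi):
            failures.append(split)
    return failures
```

The reported counts (3600 / 22571 / 1600 / 2912) all sit inside their bands, so a run with those numbers passes this check.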
Run the v19 curriculum:
```bash
cd /Users/admin/Downloads/VSCode/AnarchoBot
PYTHONPATH=src .venv/bin/python scripts/run_sft_release_v19.py
```

Default v19 release controls include:

- dual-track raw/guarded gating
- rewrite-rate cap (`<=0.15` by default)
- one bounded repair extension window (`+25` once) before final failure
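The cap-plus-single-extension control can be sketched as one gating decision. This is an assumed simplification of the runner's behavior; the function and return labels are illustrative.

```python
def gate_decision(rewrite_rate: float, extensions_used: int,
                  cap: float = 0.15, max_extensions: int = 1) -> str:
    """Pass under the rewrite-rate cap; otherwise grant at most one
    bounded repair extension window, then fail permanently."""
    if rewrite_rate <= cap:
        return "pass"
    if extensions_used < max_extensions:
        return "extend"  # one +25-step repair window, per the README default
    return "fail"
```

The point of the design is boundedness: a failing run gets exactly one extra repair window rather than retrying indefinitely.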
`selected_for_sft.pkl` is now blocked from the canonical SFT path unless its sibling metadata file marks it `mainline_valid: true`.
Run the static dense-mainline audit at any time without touching training:
```bash
source .venv/bin/activate
PYTHONPATH=src python scripts/audit_dense_mainline.py \
  --json-output reports/pretrain_quality2k_review/static_dense_audit.json
```

Run the test suite:

```bash
source .venv/bin/activate
pip install pytest
PYTHONPATH=src pytest
```

Optional MLX checkpoint smoke tests (loads weights on GPU; uses `checkpoints/sft_release_v19_repair/sft_step_50.pkl` unless `ANARCHOBOT_CANONICAL_CKPT` is set):

```bash
ANARCHOBOT_RUN_MLX_TESTS=1 PYTHONPATH=src pytest -m mlx_checkpoint tests/test_canonical_checkpoint.py
```

Repo-tracked content is source, prompts, configs, tests, docs, and curated evidence.
Runtime artifacts are intentionally untracked:
- continuation checkpoints
- generated shard directories
- runtime reports
- transient build JSONL/message dumps
Preserved historical evidence lives under `legacy_evidence/`.