SESHAT — Measuring the Information Limit of Linear A

Seshat — Egyptian goddess of writing, knowledge, and measurement.

An honest computational study of Linear A — a measurement, not a decipherment. Linear A is undeciphered and, from the surviving corpus alone, probably undecipherable: ~1.4k mostly 1–2-sign inscriptions, an unknown language, and no bilingual key. No method — AI, quantum, or human — extracts information that isn't there. So SESHAT asks the answerable question instead:

What can be computationally inferred about Linear A, and where is the information-theoretic limit — validated on Linear B, where we know the answer?

Every method is calibrated on Linear B (deciphered Mycenaean Greek). If a technique can't recover what we already know about Linear B, we don't trust it on Linear A. A rigorous negative result is itself the headline.

Results (the honest headline)

| Phase | Question | Result |
|---|---|---|
| 1: Information limit | How much structure does the corpus hold? | Linear A `H(next\|prev) ≈ 3.7 bits`, redundancy 31% vs. 63% for Linear B: far sparser, which quantifies why it resists decipherment |
| 2: GNN sign embeddings | Does pure co-occurrence encode phonology? | On Linear B, recovers vowels (≈36% vs. 24.5% baseline, +8–11 pts across seeds, null-controlled `z≈4`) but not consonants (null) |
| 3: Linear B → Linear A transfer | Do shared AB signs carry their Linear B vowels in Linear A? | No. The transfer lift sits inside the null (`\|z\|<2`) → Linear A does not distributionally mirror Linear B, consistent with a different (non-Greek) language |
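
The Phase 1 quantities can be illustrated with a minimal sketch. It assumes that `H(next|prev)` is the conditional entropy of a sign given the preceding sign within an inscription, and that redundancy is defined as `1 - H / log2(V)` for an inventory of `V` signs. Both definitions are assumptions here (docs/ADR-0002 holds the definitions actually used), and the function names are hypothetical.

```python
import math
from collections import Counter


def conditional_entropy(sequences):
    """Estimate H(next | prev) in bits from within-inscription bigrams.

    `sequences` is an iterable of sign lists, one list per inscription.
    Single-sign inscriptions contribute no bigram, so a corpus of very
    short texts yields few observations.
    """
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    total = sum(pair_counts.values())
    h = 0.0
    for (prev, _nxt), count in pair_counts.items():
        p_pair = count / total              # joint probability p(prev, next)
        p_cond = count / prev_counts[prev]  # conditional p(next | prev)
        h -= p_pair * math.log2(p_cond)
    return h


def redundancy(h_bits, n_signs):
    """Assumed definition: 1 - H / log2(V), with V the sign inventory size."""
    return 1.0 - h_bits / math.log2(n_signs)


# Toy data with hypothetical sign labels: after "S1", the next sign is
# "S1" or "S2" with equal frequency.
toy = [["S1", "S1"], ["S1", "S2"]]
h = conditional_entropy(toy)
r = redundancy(h, n_signs=4)
```

This is a plug-in estimate with no bias correction; on a corpus this small, such estimates tend to understate the true entropy, so any comparison between scripts should use the same estimator on both.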

One method, two questions, one honest picture (shared axis). Left: the method detects Linear B's own vowel structure; the real lift sits far outside the degree-preserving null. Right: it finds no Linear B → Linear A transfer; the real lift is buried in the null. We measure where the signal is and where it isn't.
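
The comparison against a degree-preserving null can be sketched as follows. This is an illustration under stated assumptions, not the project's `gnn_nullmodel` code: it treats the co-occurrence graph as unweighted and undirected, randomizes it by double-edge swaps (which keep every sign's degree fixed), and reports the observed lift as a z-score against the lifts measured on the randomized graphs. The function names `double_edge_swap` and `z_score` are hypothetical.

```python
import random
import statistics


def double_edge_swap(edges, n_swaps, rng):
    """Rewire an undirected simple graph while preserving every node's degree.

    Each accepted swap replaces edges a-b and c-d with a-d and c-b.
    Swaps that would create a self-loop or a duplicate edge are rejected.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def z_score(observed, null_samples):
    """Standardize the observed statistic against the null distribution."""
    mu = statistics.mean(null_samples)
    sigma = statistics.stdev(null_samples)
    return (observed - mu) / sigma


# Toy usage: a 6-cycle rewired with a fixed seed, and a made-up null sample.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
rewired = double_edge_swap(ring, n_swaps=10, rng=random.Random(0))
z = z_score(5.0, [1.0, 2.0, 3.0])
```

In a real run, each null sample would be the probe's lift after retraining on one rewired graph; a z-score near 4 would then indicate signal beyond degree structure, while `|z| < 2` would not.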

Full methods + results: docs/ADR-0002. The long-term vision (a SUBSTRATE-style lab for ancient scripts): docs/ADR-0003.

How it works (each language where it fits)

| Component | Language | Role |
|---|---|---|
| `seshat-analysis/` | Python (PyTorch) | info-limit, GNN sign embeddings, Linear B validation, null-model controls, transfer probe, figures |
| `seshat-core/` | Rust | corpus parser, sign inventory, bigram matrices: the data foundation |
| `seshat-anneal/` | C++/CUDA | QUBO / simulated + quantum-inspired annealing engine; a future refinement layer (Phase 4), validated on synthetic data only, not a Linear A decipherment claim |
| `seshat-viz/` | Rust/egui | interactive sign tables and heatmaps |

```text
Linear A corpus (Younger DB)  ──►  Phase 1: information limit (entropy, Zipf, hapax)
sign co-occurrence graph      ──►  Phase 2: GNN embeddings (skip-gram, message-passing)
                              ──►           validate on Linear B (vowel/consonant probe + null control)
                              ──►  Phase 3: transfer probe to shared signs (honest negative)
[Phase 4: future]             ──►  annealing refinement, seeded by the above
```
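
As an illustration of the "sign co-occurrence graph" stage, the sketch below counts how often two signs appear within a small window of each other inside the same inscription. The windowing rule, the function name `cooccurrence_edges`, and the sign labels are all assumptions for illustration; the parser and bigram matrices in `seshat-core/` may define co-occurrence differently.

```python
from collections import Counter


def cooccurrence_edges(inscriptions, window=2):
    """Return weighted undirected edges between signs that co-occur.

    Two signs are linked when they appear at most `window` positions
    apart in the same inscription; the weight counts such occurrences.
    """
    weights = Counter()
    for signs in inscriptions:
        for i, a in enumerate(signs):
            for b in signs[i + 1 : i + 1 + window]:
                if a != b:  # ignore a sign paired with itself
                    weights[frozenset((a, b))] += 1
    return weights


# Hypothetical sign labels; no phonetic values are implied.
corpus = [["S1", "S2"], ["S1", "S2", "S3"], ["S3"]]
graph = cooccurrence_edges(corpus, window=1)
```

With `window=1` only adjacent signs are linked, so the single-sign inscription adds nothing to the graph. That is the same sparsity problem Phase 1 quantifies.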

Reproduce

```bash
cd seshat-analysis && pip install -e .
python -m seshat_analysis.gnn_validate     --data ../data   # Phase 2 (Linear B recovery)
python -m seshat_analysis.gnn_nullmodel    --data ../data   # null-model control
python -m seshat_analysis.linear_a_transfer --data ../data  # Phase 3 (transfer: the negative)
python -m seshat_analysis.phase23_figure   --data ../data --recompute   # the figures
pytest                                                      # 11 tests
```

CPU is sufficient (the graphs are ~50 signs); everything is seeded and deterministic.

Data & provenance

Linear A: John Younger's Linear A Database (Univ. of Kansas) — data/corpus/linear_a/
Linear B: attested words (Ventris & Chadwick 1953; Duhoux & Morpurgo Davies) — data/corpus/linear_b/
Linear B sign values: the standard Ventris grid, read from the authoritative Unicode Linear B Syllabary character names — not hand-typed; the exact readings used are saved in data/linear_b_grid_used.json
Comparanda: Luwian, Hurrian — data/corpus/
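
Reading sign values from Unicode character names, rather than typing them by hand, can be sketched with Python's standard `unicodedata` module. The code below is an assumption about how such an extraction might look, not the project's own loader: it scans the Linear B Syllabary block (U+10000 to U+1007F), keeps characters named `LINEAR B SYLLABLE Bnnn XX`, and splits each reading into a consonant part and a vowel part. The function name `linear_b_grid` and the tuple layout are hypothetical.

```python
import re
import unicodedata

# Linear B Syllabary block in Unicode.
BLOCK = range(0x10000, 0x10080)
# Example name: "LINEAR B SYLLABLE B001 DA"; variants end in a digit ("PU2").
NAME_RE = re.compile(r"^LINEAR B SYLLABLE (B\d+) ([A-Z]+)(\d?)$")


def linear_b_grid():
    """Map each syllabogram to (sign_id, reading, consonant, vowel)."""
    grid = {}
    for cp in BLOCK:
        name = unicodedata.name(chr(cp), "")  # "" for unassigned code points
        match = NAME_RE.match(name)
        if match is None:
            continue  # e.g. "LINEAR B SYMBOL ..." entries carry no reading
        sign_id, letters, variant = match.groups()
        # Split at the first vowel letter: "DWE" -> ("dw", "e").
        split = next(
            (i for i, c in enumerate(letters) if c in "AEIOU"), len(letters)
        )
        consonant = letters[:split].lower()
        vowel = letters[split:].lower()  # may be a diphthong such as "au"
        grid[chr(cp)] = (sign_id, (letters + variant).lower(), consonant, vowel)
    return grid


grid = linear_b_grid()
```

Deriving the table from the character database makes the ground truth auditable: anyone can regenerate it and compare it with `data/linear_b_grid_used.json`. Diphthong and variant signs (for example `au` or `a2`) need an explicit policy before they are used as vowel labels.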

Honesty contract

Linear A is not deciphered here; no phonetic value is asserted for any undeciphered sign. Deciphered scripts (Linear B) are ground truth; undeciphered Linear A is treated as measurement, never announcement. Methods are trusted only after they recover known structure on Linear B and survive a null-model control.

Bigger picture

SESHAT is the Aegean module of a planned computational-epigraphy lab — a SUBSTRATE-style set of per-script tools: deciphered scripts (Akkadian, Sumerian, Egyptian…) get real tooling (OCR, transliteration, search); undeciphered ones get honest limit-analysis, like here. That platform is a documented north star (docs/ADR-0003), deliberately not yet started — one honest module at a time.

Author

Antonio Zambudio Rodriguez
