jkaus324/engineering-systems

🔩 Under the Hood

What's really happening inside the systems you use every day: reconstructed, and rebuilt small enough to run.

Most "how X works" posts stop at a box diagram. These don't. Each teardown goes down to the algorithms, data structures, and exact tradeoffs, reconstructed from a company's public engineering blog, and ships a small runnable prototype you can read in one sitting.


Stars License: MIT PRs Welcome

Made with Java Built with Jekyll Honest reconstruction


📖 Read the blog · 🗺️ The series arc · ➕ Suggest a teardown · 🚀 Run the code

⭐ New teardowns land here regularly. Star the repo to follow along.


💡 The idea in 5 seconds

You use these systems every day. This repo opens them up, one system at a time, and rebuilds the core mechanism small enough that you can read it, run it, and break it in an afternoon.

Every teardown does the same four things:

  • 🧩 Starts from the problem and builds the architecture up one decision at a time.
  • 🔬 Names the algorithm underneath (inverted index, scatter-gather, fan-out) and explains why it's efficient.
  • 💻 Ships a runnable prototype: not a production clone, just the core idea you can execute.
  • 🧾 Stays honest: inferred from public sources, always cited, never claiming to be anyone's real code.

📚 The teardowns

| # | System | Company | The core idea | 📖 Read | 💻 Run |
|---|---|---|---|---|---|
| 01 | FishDB: feed retrieval engine | LinkedIn | Graph-anchored retrieval: an inverted index + scatter-gather over 48 shards. You post, nothing moves; readers pull it at query time. | Post · Deep dive | fishdb/ |
| 02 | Twitter timelines | Twitter / X | Fan-out-on-write & the celebrity problem | planned → roadmap | - |
| 03 | Instagram Explore | Meta | The blend, and ranking as the center of gravity | planned | - |
| 04 | Who pays the cost? | - | The synthesis: write-side vs. read-side as one design axis | planned | - |

🧭 The series is built to climb toward Teardown 04, a synthesis where the individual systems stop being the subject and become evidence for one mental model. The full arc is in SERIES.md.


⭐ Featured · Teardown 01: LinkedIn FishDB

The hook: YouTube optimizes watch time. Instagram optimizes discovery. LinkedIn optimizes graph relevance, and that one difference changes the entire engineering. When you post, nothing is sent anywhere; it's filed under your name, and readers pull it when they open the app.
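
To make "filed under your name" concrete, here is a minimal Java sketch of read-side retrieval. It is an illustration under stated assumptions, not the repo's prototype and not LinkedIn's code: the class `PullFeedSketch`, the `Activity` record, and the newest-first ordering are hypothetical stand-ins (a real feed would rank results rather than just sort by time).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of read-side ("pull") retrieval. Names are illustrative only.
class PullFeedSketch {
    // One timeline record: who did what to which object, and when.
    record Activity(String actor, String verb, String object, long timestamp) {}

    // Inverted index: actor -> everything that actor has done.
    private final Map<String, List<Activity>> byActor = new HashMap<>();

    // Write path: a single append under the author's key. No follower is touched.
    void post(Activity a) {
        byActor.computeIfAbsent(a.actor(), k -> new ArrayList<>()).add(a);
    }

    // Read path: the reader's followees are the search terms.
    // Assumption: plain newest-first ordering stands in for real ranking.
    List<Activity> feed(List<String> followees, int limit) {
        List<Activity> hits = new ArrayList<>();
        for (String actor : followees) {
            hits.addAll(byActor.getOrDefault(actor, List.of()));
        }
        hits.sort(Comparator.comparingLong(Activity::timestamp).reversed());
        return hits.subList(0, Math.min(limit, hits.size()));
    }

    public static void main(String[] args) {
        PullFeedSketch index = new PullFeedSketch();
        index.post(new Activity("alice", "shared", "post:1", 100));
        index.post(new Activity("bob", "liked", "post:1", 200));
        index.post(new Activity("carol", "shared", "post:2", 300));
        // The reader follows alice and bob, so carol's activity is never retrieved.
        for (Activity a : index.feed(List.of("alice", "bob"), 10)) {
            System.out.println(a);
        }
    }
}
```

In this sketch `post` touches exactly one key no matter how many followers the author has; all per-reader work happens inside `feed`.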

What you'll learn
  • Inverted vs. forward indexes, and why you need both
  • The timeline record (actor, verb, object), and why the actor becomes a search term
  • Scatter-gather over shards + replicas, and why the slowest shard sets your latency
  • One-writer-per-shard, the rule that buys lock-free reads
  • Lambda ingestion (Kafka for fresh + HDFS for bulk)
  • Per-shard top-K with a heap, and the broker's k-way merge
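
As a taste of the last item in that list, here is a rough Java sketch of per-shard top-K followed by a broker-side k-way merge. It is written for this README under stated assumptions and is not taken from the prototype: `TopKMergeSketch`, `Scored`, and the sample scores are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: per-shard top-K with a min-heap, then a k-way merge at the broker.
class TopKMergeSketch {
    record Scored(String id, double score) {}

    // Per-shard step: a min-heap capped at k entries keeps the k best candidates seen so far.
    static List<Scored> topK(List<Scored> candidates, int k) {
        PriorityQueue<Scored> heap = new PriorityQueue<>(Comparator.comparingDouble(Scored::score));
        for (Scored s : candidates) {
            heap.offer(s);
            if (heap.size() > k) {
                heap.poll(); // evict the current worst
            }
        }
        List<Scored> best = new ArrayList<>(heap);
        best.sort(Comparator.comparingDouble(Scored::score).reversed());
        return best; // best-first
    }

    // Broker step: merge shard lists that are each already sorted best-first.
    static List<Scored> merge(List<List<Scored>> shardResults, int k) {
        // Heap entry = {shard index, position in that shard's list}; the best head sits on top.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
                Comparator.comparingDouble((int[] h) -> shardResults.get(h[0]).get(h[1]).score()).reversed());
        for (int i = 0; i < shardResults.size(); i++) {
            if (!shardResults.get(i).isEmpty()) {
                heads.offer(new int[] {i, 0});
            }
        }
        List<Scored> out = new ArrayList<>();
        while (out.size() < k && !heads.isEmpty()) {
            int[] h = heads.poll();
            List<Scored> list = shardResults.get(h[0]);
            out.add(list.get(h[1]));
            if (h[1] + 1 < list.size()) {
                heads.offer(new int[] {h[0], h[1] + 1}); // advance within the same shard
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Scored> shardA = List.of(new Scored("a1", 0.9), new Scored("a2", 0.2), new Scored("a3", 0.5));
        List<Scored> shardB = List.of(new Scored("b1", 0.7), new Scored("b2", 0.8), new Scored("b3", 0.1));
        merge(List.of(topK(shardA, 2), topK(shardB, 2)), 3).forEach(System.out::println);
    }
}
```

The point of the split is that each shard returns at most k items, so the broker merges a few short sorted lists instead of every candidate.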
📖 Blog post: How LinkedIn Built Its Feed to Answer You in 40 Milliseconds
🧠 Deep-dive design doc: teardowns/fishdb/DESIGN.md (problem → requirements → HLD → flows → rationale)
💻 Runnable prototype (Java): teardowns/fishdb/code/
cd teardowns/fishdb/code
javac fishdb/*.java
java fishdb.Main        # scripted walkthrough: one query, narrated end to end
java fishdb.Main -i     # interactive: type your own queries and watch retrieval happen live

Source: LinkedIn Engineering - FishDB. A reconstruction, not LinkedIn's code.
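
The one-writer-per-shard rule listed above can be illustrated with copy-on-write snapshots. This is only one way to get lock-free reads, picked here for brevity. This README names the rule but not the mechanism, so the snapshot approach and the class name `SingleWriterShardSketch` are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a single writer publishes immutable snapshots; readers never lock.
class SingleWriterShardSketch {
    // Readers only ever see a fully built, immutable list.
    private final AtomicReference<List<String>> snapshot = new AtomicReference<>(List.of());

    // Must be called by exactly ONE writer thread per shard. That assumption is what lets
    // this method skip locks and compare-and-set loops.
    void append(String entry) {
        List<String> next = new ArrayList<>(snapshot.get());
        next.add(entry);
        snapshot.set(List.copyOf(next)); // publish the new version in one atomic step
    }

    // Safe to call from any number of reader threads; no lock is taken.
    List<String> read() {
        return snapshot.get();
    }

    public static void main(String[] args) {
        SingleWriterShardSketch shard = new SingleWriterShardSketch();
        shard.append("alice shared post:1");
        List<String> seenByReader = shard.read(); // a reader grabs the current version
        shard.append("bob liked post:1");         // the writer moves on
        System.out.println("reader's snapshot: " + seenByReader);
        System.out.println("latest snapshot:   " + shard.read());
    }
}
```

Copying the whole list on every append costs O(n) and is a simplification for readability; an append-only structure would avoid the copy while keeping the same single-writer guarantee.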


🪞 One repo, two faces

This repo is a blog and a browsable code resource at the same time. Pick whichever way you like to learn.

flowchart LR
    R["🔩 Under the Hood<br/>(this repo)"]
    R --> B["📖 The blog<br/>_posts/ → GitHub Pages<br/><i>read the narrative</i>"]
    R --> C["💻 The code resource<br/>teardowns/&lt;name&gt;/<br/><i>read &amp; run the prototype</i>"]
    B -. same teardown .- C
| If you want to… | Go to… |
|---|---|
| 📖 Read the story of how a system works | the blog (rendered from _posts/) |
| 🧠 Go deeper: requirements, HLD, sequence diagrams | the teardown's DESIGN.md |
| 💻 Run the mechanism yourself | the teardown's code/ folder |

Adding a teardown means adding one post + one teardowns/<name>/ folder; nothing else moves.

📁 Full repo layout
engineering-systems/
├── README.md              ← you are here (the hub)
├── SERIES.md              ← the planned multi-post arc
├── NEW_TEARDOWN.md        ← template + checklist for the next teardown
├── CONTRIBUTING.md        ← hosting setup + how to contribute
│
├── _posts/                ← published blog posts (one per teardown)   ─┐
├── _config.yml            ← Chirpy theme config                        │  the
├── Gemfile                ← Jekyll / Chirpy dependencies               ├─ GitHub Pages
├── _tabs/                 ← blog nav pages (About, …)                  │  blog
├── .github/workflows/     ← auto-build & deploy on push                ─┘
│
└── teardowns/             ← one self-contained folder per teardown    ─┐
    └── fishdb/                                                         │  the
        ├── README.md      ← this teardown's local hub                  ├─ code
        ├── DESIGN.md      ← the deep-dive design doc                   │  resource
        └── code/          ← runnable prototype                        ─┘

🚀 Run any prototype in 30 seconds

Each teardown's code/ folder is self-contained: no shared build system, no monorepo tooling. Clone, cd, run.

git clone https://github.com/jkaus324/engineering-systems.git
cd engineering-systems/teardowns/fishdb/code
javac fishdb/*.java && java fishdb.Main -i

FishDB's prototype needs JDK 21+ (it uses records and virtual threads) and zero dependencies. Future teardowns each state their own language + version in their folder's README.
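
If you haven't used those two JDK features yet, the following sketch shows how they might combine in a scatter-gather call. It is not lifted from the prototype: `ScatterGatherSketch`, `ShardReply`, the shard count, and the simulated latencies are invented for illustration. `Executors.newVirtualThreadPerTaskExecutor()` is the standard JDK 21 API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one virtual thread per shard query, results carried in a record.
class ScatterGatherSketch {
    record ShardReply(int shard, long simulatedMillis) {}

    // Stand-in for a real shard lookup: sleeps to simulate work.
    static ShardReply queryShard(int shard, long millis) throws InterruptedException {
        Thread.sleep(millis);
        return new ShardReply(shard, millis);
    }

    // Scatter: submit one task per shard. Gather: wait for all of them.
    static List<ShardReply> scatterGather(long[] shardLatencies) {
        List<Callable<ShardReply>> calls = new ArrayList<>();
        for (int i = 0; i < shardLatencies.length; i++) {
            final int shard = i;
            calls.add(() -> queryShard(shard, shardLatencies[shard]));
        }
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            List<ShardReply> replies = new ArrayList<>();
            for (Future<ShardReply> f : pool.invokeAll(calls)) {
                replies.add(f.get());
            }
            return replies;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard query failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        long[] latencies = {5, 10, 40, 8}; // made-up per-shard delays in milliseconds
        long start = System.nanoTime();
        List<ShardReply> replies = scatterGather(latencies);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(replies.size() + " replies in about " + elapsedMs + " ms");
    }
}
```

Because `invokeAll` waits for every task, the call returns only after the slowest simulated shard has replied, so the elapsed time tracks the largest delay rather than the sum.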


🗺️ Roadmap

flowchart LR
    T1["✅ 01 · LinkedIn<br/>read-side retrieval"] --> T2["⏳ 02 · Twitter<br/>fan-out-on-write"]
    T2 --> T3["⏳ 03 · Instagram<br/>the blend"]
    T3 --> T4["🎯 04 · Synthesis<br/>who pays the cost?"]

Each post stands alone and becomes a data point for the finale, where push-vs-pull turns out to be a single design axis you can use to reason about any feed. Full reasoning in SERIES.md.
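
One rough way to picture that axis is a toy cost model. The sketch below is an assumption made for this README, not the analysis planned for Teardown 04: it counts one unit of work per timeline touched, and every number in it is made up.

```java
// Hypothetical toy model: where does the work land, on the write side or the read side?
class CostAxisSketch {
    record Workload(long postsPerDay, long avgFollowers, long readsPerDay, long avgFollowees) {}

    // Push (fan-out-on-write): each post is copied to every follower; a read is one lookup.
    static long pushCost(Workload w) {
        return w.postsPerDay() * w.avgFollowers() + w.readsPerDay();
    }

    // Pull (fan-out-on-read): each post is one append; a read consults every followee.
    static long pullCost(Workload w) {
        return w.postsPerDay() + w.readsPerDay() * w.avgFollowees();
    }

    public static void main(String[] args) {
        // Invented numbers, chosen only to show the two sides of the axis.
        Workload readHeavy = new Workload(10, 1_000, 100_000, 50);
        Workload writeHeavy = new Workload(10_000, 1_000, 100, 50);
        System.out.println("read-heavy:  push=" + pushCost(readHeavy) + " pull=" + pullCost(readHeavy));
        System.out.println("write-heavy: push=" + pushCost(writeHeavy) + " pull=" + pullCost(writeHeavy));
    }
}
```

The model ignores ranking, caching, and storage, so treat it as a way to see which side of the axis a workload leans toward, not as a sizing tool.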


⚖️ The honesty rule

These are educational reconstructions inferred from public engineering blogs. They are:

  • ✅ Cited: every quantitative claim traces back to its public source.
  • ✅ Flagged: anything inferred beyond the source is marked inline.
  • ❌ Not affiliated with, endorsed by, or containing source code from any company discussed.

If a detail here disagrees with the original source, trust the source, and please open an issue so it can be fixed.


🤝 Contributing

Corrections, and suggestions for systems to tear down next, are very welcome.

  • 🐛 Found an error? Open an issue, and trust the source over the reconstruction.
  • ➕ Want to add a teardown? It's one post + one folder; see NEW_TEARDOWN.md.
  • 🌐 Hosting / setup? The one-time GitHub Pages steps are in CONTRIBUTING.md.

All trademarks belong to their respective owners. Prose & code in this repo: MIT (see LICENSE).

Built with curiosity. If a teardown taught you something, a ⭐ helps others find it.
