Mungeryang/CS336-From-Scratch-Spring2026


🚀🚀 CS336-From-Scratch Spring 2026🚀🚀

Notebooks and assignment implementations from studying CS336 Spring 2026!

“And in the end, the love you take is equal to the love you make.” —— The Beatles

I'm mungeryang, a master's student at the University of Chinese Academy of Sciences (UCAS-iie). In this repo, I have open-sourced all of my study notes, assignment implementation details, and results.

Class homepage: https://cs336.stanford.edu/

For illustrated diagrams of LLM architectures, see: Hand-Drawn-LLM


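🧐 50 Q&As About LLMs: 50 LLM Interview Questions

A summary of 50 classic LLM algorithm interview questions, compiled from my Stanford CS336 Spring 2026 lecture notes.

References: CS336 lecture notes; Bojie Li, 大模型面试题 200 问 (200 LLM Interview Questions); 百面大模型; 大模型技术30讲

⭐️⭐️⭐️ The 50 questions

P.S. Because my time and ability are limited, I have only compiled the 50 questions I personally consider the most classic. All questions and answers are open source, and anyone interested is welcome to fork and update them~ 👏👏👏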

💻 Assignments

Assignment 1: Basics

  • Implement all of the components (tokenizer, model architecture, optimizer) necessary to train a standard Transformer language model
  • Train a minimal language model
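As a rough illustration of the tokenizer component listed above, the sketch below runs a few byte-pair-encoding merges on a toy corpus. It is a minimal sketch, not this repo's `train_bpe`: the function name `train_bpe_sketch`, the whitespace pre-tokenization, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def train_bpe_sketch(text: str, num_merges: int) -> list[tuple[bytes, bytes]]:
    """Toy BPE trainer (hypothetical helper, not the repo's implementation)."""
    # Assumption: pre-tokenize on whitespace; each word starts as single bytes.
    words = Counter(
        tuple(bytes([b]) for b in w.encode("utf-8")) for w in text.split()
    )
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Assumption: ties go to the lexicographically greater pair.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Replace every occurrence of the best pair with the merged token.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges


merges = train_bpe_sketch("low low low lower lowest", num_merges=2)
```

On this toy corpus the first merge joins `o` and `w`, and the second joins `l` with the new `ow` token. A real trainer would also handle special tokens and a regex-based pre-tokenizer, which are left out here.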
| Assignment1 | Status | Link |
|---|---|---|
| train_bpe | | BPE Implementation |
| BPETokenizer | | Tiny_BPETokenizer Class Implementation |
| Linear | | Linear Class |
| Embedding | | Embedding Class |
| RMSNorm | | RMSNorm |
| Swiglu | | SwiGLU FFN |
| RoPE | | RoPE Class |
| softmax | | softmax function |
| attention | | Scaled_Dot_Attn |
| mul-attn | | MultiHeadAttn Class |
| LM block | | Transformer Block |
| cross-entropy | | train function |
| AdamW | | AdamW optimizer |
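To give a flavor of two of the components listed above, here is a minimal pure-Python sketch of a numerically stable softmax and scaled dot-product attention. It is not the code linked from this repo: the nested-list tensor layout and the `causal` flag are assumptions chosen so the example runs without any dependencies.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    # Subtract the max before exponentiating so large inputs do not overflow.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V, causal=False):
    """Q: [n_q][d_k], K: [n_k][d_k], V: [n_k][d_v], all as nested lists."""
    d_k = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Assumption: query i may only attend to key positions <= i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Weighted sum of value rows, one output column at a time.
        out.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        )
    return out
```

With `causal=True`, the first query can only see the first key, so its output is exactly the first row of `V`.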

⌨️ Assignment 1: Results

Assignment 2: Systems

  • Profile and benchmark the model and layers from Assignment 1 using advanced tools, then optimize attention with your own Triton implementation of FlashAttention2

  • Build a memory-efficient, distributed version of the Assignment 1 model training code
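The FlashAttention-style optimization mentioned above relies on an online softmax: keys and values are processed block by block while a running max and running normalizer are kept, so the full score row is never materialized. The sketch below shows only that recurrence for a single query in pure Python. It is not a Triton kernel and not this repo's implementation; the function names and `block_size` parameter are hypothetical.

```python
import math


def attention_one_query_naive(q, K, V):
    # Reference: compute all scores, then a stable softmax, then the weighted sum.
    d_k = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
    m = max(scores)
    p = [math.exp(s - m) for s in scores]
    total = sum(p)
    return [sum(pj * v[c] for pj, v in zip(p, V)) / total for c in range(len(V[0]))]


def attention_one_query_tiled(q, K, V, block_size=2):
    d_k = len(q)
    m = float("-inf")         # running max of all scores seen so far
    l = 0.0                   # running sum of exp(score - m)
    acc = [0.0] * len(V[0])   # running unnormalized weighted sum of value rows
    for start in range(0, len(K), block_size):
        k_blk = K[start:start + block_size]
        v_blk = V[start:start + block_size]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in k_blk]
        m_new = max(m, max(scores))
        # Rescale the old accumulators so they are expressed relative to m_new.
        scale = math.exp(m - m_new)
        p = [math.exp(s - m_new) for s in scores]
        l = l * scale + sum(p)
        acc = [
            a * scale + sum(pj * v[c] for pj, v in zip(p, v_blk))
            for c, a in enumerate(acc)
        ]
        m = m_new
    # Normalize once at the end.
    return [a / l for a in acc]
```

Both functions should agree up to floating-point rounding for any block size; the tiled version simply never holds more than one block of scores at a time.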

Assignment2 Status Link

Assignment 3: Scaling

  • Understand the function of each component of the Transformer
  • Query a training API to fit a scaling law to project model scaling
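One simple way to fit and extrapolate a scaling law is ordinary least squares in log-log space, sketched below. This is an illustration under stated assumptions rather than the assignment's required method: the power-law form `y ≈ a * x**b`, the function name `fit_power_law`, and the synthetic data points are all made up for the example, and the training API itself is not modeled.

```python
import math


def fit_power_law(xs, ys):
    """Fit y ≈ a * x**b by least squares on (log x, log y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope of the regression line in log-log space is the exponent b.
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly)) / sum(
        (u - mean_x) ** 2 for u in lx
    )
    # The intercept gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic points generated from y = 10 * x**-0.1 (made-up numbers, not course data).
compute = [1e15, 1e16, 1e17, 1e18]
loss = [10.0 * c ** -0.1 for c in compute]

a, b = fit_power_law(compute, loss)
predicted = a * 1e19 ** b  # extrapolate one order of magnitude beyond the data
```

Because the synthetic data lie exactly on a power law, the fit recovers `a ≈ 10` and `b ≈ -0.1`; real training runs are noisy, so the fitted exponent should be treated as an estimate.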

Assignment 4: Data

  • Convert raw Common Crawl dumps into usable pretraining data
  • Perform filtering and deduplication to improve model performance
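The simplest form of deduplication is exact matching on normalized text, sketched below with hashes from the standard library. This is a minimal sketch, not the pipeline used in this repo: the helper name `dedupe_exact` and the normalization rule (lowercasing and collapsing whitespace) are assumptions, and fuzzy methods for near-duplicates are out of scope here.

```python
import hashlib


def dedupe_exact(docs: list[str]) -> list[str]:
    """Keep the first occurrence of each document, comparing normalized text."""
    seen = set()
    kept = []
    for doc in docs:
        # Assumption: normalize by lowercasing and collapsing runs of whitespace.
        normalized = " ".join(doc.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept
```

Storing fixed-size hashes instead of the documents themselves keeps the `seen` set small even when individual documents are long.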

Assignment 5: Alignment

  • Apply supervised finetuning and reinforcement learning to train LMs to reason when solving math problems
  • Optional Part 2: implement and apply safety alignment methods such as DPO
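For reference, the DPO objective for a single preference pair can be written in a few lines. The sketch below assumes the four sequence log-probabilities (policy and frozen reference model, chosen and rejected response) have already been computed; the function and argument names are hypothetical, and `beta=0.1` is only a placeholder default.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin))."""
    # How much more the policy prefers the chosen response than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z); this form avoids overflow for large |z|.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy matches the reference, the margin is zero and the loss is `log(2)`; the loss falls as the policy raises the chosen response's log-probability relative to the rejected one.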

⭐️ Star History

Star History Chart
