Standalone repo to compare position→token and token→position factorizations on the same DiT backbone and masking distribution. Data loading, tokenizer handling, and model configs are adapted from the mdlm codebase but live entirely in this directory.
- Samples the same masked partial state for both factorizations and trains either the `pos_to_tok`, `tok_to_pos`, or `joint` (both) objective.
- Both branches include a position head so you can compare the joint log p(pos, tok | s) fairly (see the sketch after this list).
- Logs every metric and loss term to Weights & Biases (`wandb`).
- Uses a DiT backbone with rotary embeddings; keep `max_length` ≤ the DiT's supported length.
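By the chain rule, the per-step joint splits as log p(i, x_i | s) = log p(i | s) + log p(x_i | i, s) for pos→tok, or log p(x_i | s) + log p(i | x_i, s) for tok→pos. The minimal sketch below shows one way such losses could be assembled from head outputs; the function names and tensor shapes are illustrative assumptions and need not match `trainer.py`.

```python
# Illustrative sketch only (not this repo's code): the two chain-rule
# factorizations of the per-step joint log p(i, x_i | s).
import torch
import torch.nn.functional as F


def pos_to_tok_nll(pos_logits, tok_logits, target_pos, target_tok):
    """pos→tok: -[log p(i | s) + log p(x_i | i, s)].

    pos_logits: (B, L)    position scores given the partial state s
    tok_logits: (B, L, V) token scores at each position given s and i
    target_pos: (B,)      index i of the position being filled
    target_tok: (B,)      token id x_i placed at that position
    """
    b = torch.arange(pos_logits.size(0))
    log_p_i = F.log_softmax(pos_logits, dim=-1)[b, target_pos]
    log_p_x = F.log_softmax(tok_logits, dim=-1)[b, target_pos, target_tok]
    return -(log_p_i + log_p_x).mean()


def tok_to_pos_nll(tok_logits, pos_logits, target_pos, target_tok):
    """tok→pos: -[log p(x_i | s) + log p(i | x_i, s)].

    tok_logits: (B, V) token scores given only the partial state s
    pos_logits: (B, L) position scores given s and the chosen token x_i
    """
    b = torch.arange(tok_logits.size(0))
    log_p_x = F.log_softmax(tok_logits, dim=-1)[b, target_tok]
    log_p_i = F.log_softmax(pos_logits, dim=-1)[b, target_pos]
    return -(log_p_x + log_p_i).mean()
```

In the `joint` setting, both negative log-likelihoods would be optimized on the same sampled (position, token) pair from the shared masked state.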
To run the default comparison:

```bash
cd /home/yunseok/Workspace/token_ordering/factorization
bash scripts/train.sh  # runs pos→tok then tok→pos on wikitext-2 with default settings
```

To customize a run:
```bash
python trainer.py \
  --dataset wikitext2 \
  --tokenizer gpt2 \
  --max_length 256 \
  --batch_size 8 \
  --max_steps 5000 \
  --factorization joint \
  --wandb_project token-ordering-factorization
```

Key flags:
- `--factorization {pos_to_tok,tok_to_pos,joint}`: choose which factorization to optimize.
- `--disable_position_priors`: turn off the p(i | s) head on the pos→tok side (enabled by default).
- `--mask_ratio`: fraction of visible tokens replaced by `[MASK]` per example (see the sketch after this list).
- `--max_length`: keep at or below the DiT config length (default 512 in `ModelConfig`).
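As a rough illustration of the masking step behind `--mask_ratio` (the helper name, Bernoulli-style sampling, and argument names below are assumptions, not this repo's actual implementation):

```python
# Illustrative sketch of per-example masking with --mask_ratio; the repo's
# sampling may differ (e.g., exact-count masking instead of Bernoulli draws).
import torch


def mask_partial_state(input_ids, mask_token_id, mask_ratio, pad_token_id=None):
    """Replace roughly `mask_ratio` of the non-padding tokens with [MASK]."""
    maskable = torch.ones_like(input_ids, dtype=torch.bool)
    if pad_token_id is not None:
        maskable &= input_ids.ne(pad_token_id)          # never mask padding
    drop = torch.rand_like(input_ids, dtype=torch.float) < mask_ratio
    drop &= maskable
    masked_ids = input_ids.masked_fill(drop, mask_token_id)
    return masked_ids, drop                             # drop marks prediction targets
```

Both factorization branches then see the same masked state, so their losses are computed against identical (position, token) targets.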