NathanTheAsian/Choom

Choom

A project for training handwritten Chinese character recognition models from .gnt handwriting datasets.

Features

  • Convert .gnt files into grayscale PNG character images
  • Load generated PNG files directly as training data
  • Apply default preprocessing during training: resize, float scaling, and normalization
  • Train a CNN classifier and save a checkpoint with class mappings

Initialization

Prerequisites

  • Python 3.10 or newer
  • A virtual environment is recommended

Local Environment Setup

Windows PowerShell:

python -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install torch torchvision numpy pillow

macOS or Linux:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install torch torchvision numpy pillow

The Docker files are scaffolded, but the current container setup expects a populated requirements.txt. If you want to use Docker, update that file first.

Dataset Preparation

  1. Place your raw .gnt files in data/raw/.
  2. Run the converter on a directory or a specific .gnt file.

Convert every .gnt file in data/raw/ into the default output directory:

python scripts/convert_gnt.py data/raw

Convert a single file:

python scripts/convert_gnt.py data/raw/001-f.gnt

Write images and the manifest to a custom directory:

python scripts/convert_gnt.py data/raw --output-dir ./converted

Useful flags:

  • --output-dir to choose where PNG files are written
  • --manifest-path to choose where samples.txt is written
  • --glob to control which files are matched when the input is a directory
  • --append-manifest to append metadata instead of replacing it
  • --verbose to print one line per converted sample
  3. The converter writes PNG files and metadata to the output directory used by the training pipeline.

Current output locations:

  • output/ by default
  • output/samples.txt by default
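
To make the conversion step more concrete, here is a minimal sketch of reading records from a .gnt stream. It assumes the CASIA-style record layout described in the comments (the README does not state which layout convert_gnt.py expects), and iter_gnt_samples is a hypothetical helper, not the project's converter.

```python
import io
import struct

# Assumed record layout (CASIA-style .gnt, little-endian):
#   4 bytes  total record size (header + bitmap)
#   2 bytes  character tag code (GBK bytes)
#   2 bytes  width
#   2 bytes  height
#   width * height bytes of 8-bit grayscale pixels, row-major
HEADER = struct.Struct("<I2sHH")


def iter_gnt_samples(stream):
    """Yield (label, width, height, pixels) for each record in a binary stream."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # end of stream (or a truncated trailing header)
        _record_size, tag, width, height = HEADER.unpack(header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            raise ValueError("truncated bitmap in .gnt record")
        # The label is the tag code as lowercase hex, e.g. "b0a1".
        yield tag.hex(), width, height, pixels


# Usage with one synthetic 3x2 record, so no dataset file is needed.
bitmap = bytes([0, 64, 128, 192, 255, 32])
record = HEADER.pack(HEADER.size + len(bitmap), b"\xb0\xa1", 3, 2) + bitmap
samples = list(iter_gnt_samples(io.BytesIO(record)))
print(samples[0][:3])
```

With Pillow installed, each yielded bitmap could be written out as a grayscale PNG with Image.frombytes("L", (width, height), pixels).save(path). The file naming used by the real converter is not shown here.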

The training loader scans output/ first and choom/output/ as a fallback.
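
The fallback lookup can be sketched as follows. This is an illustration only: find_image_dir is a hypothetical name, and it assumes the loader falls back when the first directory is missing or holds no PNG files at its top level.

```python
from pathlib import Path


def find_image_dir(candidates=("output", "choom/output")):
    """Return the first candidate directory that contains at least one PNG."""
    for candidate in candidates:
        directory = Path(candidate)
        # Assumption: only top-level *.png files count; subfolders are ignored.
        if directory.is_dir() and any(directory.glob("*.png")):
            return directory
    searched = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"no PNG files found in: {searched}")
```

Passing --img-dir skips this lookup and points training at one directory explicitly.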

For meaningful training, convert multiple .gnt files. A single file can leave you with only one sample per class, which is not enough for useful validation.
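
One way to check whether you have enough samples per class is to count filenames by label. The sketch below assumes the label is the part of the filename before the first underscore (for example b0a1_0001.png). The separator and both helper names are assumptions, so adjust them to match the names the converter actually produces.

```python
from collections import Counter
from pathlib import Path


def label_from_filename(name, separator="_"):
    """Return the label prefix of a filename (assumed separator: underscore)."""
    return Path(name).stem.split(separator, 1)[0]


def samples_per_class(filenames):
    """Count how many files share each label prefix."""
    return Counter(label_from_filename(name) for name in filenames)


# Hypothetical filenames for illustration.
names = ["b0a1_0001.png", "b0a1_0002.png", "b0a2_0001.png"]
counts = samples_per_class(names)
singletons = sorted(label for label, n in counts.items() if n == 1)
print(singletons)  # classes with only one sample
```

On a real dataset, pass the names from Path("output").glob("*.png") instead of the hard-coded list.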

Optional follow-up scripts:

  • python scripts/preprocess.py prints a quick summary of the generated PNG dataset.
  • python scripts/augment.py --copies 2 creates an output_augmented/ directory with originals plus augmented PNGs.
  • Train on augmented data with python -m choom.training.train --img-dir ./output_augmented.

Usage Process

Train The Model

python -m choom.training.train --epochs 10 --batch-size 64 --device cpu

If you have CUDA available, use:

python -m choom.training.train --epochs 10 --batch-size 64 --device cuda

Useful flags:

  • --img-dir to point at a specific generated image directory
  • --epochs to control training length
  • --batch-size to control memory usage
  • --validation-split to reserve part of the dataset for validation
  • --output-path to choose where the checkpoint is written
  • --num-workers to set the number of DataLoader worker processes (more workers can raise throughput)

If each class has only one sample, set --validation-split 0. Once you have multiple samples per class, a value such as 0.2 is reasonable.
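
To see why a single sample per class breaks validation, consider a plain random split. The sketch below is an assumption about how a fractional split might work, not the project's implementation; split_indices and its seed parameter are hypothetical.

```python
import random


def split_indices(num_samples, validation_split, seed=0):
    """Shuffle sample indices and reserve a fraction of them for validation."""
    if not 0.0 <= validation_split < 1.0:
        raise ValueError("validation_split must be in [0, 1)")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    num_val = int(num_samples * validation_split)
    # Returns (training indices, validation indices).
    return indices[num_val:], indices[:num_val]


train_idx, val_idx = split_indices(10, 0.2)
print(len(train_idx), len(val_idx))
```

If every class has exactly one sample, each index moved into the validation set removes that class from the training set entirely, so the model is evaluated only on characters it never saw.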

What The Training Command Does

  1. Scans the generated PNG files in output/ or choom/output/.
  2. Builds labels from each filename prefix, such as b0a1.
  3. Applies default image transforms:
       • Resizes each image to 64x64.
       • Converts the image to float32 and scales pixel values to [0, 1].
       • Normalizes the image with mean 0.5 and standard deviation 0.5.
  4. Creates the CharacterCNN model and trains it with CrossEntropyLoss and AdamW.
  5. Saves a checkpoint to choom/output/character_cnn.pt unless you override the path.
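
The transform arithmetic in the steps above can be illustrated without any dependencies. This sketch is a stand-in, not the project's code: it uses nearest-neighbour resizing for simplicity (the real pipeline may interpolate differently), and preprocess is a hypothetical helper that takes raw 8-bit grayscale bytes.

```python
def preprocess(pixels, width, height, size=64, mean=0.5, std=0.5):
    """Resize (nearest neighbour), scale to [0, 1], then normalize.

    pixels is a row-major sequence of 8-bit grayscale values.
    Returns a size x size nested list of floats.
    """
    rows = []
    for y in range(size):
        src_y = y * height // size  # nearest source row
        row = []
        for x in range(size):
            src_x = x * width // size  # nearest source column
            value = pixels[src_y * width + src_x] / 255.0  # scale to [0, 1]
            row.append((value - mean) / std)  # normalize with mean/std
        rows.append(row)
    return rows


# A 2x2 checkerboard: black, white / white, black.
image = preprocess(bytes([0, 255, 255, 0]), width=2, height=2)
print(len(image), len(image[0]), image[0][0], image[0][63])
```

With mean 0.5 and standard deviation 0.5, a pixel value of 0 maps to -1.0 and 255 maps to 1.0, so the network sees inputs centred on zero in the range [-1, 1].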

Example Commands

A quick smoke test on CPU:

python -m choom.training.train --epochs 1 --batch-size 256 --validation-split 0 --device cpu

A longer run after converting more .gnt files:

python -m choom.training.train --epochs 20 --batch-size 64 --validation-split 0.2 --device cuda

Outputs

Conversion and training create or update these artifacts:

  • Generated PNG samples in output/ by default
  • Sample metadata in output/samples.txt by default
  • Model checkpoint in choom/output/character_cnn.pt

License

MIT License

About

Chinese Handwriting Model (CHM) aka CHooM. Recognizes Chinese handwriting.
