Miaoge-Ge/facetrain-framework

Face Recognition Framework

PyTorch face recognition training/evaluation framework with configurable backbones and margin-based heads.

Features

  • Training, validation (pairs), and checkpointing
  • Evaluation on .bin pair datasets (e.g., LFW / CFP-FP / AgeDB-30)
  • YAML-driven configuration
  • TensorBoard logging (optional)
  • Resume training from the latest run directory

Project Layout

  • train.py: training entrypoint
  • test.py: evaluation entrypoint (pairs verification)
  • engine/
    • trainer.py: training loop + validation + checkpointing
    • evaluator.py: pair verification metrics (10-fold accuracy, TAR@FAR, AUC)
    • predictor.py: two-image similarity inference helper
  • models/
    • backbones/: feature extractors
    • heads/: margin-based classification heads
    • model_factory.py: build_model(config) factory
  • data/
    • dataset.py: FaceEmoreDataset (RecordIO) + BinPairDataset (.bin)
    • transforms.py: train/val transforms
  • utils/
    • checkpoint.py: save/load checkpoints
    • logger.py: file logger + TensorBoard
    • metrics.py: verification metrics
    • common.py: reproducibility seed helper
  • config/: example configs
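For readers unfamiliar with the margin-based heads kept under heads/, the sketch below shows how ArcFace and CosFace conventionally adjust the target-class logit. It uses the textbook formulas, not this repository's code; the function names and the scale and margin defaults (s=64, m=0.5 or m=0.35) are assumptions chosen for illustration. AdaFace is omitted because its margin depends on feature norms.

```python
import math

def arcface_logit(cos_theta, s=64.0, m=0.5):
    """Target-class logit with an additive angular margin: s * cos(theta + m)."""
    # Clamp to the valid domain of acos to guard against rounding error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    # Real implementations also handle theta + m > pi; that case is skipped here.
    return s * math.cos(theta + m)

def cosface_logit(cos_theta, s=64.0, m=0.35):
    """Target-class logit with an additive cosine margin: s * (cos(theta) - m)."""
    return s * (cos_theta - m)

# Both margins lower the target logit relative to the plain scaled cosine,
# which forces the embedding to sit closer to its class centre.
plain = 64.0 * 0.5
print(plain, arcface_logit(0.5), cosface_logit(0.5))
```

Non-target logits are left as s * cos(theta) in both schemes; only the logit of the ground-truth class is penalized.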

Installation

Python 3.9+ is recommended.

pip install -r requirements.txt

Notes:

  • mxnet is required to read InsightFace RecordIO training data (train.rec / train.idx).
  • scikit-learn is required for ROC/AUC/TAR@FAR metrics.

Data Preparation

Training set (InsightFace RecordIO)

Your dataset root (set by data.root in YAML) should contain:

  • train.rec
  • train.idx
  • property (a text file of comma-separated values containing at least num_classes,height,width)

Example:

data:
  root: /data/faces_emore
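A minimal sketch of how the property file could be read, assuming it holds a single line of comma-separated integers in the order num_classes,height,width. The helper name read_property and the sample values are hypothetical; the framework's own loader may differ.

```python
import tempfile
from pathlib import Path

def read_property(root):
    """Parse `<root>/property` and return (num_classes, height, width)."""
    text = (Path(root) / "property").read_text().strip()
    fields = [int(v) for v in text.split(",")]
    if len(fields) < 3:
        raise ValueError("property must contain num_classes,height,width")
    num_classes, height, width = fields[:3]
    return num_classes, height, width

# Demo on a throwaway directory with illustrative values.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "property").write_text("1000,112,112\n")
    print(read_property(tmp))
```

The num_classes value read this way is the number that head.num_classes has to match.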

Evaluation set (.bin verification pairs)

Put verification bins under the same dataset root:

  • lfw.bin
  • cfp_fp.bin
  • agedb_30.bin

Select which one to evaluate via:

eval:
  bin_file: "lfw.bin"
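To make the .bin format less opaque, here is a sketch of reading such a file. It assumes the usual InsightFace layout: a pickled tuple (bins, issame_list), where bins holds two encoded images per pair. The function name load_bin is hypothetical and this is not necessarily how BinPairDataset is implemented.

```python
import pickle

def load_bin(path):
    """Return a list of (image_bytes_a, image_bytes_b, is_same) tuples."""
    with open(path, "rb") as f:
        # encoding="bytes" keeps files written by Python 2 loadable.
        bins, issame = pickle.load(f, encoding="bytes")
    if len(bins) != 2 * len(issame):
        raise ValueError("expected two images per label")
    return [
        (bins[2 * i], bins[2 * i + 1], bool(issame[i]))
        for i in range(len(issame))
    ]
```

Each image entry still has to be decoded (for example with an image library) before it can be passed through the validation transforms. Because the file is a pickle, load it only from trusted sources.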

Training

Quick Start

python train.py --config config/train_resnet50.yaml --device cuda

Resume from the latest run directory under checkpoint.save_dir:

python train.py --config config/train_resnet50.yaml --resume
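The resume lookup can be pictured roughly as follows. This sketch assumes each run gets its own subdirectory under checkpoint.save_dir, named with a timestamp that sorts lexicographically (such as 20240301_120000); both the naming scheme and the helper latest_run_dir are assumptions, not taken from the repository.

```python
from pathlib import Path

def latest_run_dir(save_dir):
    """Return the newest run directory under save_dir, or None if there is none."""
    root = Path(save_dir)
    if not root.is_dir():
        return None
    # Timestamp-style names sort chronologically when compared as strings.
    runs = sorted((p for p in root.iterdir() if p.is_dir()), key=lambda p: p.name)
    return runs[-1] if runs else None
```

A trainer would then load a checkpoint from the returned directory, or start a fresh run when the result is None.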

Training Configuration (YAML)

Key fields used by the trainer:

  • seed: random seed (default: 42)
  • device: compute device (cuda or cpu); can be overridden at runtime with --device
  • model.backbone: backbone name (resnet50 or fastcontextface)
  • model.embedding_size: embedding dimension (e.g., 512)
  • head.type: head type (arcface / cosface / adaface)
  • head.num_classes: class count (must match training dataset classes)
  • data.root: dataset root directory
  • data.batch_size, data.num_workers, data.img_size
  • training.epochs, training.optimizer, training.lr, training.weight_decay, training.momentum
  • training.scheduler: multistep or cosine
  • training.warmup_epochs
  • training.amp: enable automatic mixed precision on CUDA (default: true)
  • training.grad_clip_norm: gradient clipping norm (default: 5.0)
  • training.resume: enable resume logic (can also be set with --resume)
  • checkpoint.save_dir: checkpoint root directory
  • logging.log_dir: log directory (files + tensorboard)
  • eval.bin_file, eval.test_batch_size, eval.eval_freq
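As an illustration of how training.lr, training.warmup_epochs, and training.scheduler might interact, the sketch below computes a per-epoch learning rate with linear warmup followed by either cosine decay or multistep decay. The function lr_at_epoch and the milestones and gamma parameters are hypothetical; the README does not list them, and the trainer may schedule per step instead of per epoch.

```python
import math

def lr_at_epoch(epoch, base_lr, epochs, warmup_epochs,
                scheduler="cosine", milestones=(), gamma=0.1):
    """Learning rate for a 0-indexed epoch under warmup + cosine/multistep."""
    if epoch < warmup_epochs:
        # Linear warmup: ramp up to base_lr over the first warmup_epochs epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    if scheduler == "cosine":
        # Fraction of the post-warmup schedule completed so far.
        progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
        return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
    if scheduler == "multistep":
        # Multiply by gamma once for every milestone already reached.
        passed = sum(epoch >= m for m in milestones)
        return base_lr * gamma ** passed
    raise ValueError(f"unknown scheduler: {scheduler}")

for e in range(6):
    print(e, lr_at_epoch(e, base_lr=0.1, epochs=6, warmup_epochs=2))
```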

Evaluation (Verification)

Evaluate a trained checkpoint against the .bin pair dataset selected by eval.bin_file:

python test.py \
  --config config/train_resnet50.yaml \
  --checkpoint checkpoints/resnet50/<RUN_TIMESTAMP>/best.pth \
  --name resnet50_adaface

Artifacts:

  • Log file in logs/...
  • A summary text file test_result_<name>_<timestamp>.txt
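For readers new to verification metrics, the following dependency-free sketch shows what TAR@FAR measures: the fraction of genuine pairs accepted at the strictest threshold that lets through at most the target fraction of impostor pairs. The framework itself relies on scikit-learn for its ROC-based metrics, so this is an explanatory stand-in with a hypothetical function name, not the evaluator's code.

```python
def tar_at_far(scores, labels, far_target):
    """Return (TAR, threshold) for similarity scores and same/different labels."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = sorted((s for s, y in zip(scores, labels) if not y), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one genuine and one impostor pair")
    # Number of impostor pairs that may be accepted at this FAR.
    allowed = int(far_target * len(neg))
    # Accept pairs scoring strictly above the (allowed + 1)-th best impostor.
    threshold = neg[allowed] if allowed < len(neg) else float("-inf")
    tar = sum(s > threshold for s in pos) / len(pos)
    return tar, threshold

scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(tar_at_far(scores, labels, far_target=0.25))
```

With only a few thousand impostor pairs, very low FAR targets leave room for few or no accepted impostors, so the reported TAR at those operating points is coarse.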

Inference (Two-Image Similarity)

engine/predictor.py provides a simple API to compute cosine similarity between two images:

from engine.predictor import Predictor

pred = Predictor(
    config_path="config/train_resnet50.yaml",
    checkpoint_path="checkpoints/resnet50/<RUN_TIMESTAMP>/best.pth",
    use_cpu=False,
)
score = pred.predict("a.jpg", "b.jpg")
print(score)
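The returned score is a cosine similarity between the two face embeddings. As a reference for interpreting it, here is the computation on plain Python lists; the helper name is illustrative and Predictor presumably performs the equivalent operation on tensors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Higher values mean the two faces are more likely the same identity. The decision threshold is model-dependent and is best taken from a verification run on held-out pairs.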

Reproducibility

The random seed used for training is set by the seed field in the YAML config. Deterministic mode is controlled by a separate flag:

deterministic: true

When deterministic=true, cuDNN benchmarking is disabled to improve reproducibility.
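A seed helper along these lines would produce the behaviour described. It is a sketch rather than the contents of utils/common.py: the name set_seed is hypothetical, and NumPy and PyTorch are imported lazily so the snippet still runs where they are not installed.

```python
import random

def set_seed(seed=42, deterministic=True):
    """Seed the available RNGs and configure cuDNN for reproducibility."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; the torch-specific setup is skipped.
    # torch.manual_seed seeds the CPU generator and all CUDA devices.
    torch.manual_seed(seed)
    # Deterministic mode turns off cuDNN autotuning, as described above.
    torch.backends.cudnn.deterministic = deterministic
    torch.backends.cudnn.benchmark = not deterministic

set_seed(42, deterministic=True)
```

Disabling cuDNN benchmarking can slow training, and some GPU operations stay nondeterministic even with these settings.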

About

A comprehensive deep learning framework for face recognition model training. Supports mainstream architectures (ArcFace, CosFace, SphereFace) with flexible data pipeline, distributed training, and easy-to-use APIs. Built for researchers and developers to accelerate face recognition model development.
