01. Introduction

Forge is a training framework that uses YAML configs to manage different model training experiments.

Detail

Forge is a YAML-driven training framework, so experiments can be modified by changing configuration files rather than rewriting training code.
It is built on the Hugging Face stack, with Transformers, Hugging Face Trainer, and TRL each handling a distinct part of the training workflow:
- Transformers = loads models, tokenizers, and processors
- Hugging Face Trainer = runs the standard supervised training paths, including SFT and KD
- TRL = runs the RL-style training path, currently GRPO
Forge supports multiple training paradigms:
- supervised fine-tuning
- knowledge distillation
- reinforcement-learning-based post-training, currently via the GRPO path
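
The pairing of paradigms and libraries described above can be pictured as a small dispatch table. This is only a minimal sketch, not Forge's actual code: the `method` config key, the `BACKENDS` table, and the `select_backend` function are hypothetical names chosen for illustration, and the config is shown as an already-parsed dict standing in for a YAML file.

```python
# Hypothetical sketch: the `method` key and all names below are assumptions,
# not Forge's real config schema or internals.
BACKENDS = {
    "sft": "transformers.Trainer",   # supervised fine-tuning
    "kd": "transformers.Trainer",    # knowledge distillation
    "grpo": "trl.GRPOTrainer",       # RL-style post-training
}


def select_backend(config: dict) -> str:
    """Return the trainer class path for the paradigm named in the config."""
    method = config.get("method")
    if method not in BACKENDS:
        raise ValueError(
            f"unknown training method: {method!r}; expected one of {sorted(BACKENDS)}"
        )
    return BACKENDS[method]


# A dict standing in for a parsed YAML experiment file (assumed layout).
experiment = {"method": "grpo", "learning_rate": 1e-6}
print(select_backend(experiment))
```

Under this assumed layout, switching an experiment from SFT to GRPO would mean changing one value in the config rather than touching training code.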
Forge provides experiment infrastructure:
- W&B logging
- evaluation
- checkpointing
- resume-from-checkpoint
- SLURM auto-resubmission for long-running jobs
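
Resuming, whether manual or after a SLURM resubmission, generally requires locating the most recent checkpoint first. The sketch below shows one way that lookup might work, assuming checkpoints are saved in `checkpoint-<step>` directories, which is the Hugging Face Trainer's default naming. The `find_latest_checkpoint` helper is a hypothetical name, not a Forge function.

```python
import re
from pathlib import Path
from typing import Optional

# Matches directory names such as "checkpoint-500" and captures the step.
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint directory with the highest step, or None.

    Hypothetical helper for illustration; assumes `checkpoint-<step>` naming.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            # Compare steps numerically so checkpoint-1000 beats checkpoint-900.
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, child
    return best_path
```

The returned path could then be passed as `resume_from_checkpoint` to `Trainer.train`. Transformers also ships a similar utility, `get_last_checkpoint` in `transformers.trainer_utils`. Whether Forge uses it is not stated here.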