SFT (Supervised Fine-Tuning) means continuing to train a pretrained model on labeled task data.
Core idea
- Start with a pretrained model.
- Use labeled input-output pairs for a target task.
- Optimize a supervised objective (for classification, usually cross-entropy).
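To make "labeled input-output pairs" concrete, here is a minimal sketch of what such pairs might look like for a 3-way NLI-style task. The premise/hypothesis texts below are invented for illustration, not real MNLI rows:

```python
# Toy labeled input-output pairs for a 3-way NLI-style task.
# The texts are illustrative examples, not drawn from the actual MNLI data.
LABELS = ["entailment", "neutral", "contradiction"]

dataset = [
    {"premise": "A man is playing a guitar.",
     "hypothesis": "A person is making music.",
     "label": "entailment"},
    {"premise": "A man is playing a guitar.",
     "hypothesis": "The man is a professional musician.",
     "label": "neutral"},
    {"premise": "A man is playing a guitar.",
     "hypothesis": "Nobody is playing an instrument.",
     "label": "contradiction"},
]

# Supervised fine-tuning consumes these as (input, target-class-index) pairs.
pairs = [((ex["premise"], ex["hypothesis"]), LABELS.index(ex["label"]))
         for ex in dataset]
print([y for _, y in pairs])
```

The string labels are mapped to fixed class indices once, so the model's 3-way output head and the loss always agree on which index means which class.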
Objective
- Transfer general language knowledge into task-specific behavior.
- Improve task metrics with relatively little additional training.
Typical training signal
- Input: task examples.
- Target: human-provided labels.
- Loss: cross-entropy between the model's predicted class distribution and the gold label.
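The loss above can be computed by hand for a single example. A minimal sketch for one 3-class prediction (the probabilities are made up for illustration):

```python
import math

# Model's predicted distribution over the three NLI classes
# (illustrative numbers, not real model output).
probs = [0.7, 0.2, 0.1]   # entailment, neutral, contradiction
gold = 0                  # human label: entailment (class index 0)

# Cross-entropy for one example: negative log-probability of the gold class.
loss = -math.log(probs[gold])
print(round(loss, 4))     # -> 0.3567
```

The better the model's probability on the gold class, the smaller the loss; a perfect prediction (probability 1.0 on the gold class) gives a loss of 0.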
Generic workflow
pretrained model
-> task dataset (labeled)
-> supervised optimization
-> fine-tuned model for the target task
In this project (MNLI)
- Base model: DistilBERT.
- Task: MNLI sentence-pair classification.
- Input: premise + hypothesis.
- Labels: entailment, neutral, contradiction.
- Output: a 3-way classifier tuned for NLI behavior.
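Putting the workflow and the MNLI setup together: below is a toy stand-in for the real pipeline, where a small softmax classifier over hand-made features plays the role of the pretrained model. All numbers, features, and data here are invented for illustration; the actual project would fine-tune DistilBERT on MNLI with a deep-learning library rather than this hand-rolled loop:

```python
import math
import random

random.seed(0)
C, D = 3, 4  # classes (entailment/neutral/contradiction), toy feature dimension

# "Pretrained" weights: a random stand-in for a pretrained encoder plus a fresh head.
W = [[random.uniform(-0.1, 0.1) for _ in range(D)] for _ in range(C)]

# Toy labeled data: (feature vector, gold class index). Invented for illustration.
data = [([1, 0, 0, 1], 0), ([0, 1, 0, 1], 1), ([0, 0, 1, 1], 2),
        ([1, 0, 1, 1], 0), ([0, 1, 1, 1], 1)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x):
    # Linear scores per class, turned into a probability distribution.
    return softmax([sum(W[c][j] * x[j] for j in range(D)) for c in range(C)])

def avg_loss():
    # Mean cross-entropy over the labeled dataset.
    return sum(-math.log(forward(x)[y]) for x, y in data) / len(data)

loss_before = avg_loss()

# Supervised fine-tuning: gradient descent on cross-entropy over labeled pairs.
lr = 0.5
for _ in range(200):
    for x, y in data:
        p = forward(x)
        for c in range(C):
            g = p[c] - (1.0 if c == y else 0.0)  # dLoss/dlogit_c for softmax + CE
            for j in range(D):
                W[c][j] -= lr * g * x[j]

loss_after = avg_loss()
print(loss_before > loss_after)  # fine-tuning should reduce the training loss
```

The shape of the loop is the point: start from existing weights, iterate over labeled pairs, and update against the cross-entropy gradient until the task loss drops. The real DistilBERT run differs only in scale and in using minibatched, library-provided optimization.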
Why SFT is useful
- Simple and stable optimization.
- Clear objective and evaluation.
- Strong baseline before trying more advanced paradigms.
Limitations
- Limited by label quality and label coverage.
- Can overfit narrow data distributions.
- Usually does not encode preference-style behavior by itself.
When to use
- You have reliable labeled data.
- The task objective is explicit and measurable.
- You need a strong, fast baseline.