Training Your First LLM with SFT
This walkthrough takes you through every step of the wizard to train a language model using Supervised Fine-Tuning (SFT). SFT is the most common way to teach a model to follow instructions.

Before You Start
Make sure you have:
- AITraining installed (`pip install aitraining`; shown below)
- At least 8GB of RAM (16GB recommended)
- A GPU (helpful but not required; Apple Silicon works great!)
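For reference, the install step from the checklist, run in a terminal:

```bash
# Install AITraining (the same command as in the checklist above)
pip install aitraining

# Optional: confirm the install and check the version
pip show aitraining
```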
Step 0: Launch the Wizard
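The launch command isn't shown in this extract. As a minimal sketch, assuming the package installs a console command named `aitraining` whose default interactive mode is the wizard (check the project's docs if your version differs):

```bash
# Assumption: `aitraining` is the console command installed by the package,
# and running it with no arguments starts the interactive wizard.
aitraining
```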
Step 1: Choose Trainer Type
Type `1` and press Enter to select LLM training.
Step 2: Choose Training Method
Type `1` and press Enter to select SFT.
`default` and `sft` are identical - they use the same training code. `default` is just the fallback if no trainer is specified.

What Do These Mean?
| Trainer | When to Use |
|---|---|
| SFT / default | Teaching the model to follow instructions. You have examples of good responses. Start here! |
| DPO | You have pairs of good vs bad responses for the same prompt |
| ORPO | Like DPO but works with less data |
| PPO | Advanced: using a reward model to score responses |
| Reward | Train a reward model for scoring outputs (used with PPO) |
| Distillation | Transfer knowledge from a larger teacher model to a smaller student |
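To make the SFT-versus-preference distinction concrete, here is a rough sketch of the two data shapes. The column names (`prompt`, `response`, `chosen`, `rejected`) are illustrative assumptions, not necessarily the exact schema AITraining expects; only the shape of each row is the point.

```bash
# Illustrative data shapes only; the column names are assumptions.

# SFT: each row pairs a prompt with one good response.
cat > sft_example.jsonl <<'EOF'
{"prompt": "What is the capital of France?", "response": "The capital of France is Paris."}
EOF

# DPO / ORPO: each row pairs a prompt with a preferred and a rejected response.
cat > dpo_example.jsonl <<'EOF'
{"prompt": "What is the capital of France?", "chosen": "The capital of France is Paris.", "rejected": "I am not sure, possibly Lyon."}
EOF
```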
Step 3: Project Name
Type a name like `my-first-chatbot` or press Enter to accept the default.
Step 4: Model Selection
This is the most important step. The wizard shows trending models from HuggingFace.

Choosing the Right Model Size
I have a MacBook (8-16GB RAM)

Use `/filter` then `S` for small models.
Recommended: `google/gemma-3-270m` or `meta-llama/Llama-3.2-1B`
These will train in 15-30 minutes on Apple Silicon.

I have a gaming PC (RTX 3060/3070, 8-12GB VRAM)

Use `/filter` then `S` or `M`.
Recommended: `google/gemma-2-2b` or `meta-llama/Llama-3.2-3B`
Enable quantization later for larger models.

I have a workstation (RTX 3090/4090, 24GB+ VRAM)

Any model up to 10B works well.
Recommended: `meta-llama/Llama-3.1-8B` or `mistralai/Mistral-7B-v0.3`

I have a cloud GPU (A100, H100)

Go big!
Recommended: `meta-llama/Llama-3.1-70B` with quantization

Base Model vs Instruction-Tuned
When selecting a model, you’ll see two types:

| Model Name | Type | When to Use |
|---|---|---|
| `google/gemma-2-2b` | Base (pretrained) | General purpose, learns your specific style |
| `google/gemma-2-2b-it` | Instruction-tuned (IT) | Already follows instructions, fine-tune further |
| `meta-llama/Llama-3.2-1B` | Base | Clean slate for your use case |
| `meta-llama/Llama-3.2-1B-Instruct` | Instruction-tuned | Already helpful, refine it |
Rule of thumb: Use base models if you want full control. Use instruction-tuned models (`-it`, `-Instruct`) if you want a head start.

Selecting Your Model
Option A: Type a number to select from the list.

Step 5: Dataset Configuration
Understanding Dataset Size
Dataset Selection Options
Use a pre-built dataset (easiest).

Dataset Format Analysis
The wizard automatically analyzes your dataset. Type `y` to enable automatic conversion; this ensures your data works correctly with the model’s chat template.
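As a sketch of what a chat-template-friendly row can look like, here is a single conversation in a `messages`-style column. This follows a common convention; the exact column name and structure the wizard's conversion produces may differ.

```bash
# One conversation per line. The "messages" column name and role/content
# structure are assumptions based on common chat-template conventions.
cat > chat_example.jsonl <<'EOF'
{"messages": [{"role": "user", "content": "Summarize SFT in one sentence."}, {"role": "assistant", "content": "SFT fine-tunes a model on prompts paired with good responses so it learns to follow instructions."}]}
EOF
```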
Train/Validation Splits
The wizard trains on the `train` split. If your dataset has a separate evaluation split (such as `validation` or `test`), enter its name here. Otherwise, press Enter to skip.
Max Samples (Testing)
You can optionally cap the number of samples so a first run finishes quickly before you commit to the full dataset.
Step 6: Advanced Configuration (Optional)
When to Configure Advanced Options
| Situation | What to Change |
|---|---|
| Training is too slow | Enable LoRA (`peft=True`) to train far fewer parameters |
| Out of memory | Reduce `batch_size` or enable quantization |
| Model isn’t learning | Adjust `lr` (the learning rate) |
| Want to track training | Enable W&B logging |
Step 7: Review and Start
What Happens Next
- The model downloads (first time only)
- The dataset loads and converts
- Training begins with progress updates
- W&B LEET panel shows real-time metrics (if enabled)
- Your trained model saves to the project folder
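Once the run finishes, you can sanity-check that the project folder was written. The folder name below matches the project name from Step 3; the exact files inside (weights, tokenizer files, adapter files) depend on the trainer and whether LoRA was enabled.

```bash
# List the output folder created for the project named in Step 3.
ls -lh ./my-first-chatbot
```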
Testing Your Model
After training completes, open http://localhost:7860/inference and load your model from `./my-first-chatbot` to test it!
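If you prefer the terminal, you can open that page directly (assuming the AITraining UI is already running on port 7860):

```bash
# macOS; on Linux use xdg-open instead of open
open http://localhost:7860/inference
```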
Common Issues
Out of memory error
- Use a smaller model (filter by size)
- Enable LoRA in advanced options
- Reduce batch size
- Enable quantization (`int4`)
Model not learning (loss stays high)
- Check your dataset format
- Try a higher learning rate
- Ensure your data has the right columns
Training is very slow
- Enable mixed precision (`bf16`) in advanced options
- Use a smaller dataset first
- Enable LoRA