Hyperparameters
Hyperparameters control how your model learns. Think of them as the settings that shape your training run.

The Essential Three
Learning Rate
How big a step the optimizer takes when updating the model.

- Too high (0.01): the model jumps around and never converges
- Too low (0.00001): training takes forever
- Just right (0.00002): steady improvement

Typical ranges:

- Fine-tuning: 2e-5 to 5e-5
- Training from scratch: 1e-4 to 1e-3
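To make the three regimes concrete, here is a tiny, self-contained sketch of plain gradient descent on a quadratic. It is purely illustrative; the exact thresholds depend on your loss surface:

```python
# Toy demo (not real training code): gradient descent on f(x) = 100 * x**2,
# whose gradient is 200 * x. For this curvature, steps above 2/200 = 0.01 diverge.
def descend(lr, steps=50, x=1.0):
    for _ in range(steps):
        x -= lr * 200 * x
    return x

print(descend(0.011))  # too high: |x| grows every step (diverges)
print(descend(1e-5))   # too low: barely moves from the start point
print(descend(0.004))  # in range: shrinks smoothly toward 0
```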
Batch Size
How many examples are processed before each weight update.

- Small (8): more updates, less stable, needs less memory
- Large (128): fewer updates, more stable, needs more memory

Rough sizing by hardware:

- Limited GPU: 8-16
- Good GPU: 32-64
- Multiple GPUs: 128+
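In practice, batch size is usually just a data-loader argument. A minimal sketch, assuming PyTorch (the tensor stands in for a real dataset):

```python
import torch
from torch.utils.data import DataLoader

train_dataset = torch.arange(1024)  # stand-in for 1024 real examples
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
print(len(train_loader))  # 32 batches -> 32 weight updates per epoch
```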
Epochs
How many passes to make through your entire dataset.

- Too few (1): underfitting, the model hasn't learned enough
- Too many (100): overfitting, the model has memorized the training data
- Just right (3-10): a good balance
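Rather than fixing the epoch count blindly, many setups watch validation loss and stop once it turns upward. A toy sketch; the loss values are fabricated to show the pattern, not real results:

```python
# Stop when validation loss has risen for `patience` consecutive epochs.
val_losses = [2.1, 1.4, 1.1, 1.0, 1.02, 1.08, 1.15]  # made-up example numbers
best, patience, bad = float("inf"), 2, 0
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best:
        best, bad = loss, 0
    else:
        bad += 1
        if bad >= patience:
            print(f"stopping at epoch {epoch}: validation loss is rising")
            break
```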
Secondary Settings
Warmup Steps
Gradually increase the learning rate at the start of training instead of starting at full strength; this stabilizes the earliest updates.
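A minimal sketch of linear warmup (the step count here is an illustrative assumption):

```python
# Ramp the learning rate from 0 to the target over warmup_steps, then hold.
# Real schedules often decay afterwards.
target_lr, warmup_steps = 2e-5, 100  # warmup_steps value is an assumption

def lr_at(step):
    if step < warmup_steps:
        return target_lr * step / warmup_steps
    return target_lr

for step in (0, 50, 100, 500):
    print(step, lr_at(step))
```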
Weight Decay

Regularization that keeps weights from growing too large.

- Default: 0.0 (for LLM fine-tuning)
- No regularization: 0
- Strong regularization: 0.1
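Weight decay is typically an optimizer argument. A minimal sketch, assuming PyTorch's AdamW; the linear layer stands in for a real model:

```python
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,
    weight_decay=0.0,  # 0.0 for LLM fine-tuning; try 0.1 for strong regularization
)
```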
Gradient Accumulation
Simulate larger batches on limited hardware by accumulating gradients over several small batches before each weight update, as in the sketch below.
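A minimal sketch, assuming PyTorch (the sizes are illustrative):

```python
# Four micro-batches of 8 produce one weight update, mimicking a batch of 32.
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
accum_steps = 4

for step in range(8):
    x = torch.randn(8, 10)           # micro-batch of 8 examples
    loss = model(x).pow(2).mean()
    (loss / accum_steps).backward()  # scale so accumulated grads average out
    if (step + 1) % accum_steps == 0:
        optimizer.step()             # one update per 4 micro-batches
        optimizer.zero_grad()
```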
Task-Specific Defaults

Text Classification
Language Model Fine-tuning
Image Classification
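As one rough starting point for these three tasks, the values below extrapolate from the ranges earlier on this page; treat them as illustrative assumptions, not authoritative per-task defaults:

```python
# Illustrative starting points only -- assumptions based on the ranges above,
# not official per-task defaults.
TASK_DEFAULTS = {
    "text_classification":  {"learning_rate": 2e-5, "batch_size": 32, "epochs": 3},
    "llm_fine_tuning":      {"learning_rate": 2e-5, "batch_size": 16, "epochs": 3},
    "image_classification": {"learning_rate": 1e-4, "batch_size": 64, "epochs": 10},
}
```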
When to Adjust
Learning rate too high?

- Loss explodes or becomes NaN
- Accuracy jumps around wildly
- Training never converges

Learning rate too low?

- Loss barely decreases
- Training takes forever
- Performance gets stuck at a poor level

Batch size problems?

- Out of memory → reduce batch size
- Training unstable → increase batch size
- Memory still limited → use gradient accumulation (see the sketch below)
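One common pattern ties these last rules together: on an out-of-memory error, halve the batch size and double gradient accumulation so the effective batch size stays constant. A sketch assuming PyTorch, with a hypothetical train() function standing in for your training loop:

```python
# Keep batch_size * accum_steps constant while shrinking the per-step batch.
# train() is a hypothetical stand-in, not a real API.
import torch

batch_size, accum_steps = 64, 1
while batch_size >= 8:
    try:
        train(batch_size, accum_steps)
        break
    except torch.cuda.OutOfMemoryError:
        batch_size //= 2
        accum_steps *= 2
```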
Quick Start Values
Not sure where to start? Try the starting values sketched below.
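These pull together the recommendations from earlier sections into one place, shown as a plain dict so they stay framework-agnostic; the warmup value is an illustrative assumption:

```python
# Quick-start values drawn from the guidance above (warmup_steps is an
# assumption; tune everything against your validation set).
QUICK_START = {
    "learning_rate": 2e-5,             # fine-tuning range: 2e-5 to 5e-5
    "batch_size": 16,                  # 8-16 on a limited GPU
    "epochs": 3,                       # 3-10 is the usual sweet spot
    "warmup_steps": 100,               # assumption: a short warmup
    "weight_decay": 0.0,               # default for LLM fine-tuning
    "gradient_accumulation_steps": 1,  # raise this if memory-limited
}
```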
Evaluation Settings

Control when and how your model is evaluated during training:

| Parameter | Description | Default |
|---|---|---|
| `eval_strategy` | When to evaluate (`epoch`, `steps`, `no`) | `epoch` |
| `eval_batch_size` | Batch size for evaluation | 8 |
| `use_enhanced_eval` | Enable advanced metrics (BLEU, ROUGE, etc.) | `False` |
| `eval_metrics` | Metrics to compute (comma-separated) | `perplexity` |
| `eval_save_predictions` | Save model predictions | `False` |
| `eval_benchmark` | Run a standard benchmark (`mmlu`, `hellaswag`, `arc`, `truthfulqa`) | `None` |
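Put together, an evaluation configuration using these parameters might look like the sketch below; the dict format is illustrative, so check your trainer's docs for the exact syntax:

```python
# Evaluation settings sketch using the parameter names from the table above;
# the surrounding dict is illustrative, not a specific API.
eval_config = {
    "eval_strategy": "epoch",            # evaluate at the end of each epoch
    "eval_batch_size": 8,
    "use_enhanced_eval": True,           # enables BLEU/ROUGE-style metrics
    "eval_metrics": "perplexity,rouge",  # comma-separated list
    "eval_save_predictions": False,
    "eval_benchmark": "mmlu",            # or hellaswag, arc, truthfulqa
}
```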
Pro Tips
- Start with defaults: don't overthink it initially
- Change one thing at a time: it's easier to see what helped
- Log everything: track what works for your data
- Use a validation set: monitor for overfitting