# Loading Models
The chat interface can load models from local paths or Hugging Face.

## Loading a Local Model
After training with AITraining, your model is saved locally. To load it (a scripted equivalent is sketched after the list):

- Find your model path (e.g., `./my-project/`)
- Enter the path in the model selector
- Click “Load Model”
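If you want to script the same load outside the UI, here is a minimal sketch using the Hugging Face `transformers` library (the `./my-project/` path is the example above; this assumes a standard causal language model checkpoint):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to the locally saved model directory (example path from above)
model_path = "./my-project/"

# Load the tokenizer and weights from the local directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```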
### What to Look For
Your trained model directory should contain:

- `config.json` - Model configuration
- `model.safetensors` or `pytorch_model.bin` - Model weights
- `tokenizer.json` and related tokenizer files
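As a quick sanity check, a small sketch that verifies these files exist (the file list mirrors the one above; exact tokenizer file names can vary by tokenizer type):

```python
from pathlib import Path

model_dir = Path("./my-project/")  # example path from above

required = ["config.json", "tokenizer.json"]
weight_files = ["model.safetensors", "pytorch_model.bin"]  # either one suffices

missing = [name for name in required if not (model_dir / name).exists()]
if not any((model_dir / name).exists() for name in weight_files):
    missing.append(" or ".join(weight_files))

if missing:
    print("Missing:", missing)
else:
    print("Directory looks complete.")
```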
## Loading from Hugging Face
Load any compatible model from the Hugging Face Hub:

- `meta-llama/Llama-3.2-1B` - Small, fast Llama
- `mistralai/Mistral-7B-v0.1` - Efficient 7B model
- `google/gemma-2b` - Google’s Gemma
Large models require significant GPU memory. A 7B model needs ~14GB VRAM.
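A sketch of loading a Hub model programmatically with `transformers` (loading in `torch.float16` keeps weights at ~2 bytes per parameter, in line with the memory figures below; `device_map="auto"` assumes the `accelerate` package is installed, and gated models such as `meta-llama/Llama-3.2-1B` also require a Hugging Face access token):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # any compatible Hub model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights: ~2 bytes per parameter
    device_map="auto",          # place weights on the available GPU(s)
)
```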
## Loading LoRA Adapters
PEFT/LoRA models are detected and loaded automatically. Simply provide the path to your adapter directory, and the interface:

- Detects the `adapter_config.json` file
- Loads the base model specified in the adapter config
- Applies the LoRA adapters
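The same three steps can be reproduced with the `peft` library; a minimal sketch, assuming the adapter was saved with PEFT's `save_pretrained`:

```python
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_path = "./my-project/"  # directory containing adapter_config.json

# 1. Read adapter_config.json to find the base model
config = PeftConfig.from_pretrained(adapter_path)

# 2. Load the base model named in the adapter config
base = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# 3. Apply the LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base, adapter_path)
```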
## Memory Requirements
| Model Size | Approximate VRAM |
|---|---|
| 1B | ~2GB |
| 3B | ~6GB |
| 7B | ~14GB |
| 13B | ~26GB |
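These figures are consistent with half-precision (fp16/bf16) weights at roughly 2 bytes per parameter. A back-of-the-envelope helper (the 2-bytes assumption covers weights only and ignores activation and KV-cache overhead):

```python
def estimate_vram_gb(num_params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Rough VRAM estimate for holding model weights in memory (fp16 default)."""
    # billions of params * bytes per param = gigabytes (1e9 * bytes / 1e9)
    return num_params_billions * bytes_per_param

# A 7B model in fp16: ~14 GB, matching the table above
print(estimate_vram_gb(7))  # 14.0
```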
## Switching Models
To switch to a different model:

- Enter the new model path
- Click “Load Model”
- The previous model is unloaded automatically
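If you manage models in your own code, the unload step is worth doing explicitly; a sketch of the common pattern (the `gc.collect()` / `empty_cache()` calls are a convention for reclaiming GPU memory, not something the interface requires):

```python
import gc
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# ... use the model ...

# Unload: drop the reference, then release cached GPU memory
del model
gc.collect()
torch.cuda.empty_cache()

# Load the replacement model
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
```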
## Troubleshooting
### Model not found
Check:

- The path is correct and exists
- For Hugging Face models, the model ID is spelled correctly
- You have access (some models require authentication)
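You can run these checks yourself with `huggingface_hub`; a sketch (the `model_info` call raises an HTTP error when a repo is missing, or when it is gated and you lack a valid token):

```python
import os
from huggingface_hub import model_info
from huggingface_hub.utils import HfHubHTTPError

name = "meta-llama/Llama-3.2-1B"  # local path or Hub model ID

if os.path.isdir(name):
    print("Local path exists.")
else:
    try:
        model_info(name)  # checks the ID and your access rights
        print("Hub model is reachable.")
    except HfHubHTTPError as err:
        print(f"Not found or not accessible: {err}")
```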
### Out of memory
Try:

- A smaller model
- A quantized version (see the sketch below)
- Closing other applications that use the GPU
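For the quantized option, a sketch using `transformers` with `bitsandbytes` 4-bit quantization (assumes the `bitsandbytes` and `accelerate` packages are installed; at ~0.5 bytes per parameter, a 7B model's weights fit in roughly 4GB):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4 bits
    bnb_4bit_quant_type="nf4",             # NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=quant_config,
    device_map="auto",
)
```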
### Slow loading
The first load downloads the model weights; subsequent loads read from the local cache and are faster. Large models (7B+) take 30-60 seconds to load.
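To move the download out of the interactive session, you can pre-fetch the weights into the local Hugging Face cache; a sketch with `huggingface_hub`:

```python
from huggingface_hub import snapshot_download

# Download (or verify) every file in the repo into the local HF cache;
# the next load in the chat interface then reads from disk only.
snapshot_download("mistralai/Mistral-7B-v0.1")
```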