Loading Models

The chat interface can load models from local paths or Hugging Face.

Loading a Local Model

After training with AITraining, your model is saved locally. To load it:
  1. Find your model path (e.g., ./my-project/)
  2. Enter the path in the model selector
  3. Click “Load Model”
Model path: ./my-project
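
Loading a local checkpoint this way is roughly equivalent to a standard transformers load. A minimal sketch, assuming the checkpoint is transformers-compatible (the path matches the example above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Point both the tokenizer and the model at the local checkpoint directory.
model_path = "./my-project"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```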

What to Look For

Your trained model directory should contain:
  • config.json - Model configuration
  • model.safetensors or pytorch_model.bin - Model weights
  • tokenizer.json and related tokenizer files
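
If you are not sure a directory is a complete checkpoint, a quick check along these lines can help. This is only a sketch that looks for the files listed above (sharded checkpoints may use split weight files instead):

```python
from pathlib import Path

model_dir = Path("./my-project")

has_config = (model_dir / "config.json").exists()
has_weights = (
    (model_dir / "model.safetensors").exists()
    or (model_dir / "pytorch_model.bin").exists()
)
has_tokenizer = (model_dir / "tokenizer.json").exists()

print(f"config: {has_config}, weights: {has_weights}, tokenizer: {has_tokenizer}")
```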

Loading from Hugging Face

Load any compatible model from the Hugging Face Hub:
Model path: meta-llama/Llama-3.2-1B
Popular models:
  • meta-llama/Llama-3.2-1B - Small, fast Llama
  • mistralai/Mistral-7B-v0.1 - Efficient 7B model
  • google/gemma-2b - Google’s Gemma
Large models require significant GPU memory; a 7B model needs roughly 14GB of VRAM at 16-bit precision.
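
Loading from the Hub works the same way, just with a model ID instead of a path. A minimal sketch, assuming transformers (with accelerate installed for device_map) and a CUDA GPU; 16-bit weights keep the 1B model at roughly 2GB:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# bfloat16 halves the memory footprint compared to float32;
# device_map="auto" places the weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```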

Loading LoRA Adapters

PEFT/LoRA adapters are detected and loaded automatically. Simply provide the path to your adapter directory:
Model path: ./my-lora-model
The chat interface automatically:
  1. Detects the adapter_config.json file
  2. Loads the base model specified in the adapter config
  3. Applies the LoRA adapters
If you trained with --merge-adapter (the default), your model is already merged and loads like any standard model.
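
As a rough sketch, the same three-step sequence can be written with the peft library (the directory name matches the example above; the chat interface's internal code may differ):

```python
import json
from pathlib import Path

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_dir = Path("./my-lora-model")

# 1. Read the adapter config to find the base model it was trained on.
adapter_config = json.loads((adapter_dir / "adapter_config.json").read_text())
base_id = adapter_config["base_model_name_or_path"]

# 2. Load the base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# 3. Apply the LoRA adapters on top of the base weights.
model = PeftModel.from_pretrained(base_model, str(adapter_dir))
```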

Memory Requirements

Model Size | Approximate VRAM (16-bit)
-----------|--------------------------
1B         | ~2GB
3B         | ~6GB
7B         | ~14GB
13B        | ~26GB
Use quantized models (int4/int8) to reduce memory by 2-4x.
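
As a rough illustration, 4-bit loading with transformers and bitsandbytes (assuming both are installed and a CUDA GPU is available) looks like this:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization cuts weight memory to roughly a quarter of fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```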

Switching Models

To switch to a different model:
  1. Enter the new model path
  2. Click “Load Model”
  3. The previous model is unloaded
Note: Conversation history is cleared when you switch models.

Troubleshooting

If a model fails to load, check:
  • The path is correct and exists
  • For Hugging Face models, the model ID is spelled correctly
  • You have access to the model (some models are gated and require authentication; see the sketch below)
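
For gated models (for example the meta-llama checkpoints), authenticate with the Hub before loading. A minimal sketch using huggingface_hub; the token value is a placeholder:

```python
from huggingface_hub import login

# Paste your Hugging Face access token (placeholder below).
# Alternatively, run `huggingface-cli login` once in a terminal.
login(token="hf_your_token_here")
```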
If you run out of GPU memory, try:
  • A smaller model
  • A quantized version of the model
  • Closing other applications that use the GPU
First load downloads model weights. Subsequent loads are faster. Large models (7B+) take 30-60 seconds to load.

Next Steps