When to Use the Python API

The API gives you full programmatic control for building custom applications.

Best For

  • Custom applications - Build your own tools
  • Complex workflows - Multi-step pipelines
  • Dynamic configuration - Adjust on the fly
  • Integration - Connect with existing code
  • Production systems - Deploy as services

What It Looks Like

Write Python code:
from aitraining import TextClassification

trainer = TextClassification(
    model="bert-base-uncased",
    learning_rate=2e-5
)

trainer.train(data)
predictions = trainer.predict(texts)

Workflow Example

import pandas as pd
from aitraining import AutoTrainer

# Custom preprocessing
data = pd.read_csv("raw_data.csv")
data = clean_and_prepare(data)

# Dynamic configuration
config = {
    "model": get_best_model(data),
    "batch_size": calculate_batch_size(),
    "epochs": 5 if len(data) > 1000 else 10
}

# Train with user-defined callbacks
trainer = AutoTrainer(**config)
trainer.train(
    data,
    callbacks=[
        early_stopping,
        checkpoint_best,
        log_to_wandb
    ]
)

# Integrate into a web application (app, request, and jsonify come from Flask)
@app.route("/predict")
def predict():
    result = trainer.predict(request.json)
    return jsonify(result)

Advantages

  • Full control - Access everything
  • Custom logic - Your preprocessing
  • Integration - Works with any Python library
  • Dynamic - Adjust based on conditions
  • Testable - Unit test your training code (see the sketch after this list)
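
A minimal sketch of a unit test for that last point, assuming the TextClassification trainer shown above accepts a small list of (text, label) pairs; the test name and tiny dataset are illustrative:

from aitraining import TextClassification

def test_training_produces_predictions():
    # Train on a tiny, hard-coded dataset (illustrative data format)
    trainer = TextClassification(
        model="bert-base-uncased",
        learning_rate=2e-5
    )
    tiny_data = [
        ("great product, would buy again", "positive"),
        ("broke after one day", "negative"),
    ]
    trainer.train(tiny_data)

    # The trained model should return one prediction per input text
    predictions = trainer.predict(["works as advertised"])
    assert len(predictions) == 1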

Limitations

  • More code - You write the orchestration
  • Complexity - Handle errors yourself
  • Python only - Not language-agnostic
  • Dependencies - Manage packages

When to Switch

Use the CLI when you:
  • Need simple automation
  • Want a language-agnostic solution
  • Prefer configuration over code
  • Work with non-Python tools

Use the UI when you:
  • Need visual feedback
  • Are teaching others
  • Are running quick experiments
  • Are exploring data

Common Use Cases

Web Service

from flask import Flask, request
from aitraining import load_model

app = Flask(__name__)
model = load_model("./trained_model")

@app.route("/api/classify", methods=["POST"])
def classify():
    text = request.json["text"]
    result = model.predict(text)
    return {"label": result}
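
During development you can serve this with app.run(); for production, run it behind a WSGI server such as gunicorn.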

Data Pipeline

def training_pipeline(df):
    # Custom cleaning
    df = remove_outliers(df)
    df = normalize_features(df)

    # Conditional training
    if df.shape[0] > 10000:
        model = "large-model"
    else:
        model = "small-model"

    # Train
    trainer = AutoTrainer(model=model)
    trainer.train(df)

    return trainer

A/B Testing

models = {}

# Train variants
for config in experiments:
    trainer = create_trainer(config)
    trainer.train(data)
    models[config.name] = trainer

# Compare
results = evaluate_all(models, test_data)
best = select_best(results)

Custom Callbacks

class CustomCallback:
    def __init__(self, trainer, threshold=0.1):
        self.trainer = trainer
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs):
        # Notify once the loss drops below the target threshold
        if logs["loss"] < self.threshold:
            send_notification("Training going well!")

        # Halve the learning rate when the heuristic says to
        if should_adjust_lr(logs):
            self.trainer.learning_rate *= 0.5

trainer.train(data, callbacks=[CustomCallback(trainer)])

Tips for API Users

  1. Handle exceptions - Training can fail (see the sketch after this list)
  2. Add logging - Track what happens
  3. Use type hints - Catch errors early
  4. Write tests - Ensure reliability
  5. Document code - Others will use it
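
A minimal sketch of tips 1 and 2 together, assuming the AutoTrainer API used in the workflow example; the broad except clause is a placeholder, since the library's specific exception types may differ:

import logging

from aitraining import AutoTrainer

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("training")

def train_with_guardrails(data):
    trainer = AutoTrainer(model="bert-base-uncased")
    try:
        logger.info("Starting training on %d rows", len(data))
        trainer.train(data)
        logger.info("Training finished")
    except Exception:
        # Placeholder: narrow this to the library's own error types if available
        logger.exception("Training failed")
        raise
    return trainer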

API-Exclusive Features

Things only the API can do:
  • Custom callbacks during training
  • Dynamic model selection
  • Complex data pipelines
  • Embedding in applications
  • Programmatic hyperparameter tuning (sketched after this list)
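
Programmatic tuning, for example, is just a loop over candidate configurations. This sketch assumes the AutoTrainer constructor used elsewhere on this page plus hypothetical train_data, validation_data, and evaluate_model names:

from itertools import product

from aitraining import AutoTrainer

learning_rates = [1e-5, 2e-5, 5e-5]
batch_sizes = [16, 32]

best_score, best_trainer = float("-inf"), None

# Try every combination and keep the best-scoring trainer
for lr, bs in product(learning_rates, batch_sizes):
    trainer = AutoTrainer(model="bert-base-uncased", learning_rate=lr, batch_size=bs)
    trainer.train(train_data)
    score = evaluate_model(trainer, validation_data)  # hypothetical scoring helper
    if score > best_score:
        best_score, best_trainer = score, trainer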

Essential Patterns

# Context manager for resources
with AITraining() as trainer:
    trainer.train(data)
    # Resources are cleaned up automatically

# Async training
async def train_async():
    await trainer.train_async(data)

# Streaming predictions
for prediction in trainer.predict_stream(texts):
    process(prediction)

# Model composition
ensemble = Ensemble([
    model1,
    model2,
    model3
])

Integration Examples

# With pandas
df = pd.read_csv("data.csv")
trainer.train(df)

# With scikit-learn
from sklearn.model_selection import train_test_split
X_train, X_test = train_test_split(data)

# With Weights & Biases
import wandb
wandb.init(project="my-training")
trainer.train(data, callbacks=[WandbCallback()])

# With FastAPI
@app.post("/train")
async def train_endpoint(data: TrainingData):
    result = await trainer.train_async(data)
    return {"model_id": result.id}

Next Steps