
Overview

Whether you are looking to classify text, answer questions, interact with internal tools, or solve other language tasks, our step-by-step workflow will take you from initial concept to production-ready model. Let’s dive in!

Step 1: Setup

First, authenticate with the distil labs platform:

distil login

Register a new model to track your experiment:

# Returns your model ID
distil model create my-model-name

You can list all your models with:

distil model list

Step 2: Task selection and data preparation


Begin by identifying the specific task you want your model to perform. Different tasks require different approaches to data preparation and model configuration.

Learn more about task selection →

If you have production traces (logs of real interactions with an LLM), upload them to train directly from real-world usage data. This is a two-step process: traces are first uploaded as a PreparedTraces resource, then run through the trace processing pipeline, which automatically filters, relabels, and splits them into training and test data.

Your traces directory should contain:

| File | Format | Required | Description |
| --- | --- | --- | --- |
| traces.jsonl | JSONL | Yes | Production traces in Langfuse or OpenAI messages format |
| job_description.json | JSON | Yes | Task objectives and configuration |
| config.yaml | YAML | Yes | Training and trace processing parameters |
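
To illustrate what a trace record can look like, here is a minimal sketch that writes one traces.jsonl entry in OpenAI messages format. The exact fields the pipeline accepts are an assumption here; the trace processing guide is the authoritative reference for the schema.

```python
import json

# A single production trace in OpenAI messages format (illustrative;
# the exact schema the platform accepts is documented in the trace
# processing guide).
trace = {
    "messages": [
        {"role": "system", "content": "Classify the sentiment of the text."},
        {"role": "user", "content": "The onboarding flow was painless."},
        {"role": "assistant", "content": "positive"},
    ]
}

# traces.jsonl holds one JSON object per line.
with open("traces.jsonl", "w") as f:
    f.write(json.dumps(trace) + "\n")
```

A real traces file would contain one such line per logged interaction.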

Upload your traces to the model:

distil model upload-traces <model-id> --data ./traces

Learn more about trace processing →

If you don’t have production traces, prepare a small structured dataset with labeled examples instead. A training job requires the following files in a directory:

| File | Format | Required | Description |
| --- | --- | --- | --- |
| job_description.json | JSON | Yes | Task objectives and configuration |
| train.csv | CSV or JSONL | Yes | 20+ labeled (question, answer) pairs |
| test.csv | CSV or JSONL | Yes | Held-out evaluation set |
| config.yaml | YAML | Yes | Training hyperparameters |
| unstructured.csv | CSV or JSONL | No | Text documents relating to your problem domain which we may use for synthetic data generation |
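
As a concrete sketch of the labeled training file, the snippet below writes a few (question, answer) pairs to train.csv. The column names are an assumption; follow the data preparation guide for the exact headers your task requires.

```python
import csv

# Illustrative (question, answer) pairs; a real train.csv needs 20+ rows.
# Column names are an assumption -- see the data preparation guide.
rows = [
    ("What is the capital of France?", "Paris"),
    ("Is water wet at room temperature?", "yes"),
    ("Translate 'hello' to Spanish.", "hola"),
]

with open("train.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["question", "answer"])  # assumed header names
    writer.writerows(rows)
```

test.csv follows the same shape, holding examples the model never sees during training.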

Upload your data to the model:

distil model upload-data <model-id> --data ./data

Learn more about data preparation →

Step 3: Teacher evaluation

Before training your specialized small model, validate whether a large language model can accurately solve your task with the provided examples. If the teacher model can solve the task, the student model will be able to learn from it effectively.

Learn about teacher evaluation →

# Start teacher evaluation
distil model run-teacher-evaluation <model-id>

# Check status
distil model teacher-evaluation <model-id>

Step 4: Model training

Once your teacher evaluation shows satisfactory results, train your specialized small language model using knowledge distillation.

Understand the model training process →

# Start training
distil model run-training <model-id>

# Check status
distil model training <model-id>

Once training is complete, download your model:

distil model download <model-id>

Step 5: Deployment

Deploy your trained model locally or using distil labs inference for immediate integration with your applications.

Explore deployment options →

Use the distil CLI with llama-cpp as the inference backend:

distil model deploy local <model-id>

Once running, get a ready-to-run invocation script with distil model invoke:

distil model invoke <model-id>

This outputs a command using uv that you can copy and run directly:

uv run $PATH_TO_CLIENT --question "Your question here"

For question-answering models that require context, add the --context flag:

uv run $PATH_TO_CLIENT --question "Your question here" --context "Your context here"

Alternatively, deploy your model on distil-managed remote infrastructure using distil labs inference:

distil model deploy remote <model-id>

The CLI will provision your deployment and display the endpoint URL, API key, and a client script you can use to query your model.

Once deployed, you can also use distil model invoke to get a ready-to-run invocation script for your remote deployment:

distil model invoke <model-id>

You’ve successfully trained and deployed a specialized small language model! For more details, explore: