
# Data preparation

There are two ways to prepare data for training with distil labs:

- **Trace processing** — use this if you already have production logs from an LLM-powered application. Upload your traces and our pipeline handles the rest.
- **Minimal dataset** — use this if you don’t have production traces but can provide a small set of labeled examples for your task.

## Trace processing

If you have production traces (logs of real interactions with an LLM), you can upload them and our pipeline will automatically filter, relabel, and convert them into training and test data. This is the fastest way to get started if you already have an LLM-powered application in production.

Your traces directory needs three files:

| File | Format | Description |
| --- | --- | --- |
| `traces.jsonl` | JSONL | Production traces in the OpenAI messages format |
| `job_description.json` | JSON | Task description defining what the model should do |
| `config.yaml` | YAML | Training config with `trace_processing` parameters |
```bash
distil model upload-traces <model-id> --data ./traces
```
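Each line of `traces.jsonl` is one JSON object in the OpenAI messages format. A minimal sketch of a single trace follows; the roles and content shown are illustrative, not a required schema:

```json
{"messages": [{"role": "system", "content": "You are a support assistant."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "Go to Settings > Security and click Reset password."}]}
```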

Learn more about trace processing →

## Minimal dataset

If you don’t have production traces, prepare a small structured dataset with labeled examples. You only need a few dozen high-quality examples that capture the essence of your task.

Your data directory needs the following files:

| File | Format | Required | Description |
| --- | --- | --- | --- |
| `job_description.json` | JSON | Yes | Task description defining what the model should do |
| `train.csv` | CSV or JSONL | Yes | 20+ labeled (question, answer) pairs |
| `test.csv` | CSV or JSONL | Yes | Held-out evaluation set |
| `config.yaml` | YAML | Yes | Training hyperparameters |
| `unstructured.csv` | CSV or JSONL | No | Domain-relevant text for synthetic data generation |
```bash
distil model upload-data <model-id> --data ./data
```
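As a sketch, `train.csv` holds one labeled pair per row. The `question` and `answer` column names here are an assumption based on the (question, answer) pairing above; check the task-specific guides linked below for the exact headers your task type expects:

```csv
question,answer
"What is the return window for online orders?","30 days from delivery"
"Do you ship internationally?","Yes, to over 40 countries"
```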

For detailed formatting and structure requirements per task type, refer to: