How it works

The distil labs platform gives anyone access to state-of-the-art model fine-tuning. You don’t need to be a machine learning expert to get a highly performant model customized to your needs within a day.
Step 1

Upload your data

We need a task description, a few dozen examples of the task the model is supposed to perform, and any additional unstructured data about the problem.
Task description
Describe the task you expect the model to perform. If you are currently using an LLM, your prompt is a good starting point.
Training dataset
We need only a few dozen examples to fine-tune your model. Of course, the larger and more diverse the dataset, the better.
Additional unstructured data
This unstructured dataset guides the teacher model in generating diverse, domain-specific data. It can be documentation, unlabelled examples, or even industry literature relevant to your problem.
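To make this concrete, a labeled training set for a classification-style task might look like the following. The JSONL layout and the field names ("text", "label") are illustrative, not a required upload schema:

```python
import json

# Hypothetical labeled examples for a support-ticket routing task.
# Field names ("text", "label") are illustrative, not a required schema.
examples = [
    {"text": "I was charged twice for my subscription.", "label": "billing"},
    {"text": "The app crashes when I open settings.", "label": "bug"},
    {"text": "Can you add a dark mode?", "label": "feature_request"},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```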
Step 2

We train a small model for your task

Our key differentiator is lifting the heavy data requirement for training small language models (SLMs). We use pre-trained large language "teacher" models to train smaller, specialized "student" models based on your problem definition.

In practice, the teacher model generates synthetic data about your problem. This generated data is used to train the student model, which emulates the teacher's behavior while remaining much smaller.
Flowchart: distilling a small language model. First, upload the task description, training dataset, and additional unstructured data (e.g. docs); second, the teacher model generates and validates synthetic data; third, the student model is trained and fine-tuned. Once these steps are complete, your tuned small model is ready to use.
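Conceptually, the generation-and-validation loop looks like the sketch below. The teacher is represented by stand-in functions rather than a real LLM, and all names here are placeholders for illustration, not the platform's actual API:

```python
# A minimal sketch of the synthetic-data loop. In practice the teacher is
# a large pre-trained LLM prompted with your task description, seed
# examples, and unstructured documents.

def teacher_generate(seed: dict, document: str) -> dict:
    # Stand-in: a real teacher would produce a new, labeled example
    # grounded in the document and formatted like the seed example.
    return {"text": f"Variant of '{seed['text']}' based on: {document[:40]}",
            "label": seed["label"]}

def teacher_validate(example: dict) -> bool:
    # Stand-in: filter out malformed or low-quality synthetic examples.
    return bool(example["text"]) and bool(example["label"])

seed_examples = [{"text": "The app crashes on login.", "label": "bug"}]
documents = ["Troubleshooting guide: login and session errors ..."]

synthetic = []
for seed in seed_examples:
    for doc in documents:
        candidate = teacher_generate(seed, doc)
        if teacher_validate(candidate):
            synthetic.append(candidate)

# The student is then fine-tuned on the seed plus synthetic examples.
training_set = seed_examples + synthetic
print(f"{len(training_set)} examples ready for student fine-tuning")
```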
Knowledge distillation [Hinton et al., 2015] is a method that transfers domain-specific knowledge from a large teacher model to a small student model. This approach allows us to create small, task-specific language models that achieve performance comparable to larger models without the need to annotate thousands of examples.

We generate synthetic data and use it to train the student model with a loss function that aligns with the user task. In this process, the student model learns to emulate the teacher's target skills or domain knowledge, effectively acquiring similar capabilities.
[Hinton et al., 2015]: https://arxiv.org/abs/1503.02531
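The platform's loss is aligned with your specific task, but the classic formulation from the paper gives the flavor: the student minimizes a weighted mix of ordinary cross-entropy on hard labels and a KL term matching the teacher's temperature-softened outputs. A minimal PyTorch sketch, illustrative rather than the platform's actual implementation:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    """Classic KD objective: cross-entropy on hard labels plus KL
    divergence between temperature-softened teacher and student outputs."""
    hard_loss = F.cross_entropy(student_logits, labels)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)  # T**2 keeps gradient scale comparable across temperatures
    return alpha * hard_loss + (1 - alpha) * soft_loss

# Toy usage with random logits for a 4-class task.
student_logits = torch.randn(8, 4, requires_grad=True)
teacher_logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```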
Step 3

We evaluate your model

Once the student model is ready, we share accuracy benchmarks computed on a test set held out from your training data.

From there, you can either improve your model by iterating on the inputs (change the task description, upload more data) or deploy it.
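To give a flavor of what the benchmark measures, here is a minimal sketch of a held-out evaluation for a classification-style task; the `predict` function is a stand-in for the trained student, not a real API:

```python
import random

def predict(text: str) -> str:
    # Stand-in for the fine-tuned student model.
    return random.choice(["billing", "bug", "feature_request"])

# Hold out a slice of the labeled data before training so the benchmark
# reflects performance on examples the student has never seen.
dataset = [{"text": f"example {i}", "label": "bug"} for i in range(50)]
random.shuffle(dataset)
test_set, train_set = dataset[:10], dataset[10:]

correct = sum(predict(ex["text"]) == ex["label"] for ex in test_set)
print(f"accuracy: {correct / len(test_set):.2%}")
```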
Iterate
If the desired accuracy hasn't been reached, you can iterate on your model to improve it further: provide more context by adapting the task description, and/or upload more training examples covering the cases where the model performed poorly.
Proceed with deployment
When the desired accuracy benchmarks have been reached, you can deploy your model.
Step 4

Access the specialized model

We can share the model binaries with you so you can deploy the model on your own infrastructure. Alternatively, we can host the model for you and provide a secure API endpoint that you can query and integrate with your product.
Deploy model on own infrastructure
If you choose to self-host, we are happy to share the model weights with you, so you can have full control.
Integrate API endpoint
We can host the model for you and provide you with a secure API endpoint. This makes integrating AI into your business as easy as querying any API.
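Integration could then be as simple as the sketch below. The endpoint URL, authentication header, and payload schema are hypothetical placeholders, not distil labs' documented API:

```python
import requests

# Hypothetical endpoint, auth header, and payload schema; substitute the
# details you receive with your deployment.
response = requests.post(
    "https://api.example.com/v1/models/my-model/predict",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"input": "I was charged twice for my subscription."},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```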

Key benefits

Best of both worlds – LLMs & conventional Machine Learning

Accelerating your development process

You need a much smaller labeled training dataset than ever before; there is no need to onboard, manage, and pay for subject-matter experts to annotate data. A few dozen labeled examples instead of tens of thousands.

Reducing latency and costs of your product

Using a smaller specialized model enables deployment on cheaper and faster infrastructure. Models fine-tuned to your use case don't need as much context in the prompt, further reducing your token usage.

Local deployment

Small models can be deployed directly on mobile hardware as part of an application. This means your application does not rely on a strong network connection, which also helps ensure data privacy compliance.
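As an illustration, a small model exported to a quantized format such as GGUF can run entirely on-device with an open-source runtime like llama.cpp. A minimal sketch using the llama-cpp-python bindings, where the model file name is a placeholder:

```python
from llama_cpp import Llama

# Load a quantized student model entirely on-device; no network required.
# "student-model.gguf" is a placeholder for your exported model file.
llm = Llama(model_path="student-model.gguf", n_ctx=2048)

result = llm("Classify this ticket: 'The app crashes on login.'",
             max_tokens=16)
print(result["choices"][0]["text"])
```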

Let’s build a model for your business

Create a custom Small Language Model for your AI product