FAQ
Don’t see your question?
Write us a message at:
What is a Teacher model?

It is the model used to generate additional training data for fine-tuning. We use best-in-class open-source models as our “Teacher” models; currently, this is LLaMA 3.1 405B.
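
As a rough sketch of the idea, the teacher can be thought of as a labeling function applied to unlabeled inputs to produce synthetic training pairs for the student. The teacher call below is a hypothetical stand-in (a trivial keyword rule), not a real call to LLaMA 3.1 405B:

```python
def teacher_label(text: str) -> str:
    """Hypothetical stand-in for a call to a large teacher model
    (e.g. LLaMA 3.1 405B); here, just a trivial keyword rule."""
    return "positive" if "great" in text.lower() else "negative"

# Unlabeled inputs become synthetic (input, label) pairs that are
# later used to fine-tune the smaller student model.
unlabeled = ["Great product, works as advertised.", "Stopped working after a week."]
synthetic_dataset = [{"input": t, "label": teacher_label(t)} for t in unlabeled]
```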

What is a Student model?

This is the pre-trained language model that we specialize for your needs. The model we use varies depending on the use case – we find the smallest possible model that meets the quality threshold.

How much data do I need to train a model?

The more examples, the better, but we usually recommend at least a hundred diverse examples so that the specialized model performs well on a variety of inputs.

I have a larger dataset – does it make sense to upload more than 1000 examples?

Uploading more samples is always better, as long as there are no duplicates.

How should I write the “task description”?

You should write the task description the same way you would write an LLM prompt to solve your particular task. We have found that task descriptions that yield the best accuracy when used as in-context learning prompts usually work best on our platform.
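
For instance, a task description for a sentiment-classification task might look like the following (a hypothetical example, not one taken from the platform):

```python
# A hypothetical task description for a sentiment-classification task,
# written exactly as you would prompt an LLM to solve it.
task_description = (
    "Classify the sentiment of the customer review as 'positive' or 'negative'. "
    "Respond with exactly one of the two labels and nothing else."
)
```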

How do you evaluate the model?

We use your test set, or a part of your training set if no test set is provided, to evaluate the model with task-appropriate metrics. For example, we report accuracy and F1 score for classification.
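
For a binary classification task, the two metrics mentioned above can be sketched in plain Python (the metric definitions are standard; this is an illustration, not our actual evaluation code):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```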

Will the trained model be better than GPT-4?

We aim for the trained models to match or exceed the accuracy of modern large language models such as Llama 3.1 405B and GPT-4.

Which model sizes do you train?

We can train models up to 8B parameters as student models.

What is the latency of an SLM you fine-tune?

This largely depends on the size of the specialized model you train. Please reach out to us for details.

Which use cases do you solve?

We support the most common NLP use cases, such as classification, named entity recognition, extractive QA, function calling, and ranking. Reach out to us or book a demo to learn more, and let us know if you would like us to expand to new use cases.

Can you deploy the model on our cloud?

No, we do not provide such a service. We can either host the model for you on our infrastructure or share the model binaries with you, so you can host it anywhere.

Where are the models hosted?

We can host the model for you on our infrastructure, in any region currently supported by AWS or other major cloud providers. Alternatively, we can share the model binaries with you so you can host them anywhere.

How do you use the data we provide for the training?

We use the data to define the task during model training (see How it works for details). Once you approve the model evaluation and ask for the model to be deployed, we delete your data and tear down all training instances.

Is my data secure?

Yes. Your data is encrypted on our servers and then removed after the model is trained.