FAQ
Don’t see your question?
Write us a message at:
What is a Teacher model?

It is the model used to generate additional training data for fine-tuning. We use best-in-class open-source models as our “Teacher” models; currently, this is LLaMA 3.1 405B.
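
As a rough sketch of the idea, the teacher can be thought of as a labeling function applied to unlabeled inputs to produce synthetic training pairs for the student. The teacher call below is a hypothetical stand-in (a trivial keyword rule), not a real call to LLaMA 3.1 405B:

```python
def teacher_label(text: str) -> str:
    """Hypothetical stand-in for a call to a large teacher model
    (e.g. LLaMA 3.1 405B); here, just a trivial keyword rule."""
    return "positive" if "great" in text.lower() else "negative"

# Unlabeled inputs become synthetic (input, label) pairs that are
# later used to fine-tune the smaller student model.
unlabeled = ["Great product, works as advertised.", "Stopped working after a week."]
synthetic_dataset = [{"input": t, "label": teacher_label(t)} for t in unlabeled]
```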

What is a Student model?

This is the pre-trained language model that we specialize for your needs. The model we use varies depending on the use case – we find the smallest possible model that meets the quality threshold.

How much data do I need to train a model?

The more examples, the better, but we usually recommend at least a hundred diverse examples so that the specialized model performs well on a variety of inputs.

I have a larger dataset – does it make sense to upload more than 1000 examples?

Uploading more samples is always better, as long as there are no duplicates.

How should I write the “task description”?

You should write the task description the same way you would write an LLM prompt to solve your particular task. We have found that task descriptions that yield the best accuracy when used as in-context learning prompts usually work best on our platform.
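
For instance, a task description for a sentiment-classification task might look like the following (a hypothetical example, not one taken from the platform):

```python
# A hypothetical task description for a sentiment-classification task,
# written exactly as you would prompt an LLM to solve it.
task_description = (
    "Classify the sentiment of the customer review as 'positive' or 'negative'. "
    "Respond with exactly one of the two labels and nothing else."
)
```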

How do you evaluate the model?

We use your test set, or a part of your training set if no test set is provided, to evaluate the model with task-appropriate metrics. For example, we report accuracy and F1 score for classification.
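
For a binary classification task, the two metrics mentioned above can be sketched in plain Python (the metric definitions are standard; this is an illustration, not our actual evaluation code):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```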

Will the trained model be better than GPT-4?

We aim for the trained models to match or exceed the accuracy of modern large language models such as Llama 3.1 405B and GPT-4.

Which model sizes do you train?

We can train models up to 8B parameters as student models.

What is the latency of an SLM you fine-tune?

This largely depends on the size of the specialized model you train. Please reach out to us for details.

Which use cases do you solve?

We support the most common NLP use cases, such as classification, named entity recognition, extractive QA, function calling, and ranking. Reach out to us or book a demo to learn more, and let us know if you would like us to expand to new use cases.

Can you deploy the model on our cloud?

No, we do not provide such a service. We can either host the model for you on our infrastructure or share the model binaries with you, so you can host it anywhere.

Where are the models hosted?

We can host the model for you on our infrastructure, in any region currently supported by AWS or other major cloud providers. Alternatively, we can share the model binaries with you so you can host them anywhere.

How do you use the data we provide for the training?

We use the data to define the task during model training (see How it works for details). Once you approve the model evaluation and ask for the model to be deployed, we delete your data and tear down all training instances.

Is my data secure?

Yes. Your data is encrypted on our servers and then removed after the model is trained.