The teacher model generates the additional training data we use for fine-tuning. We use best-in-class open-source models as our “Teacher” models – currently Llama 3.1 405B.
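As a rough illustration of how teacher-generated data can be produced, the Python sketch below asks a teacher model to label unlabeled inputs through an OpenAI-compatible endpoint; the endpoint URL, model identifier, and classification prompt are hypothetical placeholders, not our actual pipeline.

# Minimal sketch: using a teacher model to generate synthetic training labels.
# The endpoint URL, API key, model name, and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://your-inference-endpoint/v1", api_key="...")

def teacher_label(text: str) -> str:
    """Ask the teacher model to label a single unlabeled input."""
    response = client.chat.completions.create(
        model="llama-3.1-405b-instruct",  # hypothetical teacher model identifier
        messages=[
            {"role": "system", "content": "Classify the sentiment of the text as positive or negative. Reply with only the label."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content.strip()

# Each (input, teacher label) pair becomes a synthetic training example
# used to fine-tune the smaller student model.
synthetic_examples = [(t, teacher_label(t)) for t in ["Great product!", "Arrived broken."]]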
This is the pre-trained language model that we specialize for your needs. The model we use varies depending on the use case – we find the smallest possible model that meets your quality threshold.
The more examples the better, but we usually recommend a few dozen diverse examples so the specialized model performs well on a variety of inputs.
Uploading more samples is always better as long as there are no duplicates.
You should write the task description the same way you would write an LLM prompt to solve your particular task. We have found that task descriptions that work well as in-context learning prompts usually work best on our platform.
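For illustration, a task description for a hypothetical support-ticket triage task might read: “Classify the customer message below into one of the following categories: billing, technical issue, or general inquiry. Respond with only the category name.” Anything you would normally put in the prompt – the instructions, the label set, the expected output format – belongs in the task description.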
We isolate a test set from your training set to evaluate the model using task-appropriate metrics – for example accuracy and F1 score for classification.
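As a concrete sketch of what such an evaluation can look like for classification, the snippet below computes accuracy and macro F1 on a held-out test set; the labels and predictions are placeholders, and scikit-learn is just one way to compute these metrics.

# Minimal sketch: scoring student-model predictions on a held-out test set.
# The label values and predictions below are illustrative placeholders.
from sklearn.metrics import accuracy_score, f1_score

y_true = ["positive", "negative", "positive", "neutral"]  # held-out reference labels
y_pred = ["positive", "negative", "negative", "neutral"]  # student-model outputs

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))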
We aim for the trained models to match or exceed the accuracy of modern large language models such as Llama 3.1 405B or GPT-4.
We can train models up to 8B parameters as student models.
This largely depends on the size of the specialized model you train. Please reach out to us for more details.
We support the most common NLP tasks, such as classification, named entity recognition, extractive QA, summarization, function calling, and ranking. Reach out to us or book a demo to learn more, and let us know if you would like us to expand to new use cases.
No, we do not provide such a service. We can either host the model for you on our infrastructure or share the model binaries with you, so you can host it anywhere.
We can host the model for you on our infrastructure in any region supported by AWS or another major cloud provider. Alternatively, we can share the model binaries with you, so you can host it anywhere.
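If you take the binaries, serving them is up to you; as one possible sketch, the snippet below loads a causal-language-model checkpoint from a local directory with Hugging Face transformers and runs a single generation. The directory path and prompt are placeholders, and the serving stack (vLLM, TGI, plain transformers, etc.) is your choice.

# Minimal sketch: self-hosting shared model binaries with Hugging Face transformers.
# The model directory and prompt are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/path/to/shared/model-binaries"  # wherever you unpack the weights
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("Classify the sentiment: 'Great product!'", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))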
We use the data to define the task during model training (see How it works for details). Once you approve the model evaluation and request deployment, we delete your data and tear down all training instances.
Yes. Your data is encrypted on our servers and then removed after the model is trained.