Blog & Demos
Tutorials, case studies, benchmarks, and open-source demos — everything you need to build with small language models.
We Benchmarked 12 Small Language Models Across 8 Tasks to Find the Best Base Model for Fine-Tuning
A systematic benchmark of 12 small language models across 8 tasks identifies Qwen3-4B as the best base model for fine-tuning, with fine-tuned models matching or exceeding a 120B+ teacher. Smaller models such as Llama-3.2-1B show the highest tunability.
Gitara: How We Trained a 3B Function-Calling Git Agent for Local Use
We fine-tuned a small, tool-calling language model to turn plain-English questions into git commands with the accuracy of a cloud LLM.
Resume Roaster AI: Brutally Honest Resume Critique with a Local SLM
A fine-tuned Llama-3.2-3B model that generates sarcastic resume critiques and professional improvement suggestions. Runs entirely locally to keep your personal data private.
distil-commit-bot: AI-Powered Commit Messages for TypeScript
A fine-tuned 0.6B-parameter SLM that generates commit messages for TypeScript codebases. Runs locally via Ollama, achieving 90% accuracy compared to a 120B teacher model while being 200x smaller.
distil-localdoc.py: Automatic Python Documentation Generation
A fine-tuned Qwen3-0.6B model that generates complete, properly formatted Google-style docstrings for your Python code. Runs locally to keep your proprietary code secure.
distil-expenses: Local Personal Expense Summaries with SLMs
Fine-tuned Llama 3.2 models (1B and 3B) for personal expense analysis. Runs locally via Ollama — query your spending data with natural language while keeping your financial data completely private.
distil NPC: Small Language Models for Video Game Characters
A family of small language models specialized for conversational NPCs in video games. Enables natural-language interaction with game characters, running entirely on-device with no network required.
distil-PII: Family of PII Redaction SLMs
We trained and released a family of small language models specialized for policy-aware PII redaction that dramatically outperform their pre-trained counterparts.
distil labs: Benchmarking the Platform
Benchmarking distil labs' distillation pipeline across classification, information extraction, QA, and tool-calling tasks, showing that compact SLMs consistently match or exceed teacher LLMs.