Leading AI firms are using 'distillation' to build cheaper, more efficient models, a technique recently popularized by DeepSeek. Distillation uses a large 'teacher' model to train a smaller 'student' model, making comparable AI capabilities cheaper and more broadly accessible.
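In its classic form, distillation trains the student to match the teacher's temperature-softened output distribution; frontier labs often use a related recipe, fine-tuning the student on outputs generated by the teacher. The sketch below illustrates the classic logit-matching variant only, with hypothetical toy networks standing in for real language models:

```python
# Minimal knowledge-distillation sketch (an assumed toy setup, not any
# lab's actual pipeline): a frozen "teacher" produces soft targets that
# a smaller "student" learns to match via KL divergence on
# temperature-softened logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy models standing in for large and small LLMs.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()  # the teacher is frozen; only the student is trained

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: higher values soften the teacher's distribution

for step in range(100):
    x = torch.randn(16, 32)  # stand-in for a batch of input features
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)

    # KL divergence between softened distributions; scaling by T**2
    # keeps gradient magnitudes comparable across temperatures.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice, distilling from a closed model whose logits are unavailable (as with s1, below) means training the student directly on the teacher's generated answers and reasoning traces rather than on its output distribution.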
In one recent example, AI researchers at Stanford and the University of Washington trained an AI 'reasoning' model named s1 for under $50 in cloud compute credits. They fine-tuned a small off-the-shelf open model on questions paired with reasoning traces distilled from Google's Gemini 2.0 Flash Thinking Experimental model. s1 reportedly performs comparably to OpenAI's o1 and DeepSeek's R1 on math and coding benchmarks, and it is available on GitHub along with the data and code used to train it.