On Thursday, French lab Mistral AI launched Small 3, which the company calls “the most efficient model of its category” and says is optimized for latency.
Mistral says Small 3 can compete with Llama 3.3 70B and Qwen 32B, among other large models, and calls it “an excellent open replacement for opaque proprietary models like GPT4o-mini.”
Like Mistral’s other models, the 24B-parameter Small 3 is open-source, released under the Apache 2.0 license.
Designed for local use, Small 3 provides a base for building reasoning abilities, Mistral says. “Small 3 excels in scenarios where quick, accurate responses are critical,” the release continues, noting that the model has fewer layers than comparable models, which helps its speed.
The model achieved better than 81% accuracy on the MMLU benchmark test, and was not trained with reinforcement learning (RL) or synthetic data, which Mistral says makes it “earlier in the model production pipeline” than DeepSeek R1.
“Our instruction-tuned model performs competitively with open weight models three times its size and with proprietary GPT4o-mini model across Code, Math, General knowledge and Instruction following benchmarks,” the announcement notes.
Using a third-party vendor, Mistral had human evaluators test Small 3 on more than 1,000 coding and generalist prompts. A majority of testers preferred Small 3 over Gemma-2 27B and Qwen-2.5 32B, but preferences were more evenly split when Small 3 went up against Llama-3.3 70B and GPT-4o mini. Mistral acknowledged that the variability of human judgment makes this kind of evaluation differ from standardized public benchmarks.