poolside/Laguna-M.1 (225B-A23B) on Hugging Face
poolside has released Laguna M.1, a 225B-parameter Mixture-of-Experts (MoE) model targeting agentic coding and long-horizon tasks. The model activates 23B parameters per token and uses a 70-layer MoE transformer architecture. Its router selects from 256 experts with top-k=16 routing, so each token is sent to 16 experts. The model is published on Hugging Face as poolside/Laguna-M.1.
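To put those figures in proportion, the arithmetic below computes the share of parameters active per token and the share of experts selected per token. It uses only the numbers quoted above; the variable names are illustrative. The gap between the two ratios would be consistent with some parameters (for example attention and embeddings) being active for every token, but that breakdown is an assumption and is not stated in the summary.

```python
# Figures taken from the summary above; names are illustrative only.
TOTAL_PARAMS = 225e9    # total parameters
ACTIVE_PARAMS = 23e9    # parameters activated per token
NUM_EXPERTS = 256       # experts available to the router
TOP_K = 16              # experts selected per token

# Fraction of all parameters that participate in one token's forward pass.
active_share = ACTIVE_PARAMS / TOTAL_PARAMS

# Fraction of the expert pool that one token is routed to.
routed_share = TOP_K / NUM_EXPERTS

print(f"active parameter share: {active_share:.1%}")  # 10.2%
print(f"routed expert share:    {routed_share:.2%}")  # 6.25%
```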
Key takeaways
- 225B total parameters, 23B activated per token
- 70-layer MoE transformer architecture
- 256 experts with top-k=16 routing
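
As a rough illustration of what top-k=16 routing over 256 experts means for a single token, here is a minimal sketch in plain Python. It is a generic top-k router, not Laguna M.1's actual implementation: the function name `route_token`, the random logits, and the choice to softmax-normalize over the selected experts are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # from the summary
TOP_K = 16         # from the summary


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_id, weight) pairs for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only. That normalization
    is an assumption for illustration, not a confirmed detail of the model.
    """
    if top_k > len(router_logits):
        raise ValueError("top_k cannot exceed the number of experts")
    # Indices of the top_k largest logits, highest first.
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Hypothetical router scores for one token (seeded for repeatability).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

assignment = route_token(logits)
print(len(assignment), "experts selected for this token")
```

In a real MoE layer the token's hidden state would then be processed by each selected expert and the outputs combined using these weights; the sketch stops at the selection step because the summary gives no further detail.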