1sec.ai

Ternary Mamba: Grouped Quantization-Aware Training of W1.58A16 State Space Models

Researchers developed Ternary Mamba, a method for compressing State Space Models such as Mamba-2 through grouped quantization-aware training. The W1.58A16 label in the title denotes ternary weights (values in {-1, 0, +1}, about 1.58 bits each) with 16-bit activations. The approach shrinks the Mamba-2 1.3B model from 2,687 MB to 744 MB while using a token budget 1,000x smaller than training from scratch. The compressed model achieves 48.1% zero-shot accuracy averaged over 7 tasks, which makes it a candidate for edge deployment where memory is limited. Builders can apply the method to fit models into low-memory environments.
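To make "grouped" ternary quantization concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the per-group absolute-mean scale, the tiny `group_size`, and the function name `ternary_quantize_grouped` are all assumptions chosen for illustration. In quantization-aware training, a forward pass would typically use the dequantized weights while gradients update the full-precision copy; that training loop is omitted here.

```python
def ternary_quantize_grouped(weights, group_size=4, eps=1e-8):
    """Quantize a flat list of weights to {-1, 0, +1} with one scale per group.

    Assumption: the scale is the mean absolute value of each group.
    The summary above does not state which scaling rule the paper uses.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be divisible by group_size")

    codes, scales, dequantized = [], [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # One floating-point scale is stored per group.
        scale = sum(abs(x) for x in group) / group_size + eps
        # Round to the nearest integer, then clamp into the ternary range.
        group_codes = [max(-1, min(1, round(x / scale))) for x in group]
        codes.extend(group_codes)
        scales.append(scale)
        # Values the forward pass would see after quantization.
        dequantized.extend(c * scale for c in group_codes)
    return codes, scales, dequantized


# Hypothetical weights: a large-magnitude group followed by a small one.
weights = [0.9, -0.05, -1.1, 0.4, 0.02, 0.03, -0.01, 0.04]

grouped_codes, grouped_scales, _ = ternary_quantize_grouped(weights, group_size=4)
single_codes, single_scales, _ = ternary_quantize_grouped(weights, group_size=8)

print(grouped_codes)  # [1, 0, -1, 1, 1, 1, 0, 1]
print(single_codes)   # [1, 0, -1, 1, 0, 0, 0, 0]
```

With a single scale for all eight weights, the small-magnitude half rounds to zero. With one scale per group of four, it keeps its sign pattern. This is the general motivation for group-wise scales, at the cost of storing one extra scale per group.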

Key takeaways
  • Ternary Mamba compresses the Mamba-2 1.3B model from 2,687 MB to 744 MB, about 3.6x smaller.
  • The compressed model achieves 48.1% zero-shot accuracy, averaged over 7 tasks.
  • Training needs a token budget 1,000x smaller than training from scratch.
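The memory figures above can be checked with a few lines of arithmetic. The snippet below uses only the two sizes quoted in the takeaways. The variable names are illustrative, and the last line is the information-theoretic size of a three-valued weight, which is where the "1.58" in W1.58A16 comes from.

```python
import math

BASELINE_MB = 2687  # Mamba-2 1.3B size quoted above
TERNARY_MB = 744    # Ternary Mamba size quoted above

ratio = BASELINE_MB / TERNARY_MB
saved = 1 - TERNARY_MB / BASELINE_MB
bits_per_ternary_weight = math.log2(3)  # three possible values per weight

print(f"{ratio:.2f}x smaller")                                     # 3.61x smaller
print(f"{saved:.1%} memory saved")                                 # 72.3% memory saved
print(f"{bits_per_ternary_weight:.2f} bits per ternary weight")    # 1.58 bits per ternary weight
```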