1sec.ai
other · 1d ago

Cheapest way to run GLM 5.x locally that's not a unified memory system?

r/LocalLLaMA · score 0.27

The discussion explores cost-effective ways to run GLM 5.x models locally, focusing on 4-bit quantization. Users share experiences with CPU-only setups, such as a 56-core Sapphire Rapids ES with DDR5, and with multi-GPU configurations that use partial offloading. The goal is to identify viable options for running models as large as GLM 5.x without a unified memory system.

Key takeaways

  • A 56-core Sapphire Rapids ES with DDR5 is one potential CPU-only option for running GLM 5.x locally.
  • Multi-GPU setups with partial offloading are also being explored.
  • The discussion also applies to similarly sized models, not only GLM 5.x.
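
To compare the options above, it can help to estimate how much memory a quantized model's weights need and how much of that would spill from VRAM into system RAM under partial offloading. The following is a minimal sketch, not something taken from the thread: the function names are hypothetical, the parameter count and bits-per-weight in the example are placeholders rather than GLM 5.x figures, and the estimate ignores KV cache and activation memory.

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB.

    bits_per_weight is the effective value for the chosen quantization;
    real 4-bit formats usually land a little above 4.0 because they also
    store scales (assumption; check the actual file size of your quant).
    """
    return n_params * bits_per_weight / 8 / 2**30


def split_offload(total_gib: float, vram_gib_per_gpu: float, n_gpus: int,
                  reserve_gib_per_gpu: float = 2.0) -> tuple[float, float]:
    """Split the weights between GPU VRAM and system RAM.

    reserve_gib_per_gpu is headroom kept free on each GPU for KV cache
    and buffers (a guess; tune it for your context length).
    Returns (gib_on_gpus, gib_in_system_ram).
    """
    usable = max(vram_gib_per_gpu - reserve_gib_per_gpu, 0.0) * n_gpus
    on_gpu = min(total_gib, usable)
    return on_gpu, total_gib - on_gpu


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT GLM 5.x specifications.
    n_params = 400e9
    total = weight_footprint_gib(n_params, bits_per_weight=4.5)
    gpu, ram = split_offload(total, vram_gib_per_gpu=24.0, n_gpus=2)
    print(f"weights: {total:.1f} GiB, on GPUs: {gpu:.1f} GiB, in RAM: {ram:.1f} GiB")
```

If the RAM share comes out as most of the model, the system's DDR5 capacity and bandwidth will likely matter more than the GPUs, which is one reason CPU-only builds come up in this kind of discussion.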