1sec.ai

Tag

#gpu-optimization

Every item tagged gpu-optimization, newest first.

2 items

llama.cpp - how to free up even more space on your GPU

The author reports that llama.cpp's RAM usage has improved: memory leaks are gone, and models like Qwen3.6-27B-UD-Q5_K_XL run efficiently on the GPU. They run a 3090 as an eGPU with --n-gpu-layers 99 --no-mmap --mlock and are asking how to reduce memory usage further so they can increase the context size. You can experiment with these parameters or explore other quantization options.

Key takeaways
  • llama.cpp has stable RAM usage with no memory leaks.
  • The author's --n-gpu-layers 99 --no-mmap --mlock configuration avoids regular RAM usage.
  • The author is seeking tips to free up more GPU memory for larger context sizes.
other · May 21
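
For a rough sense of why context size competes with model weights for GPU memory, here is a minimal sketch of the usual KV cache estimate for a standard transformer with grouped-query attention: two tensors (K and V) per layer, each holding one vector per KV head per token. The architecture numbers are hypothetical placeholders, not the real Qwen3.6-27B configuration, and models with sliding-window or hybrid attention will not follow this formula exactly. The bytes-per-element figures assume an f16 cache (2 bytes) and ggml's q8_0 block layout (34 bytes per 32 values); whether a quantized cache is available and appropriate depends on your llama.cpp build, so check the `--cache-type-k` and `--cache-type-v` entries in its `--help` output.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2.0):
    """Approximate KV cache size in bytes for n_ctx tokens.

    The factor 2 accounts for storing both K and V in every layer.
    """
    return int(2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem)


def max_context(free_bytes, n_layers, n_kv_heads, head_dim, bytes_per_elem=2.0):
    """Largest context (in tokens) whose KV cache fits in free_bytes."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return int(free_bytes // per_token)


# Hypothetical architecture values for illustration only (assumption,
# not taken from the post or from any real model card).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 64, 8, 128

# Bytes per cached element: f16 is 2 bytes; ggml q8_0 packs 32 values
# into 34 bytes (32 int8 values plus one f16 scale).
CACHE_TYPES = {"f16": 2.0, "q8_0": 34 / 32}

if __name__ == "__main__":
    for name, bpe in CACHE_TYPES.items():
        size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, 32768, bpe)
        print(f"{name}: {size / GIB:.2f} GiB for a 32768-token context")
```

Under these assumptions, the estimate scales linearly with context length, so halving the bytes per cached element roughly doubles the context that fits in the same free VRAM.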

Hugging Face on AMD Instinct MI300 GPU

Hugging Face has partnered with AMD to optimize model performance on the Instinct MI300 GPU, which is designed for high-performance computing and AI tasks. The collaboration aims to improve efficiency and scalability for AI workloads, with the stated goal of better performance and lower costs for your AI applications.

Key takeaways
  • Hugging Face partners with AMD for GPU optimization.
  • Instinct MI300 targets high-performance AI computing.
  • Better performance and lower costs for AI applications.