I have an old multi-GPU node lying around at work...

An old 8-GPU node with 192 GB of VRAM and 512 GB of RAM sits idle at the poster's workplace. The owner wants to repurpose it for local LLM inference and is asking which models are worth running that would not fit on a single-card machine. Candidates include very large multi-GPU models such as Llama-3.1 405B, or fine-tuned variants of smaller models. The 405B weights alone come to roughly 203 GB at 4 bits per parameter, which is more than the 192 GB of VRAM, so that model would need aggressive quantization plus partial offload to system RAM. Smaller models can be sharded across the eight cards and held entirely in VRAM.

Key takeaways

The node has 8 NVIDIA Quadro RTX 6000 GPUs, 192 GB VRAM, and 512 GB RAM.
Large multi-GPU models like Llama-3.1 405B could run on this hardware, but only with aggressive quantization and some offload to system RAM.
Repurposing for local LLM inference could be worthwhile for demanding AI workloads.
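
As a rough sanity check on which models fit, the weight footprint can be estimated from parameter count and bits per weight. This is a minimal sketch, not a sizing tool: it counts weights only (no KV cache, activations, or runtime overhead), the function names are hypothetical, and the 70B entry is an assumed comparison point that does not come from the post.

```python
# Weight-only memory estimate. Real usage is higher: KV cache, activations,
# and framework overhead are deliberately ignored here.
VRAM_GB = 8 * 24   # eight 24 GB cards, 192 GB total as stated in the post
RAM_GB = 512       # system RAM as stated in the post


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def placement(params_billion: float, bits_per_weight: float) -> str:
    """Classify where the weights could live on this node (hypothetical helper)."""
    need = weight_gb(params_billion, bits_per_weight)
    if need <= VRAM_GB:
        return "fits in VRAM"
    if need <= VRAM_GB + RAM_GB:
        return "needs CPU offload"
    return "does not fit"


# Nominal parameter counts in billions; the 70B row is an assumed comparison.
for name, params in [("Llama-3.1 70B", 70), ("Llama-3.1 405B", 405)]:
    for bits in (16, 8, 4):
        size = weight_gb(params, bits)
        print(f"{name} @ {bits}-bit: {size:.1f} GB, {placement(params, bits)}")
```

Under these assumptions, a 405B model exceeds the 192 GB of VRAM even at 4 bits per weight, so it would depend on the 512 GB of system RAM and run slower than a model held entirely on the GPUs.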

#local-llm #multi-gpu #inference

Read at r/LocalLLaMA