other, 14h ago

My GLM-5.2-FP8 HGX-H200 SGLang docker deploy config

r/LocalLLaMA, score 0.33

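A user shared a Docker deployment configuration for running GLM-5.2-FP8 with SGLang on an HGX-H200 system. The setup spans 8 GPUs (the "8" most likely refers to tensor parallelism across the eight GPUs, since tensor cores are per-GPU hardware units and not a deployment setting) and reserves a fraction of memory for the model (in SGLang this kind of fraction normally applies to GPU memory, not host RAM). The configuration may help others deploy GLM-5.2 locally on similar hardware.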

Key takeaways

Uses the lmsysorg/sglang:latest Docker image.
Configured for an HGX-H200 system, apparently with the model split across 8 GPUs (tensor parallelism).
Reserves a fraction of memory (most likely GPU memory) for the model.
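
The summary does not reproduce the poster's actual config, so the following is only a minimal sketch of what a deployment along these lines could look like, not the original. It assembles a `docker run` command for the `lmsysorg/sglang:latest` image with tensor parallelism across 8 GPUs and a static memory fraction. The model path `your-org/GLM-5.2-FP8`, the 0.85 memory fraction, the 32g shared-memory size, port 30000, and the helper name `build_docker_command` are all assumptions chosen for illustration.

```python
import shlex
from pathlib import Path


def build_docker_command(
    model_path: str,
    tp_size: int = 8,             # assumption: one tensor-parallel rank per GPU
    mem_fraction: float = 0.85,   # assumption: illustrative value, not from the post
    port: int = 30000,            # assumption: illustrative port
    image: str = "lmsysorg/sglang:latest",
) -> list[str]:
    """Build an argument list for `docker run` that starts an SGLang server."""
    if tp_size < 1:
        raise ValueError("tp_size must be at least 1")
    if not 0.0 < mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be in (0, 1]")

    # Share the host Hugging Face cache so weights are not re-downloaded.
    hf_cache = Path.home() / ".cache" / "huggingface"

    return [
        "docker", "run",
        "--gpus", "all",          # expose every GPU to the container
        "--shm-size", "32g",      # assumption: illustrative shared-memory size
        "--ipc=host",
        "-p", f"{port}:{port}",
        "-v", f"{hf_cache}:/root/.cache/huggingface",
        image,
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tp", str(tp_size),                         # tensor parallel size
        "--mem-fraction-static", str(mem_fraction),   # share of GPU memory for weights and KV cache
        "--host", "0.0.0.0",
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Hypothetical model identifier; replace with the real repository or local path.
    print(shlex.join(build_docker_command("your-org/GLM-5.2-FP8")))
```

Building the command as a list keeps each flag and value separate, so it can be passed to `subprocess.run` directly or printed with `shlex.join` for copy-paste. If the server fails with out-of-memory errors at startup, lowering the memory fraction is the usual first adjustment.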

#docker #local-llm #deployment

