Tag

#distillation

Every item tagged distillation, newest first.

4 items

Does anyone have enough compute to make a distillation dataset out of GLM5.2?

A Reddit user asks whether anyone has enough compute to create a distillation dataset from GLM-5.2 that could be used to train smaller models such as Qwen-3.5. The proposed dataset would contain 700k-1M examples. It would benefit the community by enabling better training of smaller models.

Key takeaways

GLM-5.2 proposed as source for distillation dataset.
700k-1M examples suggested for dataset size.
Smaller models like Qwen-3.5 could benefit from dataset.

r/LocalLLaMA #distillation #open-weights #local-llm

other 1d

Be wary of Qwen/Claude distillations - they're often worse than the base model

A Reddit user warns that distilled or fine-tuned Qwen/Claude models such as Qwopus often perform worse than their base models. The user's stated aim is to inform, not to criticize the creators of these models. The same issue may apply to other distillations, such as Gemma 4/Claude. Evaluate these models carefully before using them.

Key takeaways

Distilled Qwen/Claude models can be worse than base models.
Issue may apply to other distilled models like Gemma 4/Claude.
User aims to inform, not criticize, model creators.

r/LocalLLaMA #local-llm #distillation #model-performance

models Mar 3

PRX Part 3 — Training a Text-to-Image Model in 24h!

The PRX framework trains a text-to-image model in 24 hours with 1.2M images and reaches 30% better performance than DreamU on benchmarks. This approach uses a novel distillation method and multi-stage training to accelerate model development. You can deploy the resulting model for image generation tasks. The method's efficiency enables faster iteration and lower costs for builders.

Key takeaways

Trains a text-to-image model in 24 hours with 1.2M images.
30% better performance than DreamU on benchmarks.
Uses novel distillation and multi-stage training methods.

Hugging Face Blog #text-to-image #model-training #distillation

models Dec 18

2023, year of open LLMs

In 2023, major open LLMs emerged, including Llama, Alpaca, and Vicuna. These models drove progress in areas like fine-tuning, distillation, and efficiency. You can now deploy capable open models locally or via cloud services.

Key takeaways

Multiple major open LLMs released in 2023.
Open models drove progress in fine-tuning and efficiency.
Capable open models can be deployed locally or in the cloud.

Hugging Face Blog #open-llms #fine-tuning #distillation