Rethinking Dataset Distillation for Classification: Do Distilled Sets Outperform Coresets?
Researchers re-examined dataset distillation methods for classification tasks and found that distilled sets often fail to outperform coresets, a simpler baseline that selects a subset of real training examples. The study applied a consistent evaluation protocol across methods to isolate the effectiveness of the distilled data itself. The results suggest that dataset distillation may not be as effective as previously thought, and practitioners should check its benefit against a coreset baseline before adopting it. The findings have implications for data-centric machine learning approaches.
- Distilled sets don't consistently outperform coresets in classification tasks.
- Inconsistent evaluation protocols have hindered prior comparisons.
- Coresets remain a strong, simple baseline for data compression.
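
To make the coreset baseline concrete, here is a minimal sketch of two common selection strategies: class-balanced random sampling and greedy herding. The summary does not say which coreset methods the study evaluated, so both are assumptions chosen for illustration, and the function names `class_balanced_random_coreset` and `herding_coreset` are hypothetical. Features are plain lists of floats, and the per-class budget plays the role of an "images per class" setting.

```python
import random
from collections import defaultdict


def _indices_by_class(labels):
    """Group example indices by their class label."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    return by_class


def class_balanced_random_coreset(labels, per_class, seed=0):
    """Pick up to `per_class` random examples from every class (illustrative baseline)."""
    rng = random.Random(seed)
    selected = []
    by_class = _indices_by_class(labels)
    for label in sorted(by_class):
        pool = by_class[label]
        selected.extend(rng.sample(pool, min(per_class, len(pool))))
    return sorted(selected)


def herding_coreset(features, labels, per_class):
    """Greedy herding: per class, add the example that keeps the mean of the
    selected set closest to the mean of the full class (illustrative baseline)."""
    selected = []
    by_class = _indices_by_class(labels)
    for label in sorted(by_class):
        pool = by_class[label]
        dim = len(features[pool[0]])
        class_mean = [sum(features[i][d] for i in pool) / len(pool) for d in range(dim)]
        running_sum = [0.0] * dim
        remaining = list(pool)
        for step in range(1, min(per_class, len(pool)) + 1):
            # Squared distance between the class mean and the mean of the
            # selected set if candidate i were added at this step.
            def gap(i):
                return sum(
                    ((running_sum[d] + features[i][d]) / step - class_mean[d]) ** 2
                    for d in range(dim)
                )

            best = min(remaining, key=gap)
            remaining.remove(best)
            running_sum = [running_sum[d] + features[best][d] for d in range(dim)]
            selected.append(best)
    return sorted(selected)


if __name__ == "__main__":
    # Toy data: two classes of one-dimensional features.
    toy_features = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
    toy_labels = [0, 0, 0, 1, 1, 1]
    print(herding_coreset(toy_features, toy_labels, per_class=1))
    print(class_balanced_random_coreset(toy_labels, per_class=2, seed=0))
```

For a like-for-like comparison, a distilled set and a coreset would be given the same per-class budget and then used to train the same model with the same training and evaluation settings, so that any accuracy gap reflects the data rather than the protocol.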