1sec.ai

Tag

#diffusion-models

Every item tagged diffusion-models, newest first.

14 items

Diffusion-Proof: Recipe for Formal Theorem Proving Beyond Auto-Regressive Generation

Researchers propose Diffusion-Proof, a recipe for formal theorem proving that moves beyond auto-regressive generation. The approach targets two weaknesses of auto-regressive large language models: weak long-range coherence and compounding errors. If it works as intended, it could improve formal mathematical reasoning. The paper describes the method and its potential applications.

Key takeaways
  • Diffusion-Proof method proposed for formal theorem proving
  • Addresses limitations of auto-regressive generation in LLMs
  • Targets long-range coherence and error compounding issues

DreamReasoner-8B: Block-Size Curriculum Learning for Diffusion Reasoning Models

Researchers developed DreamReasoner-8B, an open-source block diffusion model for long chain-of-thought (CoT) reasoning. They studied how block sizes during training and inference impact performance on long-CoT tasks. The study found that training with large block sizes results in poor reasoning, while smaller block sizes during training and inference improve performance. This informs best practices for scaling block diffusion models to complex reasoning tasks.

Key takeaways
  • DreamReasoner-8B is an open-source block diffusion model for long-CoT reasoning.
  • Training block size significantly impacts long-CoT performance.
  • Smaller block sizes during training and inference improve reasoning.
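
To make "block size" concrete, here is a minimal sketch of how a block diffusion decoder might partition a sequence, assuming blocks are generated left to right and the tokens inside each block are denoised in parallel. The function name `block_spans` and the sizes used are hypothetical and are not taken from the DreamReasoner-8B work.

```python
def block_spans(seq_len: int, block_size: int) -> list[tuple[int, int]]:
    """Split positions [0, seq_len) into consecutive (start, end) blocks.

    Illustrative helper only: a real block diffusion model would run
    several denoising steps inside each span before moving to the next.
    """
    if seq_len < 0 or block_size <= 0:
        raise ValueError("seq_len must be >= 0 and block_size must be > 0")
    return [
        (start, min(start + block_size, seq_len))
        for start in range(0, seq_len, block_size)
    ]


# Hypothetical sizes: a 10-token sequence decoded in blocks of 4.
print(block_spans(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```

A smaller block size yields more blocks and therefore more sequential steps; at `block_size=1` each block holds a single token, which is the token-by-token extreme.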

Sumi: Open Uniform Diffusion Language Model from Scratch

Researchers have proposed Sumi, an open uniform diffusion language model trained from scratch at a large parameter scale and with a large token budget. Sumi aims to fill a gap: uniform diffusion models currently lack large-scale pretrained counterparts. The model enables flexible generation by allowing any token to be updated at any step. You can study Sumi's architecture and its performance on language modeling tasks.

Key takeaways
  • Sumi is a uniform diffusion language model pretrained from scratch.
  • It allows flexible generation by updating any token at any step.
  • No large-scale pretrained uniform diffusion models existed before Sumi.
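
The "any token at any step" property can be illustrated with a toy forward (noising) step. This sketch assumes the usual uniform-state formulation, in which a corrupted token is replaced by a random vocabulary id rather than by a mask symbol, so every position always holds a real token that a later step could revise. The function `uniform_corrupt` and its parameters are hypothetical and do not describe Sumi's actual implementation.

```python
import random


def uniform_corrupt(
    tokens: list[int],
    vocab_size: int,
    replace_prob: float,
    rng: random.Random,
) -> list[int]:
    """Independently replace each token with a uniformly random vocabulary id.

    replace_prob plays the role of the noise level: 0.0 leaves the sequence
    untouched, 1.0 resamples every position.
    """
    if not 0.0 <= replace_prob <= 1.0:
        raise ValueError("replace_prob must be in [0, 1]")
    return [
        rng.randrange(vocab_size) if rng.random() < replace_prob else tok
        for tok in tokens
    ]


# Hypothetical toy sequence over a 10-token vocabulary.
rng = random.Random(0)
clean = [5, 6, 7, 8]
noisy = uniform_corrupt(clean, vocab_size=10, replace_prob=0.5, rng=rng)
print(clean, "->", noisy)
```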

Spotlight: Synergizing Seed Exploration and Spot GPUs for DiT RL Post-Training

Researchers propose combining seed exploration with spot GPUs to reduce the cost of DiT RL post-training. Seed exploration selects high-contrast samples to improve convergence, while spot GPUs cost 69-77% less. Used together, the two reduce overall training cost without increasing wall-clock time. The method benefits builders working with resource-intensive DiT models.

Key takeaways
  • Combining seed exploration and spot GPUs reduces DiT RL post-training costs.
  • Spot GPUs can be 69-77% cheaper than high-end GPUs.
  • The combined approach does not increase wall-clock time.
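
As a rough illustration of what a 69-77% discount means for a training budget, the arithmetic can be sketched as follows. The reference cost of 1000 is a hypothetical figure, and the sketch assumes the discount applies uniformly to the GPU bill and ignores any overhead from spot preemptions.

```python
def spot_cost(reference_cost: float, discount: float) -> float:
    """Cost after applying a fractional spot discount (0.69 means 69% cheaper)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be in [0, 1]")
    return reference_cost * (1.0 - discount)


reference = 1000.0  # hypothetical cost of a run at the reference price
for discount in (0.69, 0.77):
    print(f"{discount:.0%} discount -> {spot_cost(reference, discount):.2f}")
```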

ReAge3D: Re-Aging 3D Faces with View Consistency

Researchers introduce ReAge3D, a framework for 3D face re-aging that produces detailed, identity-preserving results. It addresses challenges in preserving subtle age-related details across views. The approach first uses a 2D diffusion model, DiffReaging, trained on synthetic data. This enables realistic and controllable 3D face re-aging.

Key takeaways
  • ReAge3D produces highly detailed, identity-preserving 3D face re-aging results.
  • Existing 3D editing methods struggle with re-aging due to view inconsistencies.
  • DiffReaging is a 2D diffusion-based re-aging model trained on synthetic data.

Comfy-Org/ComfyUI

Comfy-Org released ComfyUI, a powerful and modular diffusion model GUI with a graph/nodes interface. ComfyUI offers a flexible and customizable backend for diffusion models. You can use it to build and deploy diffusion-based applications. The project is open-source and available on GitHub.

Key takeaways
  • ComfyUI has a graph/nodes interface for building diffusion workflows.
  • The project is open-source and available on GitHub.
  • ComfyUI offers a flexible and customizable backend.

models · Jun 10

DiffusionGemma

Google's experimental Gemini Diffusion model has re-emerged as an open-weight Gemma model called DiffusionGemma. The 26B-parameter model is licensed under Apache 2.0 and is available on NVIDIA's NIM cloud API, where it runs at 857 tokens/second. You can use it for free via the NIM API.

Key takeaways
  • DiffusionGemma is an open-weight model licensed under Apache 2.0.
  • Runs at 857 tokens/second.
  • Available for free on NVIDIA's NIM cloud API.

models · Jun 10

Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

Google DeepMind released DiffusionGemma, a model that accelerates local AI inference by 4x for text and image generation. DiffusionGemma targets developers who want to deploy AI models locally on devices with limited resources. The model achieves faster inference through optimized diffusion-based architectures. You can integrate DiffusionGemma into your apps to improve performance and efficiency.

Key takeaways
  • 4x faster local inference for text and image generation.
  • Optimized diffusion-based architecture for efficient deployment.
  • Targets developers building local AI apps on resource-constrained devices.

models · Mar 5

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

Hugging Face introduced Modular Diffusers, a new framework for building and composing diffusion pipelines. The framework provides pre-built components for constructing custom diffusion models, allowing for greater flexibility and control. This enables builders to create tailored solutions for specific use cases. Modular Diffusers aims to simplify the development of diffusion-based applications.

Key takeaways
  • Modular Diffusers offers pre-built components for diffusion pipelines.
  • The framework allows for greater flexibility and control in model construction.
  • Aims to simplify development of diffusion-based applications.

models · Jan 20

Introducing Waypoint-1: Real-time interactive video diffusion from Overworld

Overworld has released Waypoint-1, a real-time interactive video diffusion model. The model enables users to interact with video content in real-time. You can explore the model on the Hugging Face platform. This release targets developers interested in building interactive video applications.

Key takeaways
  • Waypoint-1 is a real-time interactive video diffusion model.
  • Available on the Hugging Face platform.
  • Targets developers building interactive video applications.

research · Jul 30

Memory-efficient Diffusion Transformers with Quanto and Diffusers

Hugging Face researchers collaborated with Intel to develop a method for running diffusion transformers efficiently on low-memory hardware. They integrated quantization techniques into Diffusers, reducing memory usage by up to 4x. This enables running complex models on resource-constrained devices, expanding access to AI capabilities. You can now deploy diffusion models more efficiently.

Key takeaways
  • Memory usage reduced by up to 4x through quantization.
  • Enables deployment on low-memory hardware.
  • Diffusers library updated with efficient diffusion transformers.
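
The "up to 4x" figure can be sanity-checked with back-of-the-envelope arithmetic on weight storage alone. The sketch below is not the Quanto API; it assumes the saving comes purely from storing each parameter in fewer bits and ignores activations and other runtime buffers. The summary does not say which precisions are being compared, so the pairing shown (float32 against int8) is an assumption, and the parameter count is hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate weight memory in GiB for a model stored at the given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


num_params = 2 * 2**30  # hypothetical model with about 2.1 billion parameters
full = weight_memory_gib(num_params, "float32")
quantized = weight_memory_gib(num_params, "int8")
print(f"float32: {full:.1f} GiB, int8: {quantized:.1f} GiB, ratio: {full / quantized:.0f}x")
```

The same 4x ratio also arises when going from a 16-bit format to 4-bit weights.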

models · Sep 29

Finetune Stable Diffusion Models with DDPO via TRL

The TRL library from Hugging Face now supports DDPO (Denoising Diffusion Policy Optimization), enabling finetuning of Stable Diffusion models. DDPO is a reinforcement learning method that optimizes diffusion models such as Stable Diffusion against a reward signal, for example one derived from human preferences. This update lets builders adapt Stable Diffusion models to specific tasks or objectives via finetuning.

Key takeaways
  • TRL library supports DDPO for Stable Diffusion finetuning.
  • DDPO optimizes diffusion models with reinforcement learning against a reward signal.
  • Finetuning enables adapting models to specific tasks or datasets.

models · Sep 13

Introducing Würstchen: Fast Diffusion for Image Generation

The Würstchen model, developed by Stability AI and the CompVis group at LMU Munich, introduces a fast diffusion process for image generation. Würstchen achieves high-quality results with significantly reduced computational resources, which makes image generation faster and cheaper. Builders can explore Würstchen for optimizing image generation workflows.

Key takeaways
  • Würstchen reduces computational resources for image generation.
  • Achieves high-quality results with a fast diffusion process.
  • Developed by Stability AI and LMU Munich's CompVis group.

other · Nov 25

Diffusion Models Live Event

Hugging Face hosted a live event on diffusion models, covering recent advancements and applications. The event featured expert talks and demos. You can now watch the full recordings online. This event is relevant for builders interested in staying up-to-date with the latest developments in diffusion models.

Key takeaways
  • The event covered recent advancements in diffusion models.
  • Expert talks and demos were featured.
  • Full recordings are now available online.