Collections
Collections including paper arxiv:2511.18822
- Adversarial Flow Models
  Paper • 2511.22475 • Published • 21
- DiP: Taming Diffusion Models in Pixel Space
  Paper • 2511.18822 • Published • 26
- Asking like Socrates: Socrates helps VLMs understand remote sensing images
  Paper • 2511.22396 • Published • 4
- Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
  Paper • 2512.05591 • Published • 16

- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 54
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
  Paper • 2505.16990 • Published • 22
- D-AR: Diffusion via Autoregressive Models
  Paper • 2505.23660 • Published • 34

- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
  Paper • 2312.04557 • Published • 13
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
  Paper • 2312.04410 • Published • 15
- PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
  Paper • 2312.04461 • Published • 62
- Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
  Paper • 2401.02955 • Published • 23

- Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield
  Paper • 2511.22677 • Published • 23
- DiP: Taming Diffusion Models in Pixel Space
  Paper • 2511.18822 • Published • 26
- What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
  Paper • 2512.00425 • Published • 47
- Learning Eigenstructures of Unstructured Data Manifolds
  Paper • 2512.01103 • Published • 4

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 10
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 18
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 62
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 77