MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12 • 68
In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published Nov 24 • 30
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing Paper • 2512.17909 • Published 7 days ago • 35