Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 7 days ago • 33
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics Paper • 2512.13660 • Published 10 days ago • 37
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 20 days ago • 38
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9 • 22
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training Paper • 2510.11712 • Published Oct 13 • 30
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space Paper • 2508.19247 • Published Aug 26 • 43
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models Paper • 2506.19851 • Published Jun 24 • 60
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics Paper • 2506.04308 • Published Jun 4 • 43
UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation Paper • 2505.24521 • Published May 30 • 15
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection Paper • 2412.04455 • Published Dec 5, 2024 • 38
MV-Adapter: Multi-view Consistent Image Generation Made Easy Paper • 2412.03632 • Published Dec 4, 2024 • 24