The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 4 days ago • 60
ViMoGen Collection The Quest for Generalizable Motion Generation: Data, Model, and Evaluation • 2 items • Updated 9 days ago
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published 11 days ago • 70
sensenova/SenseNova-SI-1.2-InternVL3-8B Image-Text-to-Text • 8B • Updated 17 days ago • 1.89k • 7
SenseNova-SI Collection Scaling Spatial Intelligence with Multimodal Foundation Models • 8 items • Updated 19 days ago • 14
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17 • 45
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17 • 52
sensenova/SenseNova-SI-1.1-InternVL3-2B Image-Text-to-Text • 2B • Updated 18 days ago • 4.86k • 6
sensenova/SenseNova-SI-1.1-InternVL3-8B Image-Text-to-Text • 8B • Updated 15 days ago • 12.7k • 13