Optimize Any Topology: A Foundation Model for Shape- and Resolution-Free Structural Topology Optimization Paper • 2510.23667 • Published Oct 26, 2025
BIKED++: A Multimodal Dataset of 1.4 Million Bicycle Image and Parametric CAD Designs Paper • 2402.05301 • Published Feb 7, 2024
CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation Paper • 2505.14646 • Published May 20, 2025
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT Paper • 2110.01900 • Published Oct 5, 2021
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model Paper • 2210.00705 • Published Oct 3, 2022
USAD: Universal Speech and Audio Representation via Distillation Paper • 2506.18843 • Published Jun 23, 2025
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning Paper • 2505.15034 • Published May 21, 2025
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Paper • 2505.13444 • Published May 19, 2025
Inference-Time Policy Steering through Human Interactions Paper • 2411.16627 • Published Nov 25, 2024
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains Paper • 2505.03981 • Published May 6, 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published Mar 26, 2025
Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making Paper • 2305.17588 • Published May 27, 2023
Self-Verification Improves Few-Shot Clinical Information Extraction Paper • 2306.00024 • Published May 30, 2023
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once Paper • 2405.12971 • Published May 21, 2024
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation Paper • 2403.08002 • Published Mar 12, 2024
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning Paper • 2502.19655 • Published Feb 27, 2025