EAPO: Enhancing Policy Optimization with On-Demand Expert Assistance Paper • 2509.23730 • Published Sep 28
MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models Paper • 2508.06009 • Published Aug 8