- Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning — arXiv:2508.03501, published Aug 5
- Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training — arXiv:2508.00414, published Aug 1
- GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay — arXiv:2508.04676, published Aug 6
- ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants — arXiv:2508.03936, published Aug 5
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches — arXiv:2508.08088, published Aug 11
- Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning — arXiv:2508.09726, published Aug 13
- Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment — arXiv:2508.07750, published Aug 11
- UI-Venus Technical Report: Building High-performance UI Agents with RFT — arXiv:2508.10833, published Aug 14
- PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts — arXiv:2508.09848, published Aug 13
- Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis — arXiv:2508.04699, published Aug 6
- PRvL: Quantifying the Capabilities and Risks of Large Language Models for PII Redaction — arXiv:2508.05545, published Aug 7
- InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities — arXiv:2508.05496, published Aug 7
- Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models — arXiv:2508.02120, published Aug 4
- Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability — arXiv:2508.04017, published Aug 6