- Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning — arXiv:2508.03501, published Aug 5
- Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training — arXiv:2508.00414, published Aug 1
- GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay — arXiv:2508.04676, published Aug 6
- ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants — arXiv:2508.03936, published Aug 5
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches — arXiv:2508.08088, published Aug 11
- Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning — arXiv:2508.09726, published Aug 13
- Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment — arXiv:2508.07750, published Aug 11
- UI-Venus Technical Report: Building High-performance UI Agents with RFT — arXiv:2508.10833, published Aug 14
- PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts — arXiv:2508.09848, published Aug 13
- Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis — arXiv:2508.04699, published Aug 6
- PRvL: Quantifying the Capabilities and Risks of Large Language Models for PII Redaction — arXiv:2508.05545, published Aug 7
- InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities — arXiv:2508.05496, published Aug 7
- Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models — arXiv:2508.02120, published Aug 4
- Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability — arXiv:2508.04017, published Aug 6