Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2509.02547

Sports Predictive Modeling

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

Reinforcement learning

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 301
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 210

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 129
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
PRewrite: Prompt Rewriting with Reinforcement Learning

Paper • 2401.08189 • Published Jan 16, 2024
UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Paper • 2509.11543 • Published Sep 15 • 47

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

Large Language Models

Universal Deep Research: Bring Your Own Model and Strategy

Paper • 2509.00244 • Published Aug 29 • 13
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1 • 39
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

about 1 month ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 121

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Reinforcement Learning Foundations for Deep Research Systems: A Survey

Paper • 2509.06733 • Published Sep 8 • 32

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2 • 25
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 75

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

Sports Predictive Modeling

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

Large Language Models

Universal Deep Research: Bring Your Own Model and Strategy

Paper • 2509.00244 • Published Aug 29 • 13
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1 • 39
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

Reinforcement learning

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101

about 1 month ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 121

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 301
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 210

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Reinforcement Learning Foundations for Deep Research Systems: A Survey

Paper • 2509.06733 • Published Sep 8 • 32

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 129
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
PRewrite: Prompt Rewriting with Reinforcement Learning

Paper • 2401.08189 • Published Jan 16, 2024
UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Paper • 2509.11543 • Published Sep 15 • 47

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2 • 25
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225
Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 75

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 225

Previous
1
2
3
4
5
6
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs