Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values Paper • 2510.20187 • Published Oct 23, 2025
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding Paper • 2408.08252 • Published Aug 15, 2024
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning Paper • 2305.04819 • Published May 8, 2023