Jake De
goforit123
AI & ML interests
None yet
Recent Activity
updated a collection about 1 month ago: RL
updated a collection about 1 month ago: LLM
updated a model about 1 month ago: goforit123/poca-SoccerTwos
Organizations
None yet
RL
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
  Paper • 2504.20571 • Published • 98
- One RL to See Them All: Visual Triple Unified Reinforcement Learning
  Paper • 2505.18129 • Published • 61
- Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
  Paper • 2503.16219 • Published • 52
- Performance Trade-offs of Optimizing Small Language Models for E-Commerce
  Paper • 2510.21970 • Published • 2
LLM
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
  Paper • 2511.06221 • Published • 131
- Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey
  Paper • 2511.07448 • Published • 2
- Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
  Paper • 2511.16043 • Published • 108
models 15
goforit123/poca-SoccerTwos • Reinforcement Learning • Updated • 9
goforit123/rl_course_vizdoom_health_gathering_supreme • Reinforcement Learning • Updated
goforit123/custom-ppo-LunarLander-v2 • Reinforcement Learning • Updated
goforit123/goforit123 • Updated
goforit123/ppo-Pyramids • Reinforcement Learning • Updated • 3
goforit123/Pixelcopter-PLE-v0 • Reinforcement Learning • Updated
goforit123/ppo-SnowballTarget • Reinforcement Learning • Updated • 4
goforit123/CartPole-v1 • Reinforcement Learning • Updated
goforit123/dqn-SpaceInvadersNoFrameskip-v4 • Reinforcement Learning • Updated • 4
goforit123/a2c-PandaReachDense-v3 • Reinforcement Learning • Updated • 1
datasets 0
None public yet