tran minh thang
thangtm
AI & ML interests
None yet
Recent Activity
Updated a collection 1 day ago: reasoning_model
Organizations
None yet
reasoning_model
- OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
  Paper • 2511.16334 • Published • 91
- Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
  Paper • 2509.07980 • Published • 101
- ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
  Paper • 2509.04475 • Published • 3
- Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
  Paper • 2512.01374 • Published • 93
RL
- Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
  Paper • 2510.20150 • Published • 4
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
  Paper • 2511.06221 • Published • 131
- We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
  Paper • 2508.10433 • Published • 144
- Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
  Paper • 2512.01374 • Published • 93
RAG
OCR
- PaddlePaddle/PaddleOCR-VL
  Image-Text-to-Text • 1.0B • Updated • 17.1k • 1.43k
- nanonets/Nanonets-OCR2-3B
  Image-Text-to-Text • 4B • Updated • 97.1k • 466
- deepseek-ai/DeepSeek-OCR
  Image-Text-to-Text • 3B • Updated • 4.14M • 3.01k
- MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
  Paper • 2509.22186 • Published • 139
flow_matching_model
DLM
- TiDAR: Think in Diffusion, Talk in Autoregression
  Paper • 2511.08923 • Published • 117
- Diffusion Language Models are Super Data Learners
  Paper • 2511.03276 • Published • 127
- What Makes Diffusion Language Models Super Data Learners?
  Paper • 2510.04071 • Published
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
  Paper • 2512.15745 • Published • 77
ARC
Reduce_thinking
- FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
  33B • Updated • 41 • 129
- Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency
  Paper • 2506.08343 • Published • 54
- Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
  Paper • 2509.26226 • Published • 33
data