A collection of multiple benchmarks for large reasoning model evaluation
datasets-and-models
non-profit
models (37)
guanning-ai/20260102-p_normalization_step4000 • 0.4B • 29
guanning-ai/20260102-grpo_step4000 • 0.4B • 58
guanning-ai/smollm-gsm8k-pnorm-ckpt4900 • 0.4B • 11
guanning-ai/smollm-gsm8k-grpo-ckpt3900 • 0.4B • 6
guanning-ai/smollm-gsm8k-grpo-ckpt1000 • 0.4B • 17
guanning-ai/maze_sft_weights_1207
guanning-ai/Gai
guanning-ai/1027-math4b-bz1024-pposz128-rollout4-seed20
guanning-ai/1024-1.5b-knk23-debug1004
guanning-ai/1024-jspo-4b-lr1e-6-bz64-pposz32-rollout4-seed6
datasets (132)
guanning-ai/minervamath • 272
guanning-ai/smollm-gsm8k-data-1024 • 7.65M • 9
guanning-ai/gsm8k-metamath • 160k • 24
guanning-ai/gsm8k-mumath • 92k • 18
guanning-ai/gsm8k-mugglemath • 157k • 9
guanning-ai/openr1-93K • 93.7k • 24
guanning-ai/Polaris-53K • 53.3k • 14
guanning-ai/maze_11x11_1m • 7
guanning-ai/maze_13x13_1m • 5
guanning-ai/maze_15x15_1m • 12