shubhamprshr/Qwen2.5-1.5B-Instruct_gsm8k_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 20, 2025 • 5
shubhamprshr/Qwen2.5-1.5B-Instruct_gsm8k_grpo_gaussian_0.25_0.75_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 20, 2025 • 3
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 3
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 2
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 3
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18, 2025 • 2
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1200 Text Generation • 2B • Updated Nov 17, 2025 • 5
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_gaussian_0.25_0.75_SEC0.3DRO1.0G0.0_minpTrue_1200 Text Generation • 2B • Updated Nov 17, 2025 • 6