Evangelinejy/qwen25-7b-prm_demo-bs2-epoch3.0-ctx4096-ga2-lr1e-05-wr0.1-n4 Text Generation • 333k • Updated 6 days ago • 10
Evangelinejy/qwen25-7b-prm_demo-bs2-epoch3.0-ctx4096-ga2-lr5e-05-wr0.1-n4 333k • Updated 6 days ago • 9
Evangelinejy/llama-32-3b-instruct-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga2-lr1e-05-wr0.1-n4 175k • Updated Nov 22 • 3
Evangelinejy/llama3b-instruct-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga2-lr1e-05-wr0.1-n4 175k • Updated Nov 22 • 3
Evangelinejy/llama3b-base-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 15 • 143
Evangelinejy/llama3b-midtrain-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 15 • 63
Evangelinejy/octothinker-3b-short-base-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 15 • 4
Evangelinejy/octothinker-3b-hybrid-base-open-thoughts114k_math-bs4-epoch1.0-ctx8192-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 15 • 90
Evangelinejy/octothinker-hybrid-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga1-lr5e-06-wr0.1-n4 175k • Updated Nov 12 • 69
Evangelinejy/llama3b-midtrain-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga1-lr5e-06-wr0.1-n4 175k • Updated Nov 12 • 49
Evangelinejy/200b-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga1-lr5e-06-wr0.1-n4 Updated Nov 12
Evangelinejy/llama-32-3b-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga1-lr5e-06-wr0.1-n4 175k • Updated Nov 12 • 54
Evangelinejy/octothinker-short-data_sft_50k_leon_nemotron_thinking-bs4-epoch1.0-ctx8192-ga1-lr5e-06-wr0.1-n4 175k • Updated Nov 12 • 4
Evangelinejy/octothinker-3b-short-base-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 12 • 3
Evangelinejy/octothinker-3b-hybrid-base-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 12 • 4
Evangelinejy/qwen25-15b-data_sft_50k_leon_nemotron-bs4-epoch3.0-ctx4096-ga1-lr1e-05-wr0.1-n4 2B • Updated Nov 10 • 4
Evangelinejy/qwen15b-midtrain-data_sft_50k_leon_nemotron-bs4-epoch3.0-ctx4096-ga1-lr1e-05-wr0.1-n4 2B • Updated Nov 10 • 4
Evangelinejy/llama3b-midtrain-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 10 • 4
Evangelinejy/qwen15b-midtrain-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-06-wr0.1-n4 2B • Updated Nov 10 • 4
Evangelinejy/qwen15b-midtrain-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 2B • Updated Nov 9 • 4
Evangelinejy/qwen25-15b-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 2B • Updated Nov 9 • 3
Evangelinejy/llama-32-3b-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-05-wr0.1-n4 175k • Updated Nov 9 • 3
Evangelinejy/llama-32-3b-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr0.0001-wr0.1-n4 175k • Updated Nov 9 • 4
Evangelinejy/llama-32-3b-data_sft_50k_leon_nemotron-bs4-epoch1.0-ctx4096-ga1-lr1e-06-wr0.1-n4 175k • Updated Nov 9 • 4
Evangelinejy/Qwen2.5-1.5B-Instruct_MATH-lighteval_epoch_1_bs_4_lr_2e-05_length_4096_G_7 Text Generation • 2B • Updated Mar 1 • 9
Evangelinejy/Qwen2.5-1.5B-Instruct_MATH-lighteval_epoch_1_bs_4_lr_2e-05_length_4096_G_8 Updated Mar 1
Evangelinejy/Qwen2.5-1.5B-Instruct_MATH-lighteval_epoch_1_bs_8_lr_2e-05_length_2048_G_16 Text Generation • 2B • Updated Feb 28 • 8