Eric Xian

ericxian1997

AI & ML interests

None yet

Recent Activity

liked a dataset 3 days ago

nvidia/Nemotron-Pretraining-Specialized-v1

Viewer • Updated 11 days ago • 60.7M • 8.91k • 59

upvoted a paper about 2 months ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 37

liked a model about 2 months ago

briaai/FIBO

Text-to-Image • Updated Nov 13, 2025 • 13.4k • 291

upvoted a paper 2 months ago

Does your data spark joy? Performance gains from domain upsampling at the end of training

Paper • 2406.03476 • Published Jun 5, 2024 • 4

liked a dataset 3 months ago

yczhuang/Hephaestus-Forge

Viewer • Updated Sep 8, 2025 • 3.81k • 1.52k • 1

upvoted a collection 3 months ago

DeepSeek-V3.2

Collection

4 items • Updated Dec 1, 2025 • 511

liked a model 4 months ago

agentica-org/DeepScaleR-1.5B-Preview

Text Generation • 2B • Updated Apr 9, 2025 • 21.3k • 577

liked a dataset 4 months ago

nvidia/Nemotron-CC-v2

Viewer • Updated 11 days ago • 8.79B • 49.2k • 96

liked a model 4 months ago

ByteDance-Seed/Seed-OSS-36B-Instruct

Text Generation • 36B • Updated Aug 26, 2025 • 9.14k • 469

liked 3 models 6 months ago

liked a Space 7 months ago

TxT360: Trillion Extracted Text

📖

131

Explore and analyze the TxT360 dataset for LLM pre-training

liked a dataset 9 months ago

HuggingFaceTB/smollm-corpus

Viewer • Updated Sep 6, 2024 • 237M • 14.6k • 408

upvoted a paper 9 months ago

MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion

Paper • 2502.04235 • Published Feb 6, 2025 • 23

liked a dataset 10 months ago

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 1.24M • 88

liked a model 10 months ago

jinaai/jina-embeddings-v3

Feature Extraction • 0.6B • Updated Feb 24, 2025 • 4.61M • 1.11k

liked a Space 10 months ago

The Ultra-Scale Playbook

🌌

3.62k

The ultimate guide to training LLM on large GPU Clusters

liked a dataset 10 months ago

PrimeIntellect/SYNTHETIC-1

Viewer • Updated Feb 21, 2025 • 1.99M • 1.22k • 60

liked a model 11 months ago

perplexity-ai/r1-1776

Text Generation • 671B • Updated Feb 26, 2025 • 570 • 2.33k