Yaswanth Kumar

yaswanth2761

AI & ML interests

NLP, Computer Vision, Deep Learning, Reinforcement Learning

Recent Activity

upvoted an article 2 months ago

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

liked a Space 4 months ago

HuggingFaceFW/blogpost-fineweb-v1

liked a Space 4 months ago

HuggingFaceTB/smol-training-playbook

upvoted an article 2 months ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

Dec 15, 2025 • 108

liked 2 Spaces 4 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.29k

Download a trillion‑token web text dataset for LLM training

The Smol Training Playbook

📚

2.98k

The secrets to building world-class LLMs

liked a model 4 months ago

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • Updated Nov 4, 2025 • 2.99M • 3.15k

liked 2 models 6 months ago

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 5.54M • 4.36k

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 3.33M • 4.49k

upvoted an article 7 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8, 2025 • 756

upvoted a paper 7 months ago

Energy-Based Transformers are Scalable Learners and Thinkers

Paper • 2507.02092 • Published Jul 2, 2025 • 69

liked a dataset 7 months ago

bird-of-paradise/transformer-from-scratch-tutorial

Updated Jan 12 • 91 • 36

liked a Space 12 months ago

The Ultra-Scale Playbook

🌌

3.69k

The ultimate guide to training LLM on large GPU Clusters
