Article: You could have designed state of the art positional encoding • Nov 25, 2024 • 430
The Smol Training Playbook 📚 (Featured) • 2.82k • The secrets to building world-class LLMs
The Ultra-Scale Playbook 🌌 • 3.63k • The ultimate guide to training LLM on large GPU Clusters
deepseek-ai/DeepSeek-V3-0324 • Text Generation • 685B • Updated Mar 27, 2025 • 246k • 3.08k
nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 5.35k • 640