Kimi Linear: An Expressive, Efficient Attention Architecture • Paper 2510.26692 • Published Oct 30, 2025
The Smol Training Playbook 📚 • The secrets to building world-class LLMs
deepseek-ai/DeepSeek-V3.2-Exp • Text Generation • 685B • Updated Nov 18, 2025
nvidia/NVIDIA-Nemotron-Nano-9B-v2 • Text Generation • 9B
ByteDance-Seed/Seed-OSS-36B-Instruct • Text Generation • 36B • Updated Aug 26, 2025
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper 2506.13585 • Published Jun 16, 2025
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding • Paper 2505.22618 • Published May 28, 2025
Inference-Time Hyper-Scaling with KV Cache Compression • Paper 2506.05345 • Published Jun 5, 2025