PP-StructureV3 Collection PP-StructureV3 is a SOTA document parsing solution on OmniDocBench, supporting the conversion of PDFs and document images to Markdown and JSON. • 17 items • Updated Sep 15, 2025 • 12
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 11 days ago • 54
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 11 days ago • 85
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 3 days ago • 108
Bolmo: Byteifying the Next Generation of Language Models Paper • 2512.15586 • Published 17 days ago • 13
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory Paper • 2504.19413 • Published Apr 28, 2025 • 36
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 125
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 16 days ago • 90
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 11 days ago • 59
Long-context post-training 🧶 Collection Resources for post-training LLMs with long-context samples • 5 items • Updated Sep 14, 2025 • 6
VL-JEPA: Joint Embedding Predictive Architecture for Vision-language Paper • 2512.10942 • Published 23 days ago • 22