Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 211
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 16 days ago • 108
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published 16 days ago • 199
Domain Reasoners: CoT Series Collection Specialized reasoning models for different domains, all using the same step-by-step Chain-of-Thought format. Over 20,000 downloads combined. • 3 items • Updated 17 days ago • 5
MiniCPM-o & MiniCPM-V Collection Multimodal models with leading performance. • 28 items • Updated Sep 1, 2025 • 59
Finetunes | SLMs and LLMs Collection Various variants of LLMs finetuned using proprietary data. • 26 items • Updated Aug 17, 2025 • 4
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data +7 Jun 3, 2025 • 305
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21, 2025 • 247
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated 11 days ago • 140
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 252
GraphRAG Papers Collection Research relating graphs and GenAI. For discussion, find dedicated threads on https://discord.gg/graphrag • 52 items • Updated Sep 17, 2025 • 49
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15, 2025 • 222
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 480
LightRAG: Simple and Fast Retrieval-Augmented Generation Paper • 2410.05779 • Published Oct 8, 2024 • 27