Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 9 days ago • 79
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies Paper • 2512.02581 • Published 26 days ago • 14
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25 • 31
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems Paper • 2508.01415 • Published Aug 2 • 7
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems Paper • 2508.01415 • Published Aug 2 • 7 • 2
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22 • 40
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data +7 Jun 3 • 299
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete Paper • 2502.21257 • Published Feb 28 • 2
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 306