DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 84
Budget-Aware Tool-Use Enables Effective Agent Scaling Paper • 2511.17006 • Published Nov 21, 2025 • 29
Budget-Aware Tool-Use Enables Effective Agent Scaling Paper • 2511.17006 • Published Nov 21, 2025 • 29 • 2
VISTA: A Test-Time Self-Improving Video Generation Agent Paper • 2510.15831 • Published Oct 17, 2025 • 20
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30, 2025 • 43
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models Paper • 2504.13367 • Published Apr 17, 2025 • 26
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting Paper • 2503.00784 • Published Mar 2, 2025 • 13