stepfun-ai/GELab-Zero-4B-preview Image-Text-to-Text • 4B • Updated Dec 19, 2025 • 5.47k • 148
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published 30 days ago • 83
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards Paper • 2509.24981 • Published Sep 29, 2025 • 29