A good model for benchmarks, but useless in daily use because it thinks too much

#2
by wh1018 - opened

It always thinks for minutes

Yes, it thinks for a very long time...

Nanbeige LLM Lab org

Thanks for the comment.

To clarify, Nanbeige4-3B is primarily a research model aimed at exploring the capability boundaries of small-scale language models. Its strong performance isn’t limited to benchmarks—we’ve also received positive feedback from real-world user cases, such as https://www.reddit.com/r/LocalLLaMA/comments/1pj3q4q/nanbeige43b_lightweight_with_strong_reasoning/

That said, we fully acknowledge the issue you raised: in the current open-source 2511 release version, we haven’t included explicit length control or length-based penalties. This can indeed lead the model to sometimes “overthink.” We recognize this as a clear area for improvement and are actively exploring techniques to encourage more efficient, context-appropriate reasoning in future versions.

leran1995 changed discussion status to closed
