A good model for benchmarks, but useless in daily use because it thinks too much.
It always thinks for minutes.
Yes, it thinks for a very long time...
Thanks for the comment.
To clarify, Nanbeige4-3B is primarily a research model aimed at exploring the capability boundaries of small-scale language models. Its strong performance isn't limited to benchmarks; we've also received positive feedback from real-world use cases, such as https://www.reddit.com/r/LocalLLaMA/comments/1pj3q4q/nanbeige43b_lightweight_with_strong_reasoning/
That said, we fully acknowledge the issue you raised: the current open-source 2511 release does not include explicit length control or length-based penalties, which can indeed lead the model to "overthink" at times. We recognize this as a clear area for improvement and are actively exploring techniques to encourage more efficient, context-appropriate reasoning in future versions.
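In the meantime, one possible inference-side workaround (not an official fix) is to cap the reasoning budget yourself and force-close the thinking block when it runs over. Below is a minimal sketch assuming the model is loaded through Hugging Face transformers and emits `<think> ... </think>` style reasoning; the repo id, budget value, and helper name are illustrative, not official recommendations.

```python
# Minimal sketch: cap the model's "thinking" budget at inference time.
# Assumes a <think> ... </think> reasoning format; the repo id and budget
# below are placeholders -- adjust them to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Nanbeige/Nanbeige4-3B"  # assumption: replace with the actual HF repo id
THINK_BUDGET = 1024                  # max tokens allowed for the reasoning segment

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_with_think_budget(prompt: str, max_answer_tokens: int = 512) -> str:
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    # First pass: let the model reason, but only up to THINK_BUDGET new tokens.
    out = model.generate(**inputs, max_new_tokens=THINK_BUDGET, do_sample=False)
    completion = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False
    )

    # If the reasoning block closed on its own within the budget, return as-is.
    if "</think>" in completion:
        return completion

    # Budget exhausted: force-close the reasoning block and ask for the answer.
    forced = text + completion + "\n</think>\n"
    inputs = tokenizer(forced, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_answer_tokens, do_sample=False)
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

print(generate_with_think_budget("How many prime numbers are there below 50?"))
```

This only bounds latency per request; it does not make the reasoning itself more efficient, which is what the length-aware training we mentioned above is meant to address.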