ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search • Paper • 2406.03816 • Published Jun 6, 2024
SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning • Paper • 2401.07950 • Published Jan 15, 2024
ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding • Paper • 2508.19576 • Published Aug 27, 2025