When Models Outthink Their Safety: Mitigating Self-Jailbreak in Large Reasoning Models with Chain-of-Guardrails • Paper: arXiv 2510.21285 • Published Oct 24