FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models • Paper • 2505.02735 • Published May 5, 2025
OpenMathReasoning • Collection • Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 17 days ago
Heimdall: test-time scaling on the generative verification • Paper • 2504.10337 • Published Apr 14, 2025