arxiv:2505.15055
Hongli Zhou
Joe-Hall-Lee
AI & ML interests
Large Language Models
Recent Activity
upvoted a paper about 2 months ago
Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory
liked a Space 4 months ago
allenai/reward-bench
authored a paper 5 months ago
Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory
Organizations
None yet