Yaqi Duan

duanyq

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting

upvoted a paper 11 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

authored a paper 11 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

Organizations

None yet

Papers 1

arxiv:2502.04270

Models 0

None public yet

Datasets 0

None public yet