
Zhang Xu

texzhang
·
  • CheungXu

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
upvoted a paper 19 days ago
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
upvoted a paper 25 days ago
From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR

Organizations

None yet

Collections 2

LLM
  • On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

    Paper • 2508.05629 • Published Aug 7 • 180
multi-image
  • MANTIS: Interleaved Multi-Image Instruction Tuning

    Paper • 2405.01483 • Published May 2, 2024 • 6

models 0

None public yet

datasets 0

None public yet