Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels Paper • 2509.16596 • Published Sep 20, 2025 • 14
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments Paper • 2508.08791 • Published Aug 12, 2025 • 16
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published May 12, 2025 • 11
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition Paper • 2402.14568 • Published Feb 22, 2024 • 1
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios Paper • 2401.00741 • Published Jan 1, 2024 • 1
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models Paper • 2303.10420 • Published Mar 18, 2023 • 1
Empirical Insights on Fine-Tuning Large Language Models for Question-Answering Paper • 2409.15825 • Published Sep 24, 2024 • 1
TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use Paper • 2412.15495 • Published Dec 20, 2024 • 1
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use Paper • 2501.02506 • Published Jan 5, 2025 • 10