Paprika
Collection related to the paper "Training a Generally Curious Agent" (Project page: https://paprika-llm.github.io/)
ftajwar/paprika_Meta-Llama-3.1-8B-Instruct Text Generation • 8B • Updated Mar 5, 2025 • 1 • 2
ftajwar/paprika_SFT_dataset Viewer • Updated Mar 13, 2025 • 17.2k • 10 • 3
ftajwar/paprika_preference_dataset Viewer • Updated Mar 13, 2025 • 5.26k • 10 • 1
ftajwar/paprika_Meta-Llama-3.1-8B-Instruct_SFT_only Text Generation • 8B • Updated Nov 4, 2025 • 1

Self-Rewarding-LLM-Training
Datasets from the paper "Can Large Reasoning Models Self-Train?" (Project information: https://self-rewarding-llm-training.github.io/)
ftajwar/deduplicated_dapo_dataset Viewer • Updated May 28, 2025 • 17.4k • 76 • 1
ftajwar/srt_test_dataset Viewer • Updated May 28, 2025 • 273 • 45
ftajwar/dapo_easy_one_third_sorted_by_pass_rate Viewer • Updated May 28, 2025 • 5.8k • 21
ftajwar/dapo_easy_one_third_sorted_by_frequency_of_majority_answer Viewer • Updated May 28, 2025 • 5.8k • 26