koutch/short_paper_llama_llama3.1-8b_train_sft_train_think Text Generation • 8B • Updated 1 day ago • 23
koutch/short_paper_llama_llama3.1-8b_train_sft_train_no_think Text Generation • 8B • Updated 1 day ago • 73
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train_think Text Generation • 4B • Updated 1 day ago • 20
koutch/short_paper_smol_smol3-3B_train_sft_train_no_think Text Generation • 3B • Updated 1 day ago • 80
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train_no_think Text Generation • 4B • Updated 1 day ago • 78
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train Text Generation • 4B • Updated 1 day ago • 26