RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.0-step-675 Text Generation • 7B • Updated about 9 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.0-step-300 Text Generation • 7B • Updated about 9 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.25-step-675 Text Generation • 7B • Updated about 18 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.25-step-520 Text Generation • 7B • Updated about 18 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-1.0-step-500 Text Generation • 7B • Updated 2 days ago • 138