RLHF-And-Friends
community
AI & ML interests
None defined yet.
models 9
RLHF-And-Friends/Llama-3.2-3B-Instruct-DPO-Math • Text Generation • 469
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math-SF • Text Generation • 8
RLHF-And-Friends/Llama-3.2-3B-Instruct • Text Generation • 562
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math • 50
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit • 51
RLHF-And-Friends/Llama3.1-8B • 30
RLHF-And-Friends/Llama3.1-8B-DPO-0.05 • 53
RLHF-And-Friends/Zephyr-7B-DPO-0.05 • 51
RLHF-And-Friends/Zephyr-SFT-7B • 45
datasets
None public yet