Better-aligned models obtained by weak-to-strong model extrapolation (ExPO)
- Weak-to-Strong Extrapolation Expedites Alignment
  Paper • 2404.16792 • Published • 10
- chujiezheng/Smaug-Llama-3-70B-Instruct-ExPO
  Text Generation • Updated • 416 • 2
- chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO
  Text Generation • Updated • 262 • 7
- chujiezheng/LLaMA3-iterative-DPO-final-ExPO
  Text Generation • Updated • 429