Tim Tang
timxiaohangt
AI & ML interests
Reinforcement Learning, Game Theory
Recent Activity
Updated a model 4 days ago: diffusion-reasoning/LLaDA-8B-Instruct-MDPO-math
Published a model 4 days ago: diffusion-reasoning/LLaDA-8B-Instruct-MDPO-math
Updated a model about 1 month ago: RegularizedSelfPlay/Llama-3-8B-Instruct-SPPO-Iter2-gp-8b-gpm-reg0.5-sppo-reversekl-table