Ranger Loh
carrobot
AI & ML interests
None yet
Recent Activity
upvoted a paper 4 days ago: Language Models Can Learn from Verbal Feedback Without Scalar Rewards
upvoted a paper 4 months ago: Through the Valley: Path to Effective Long CoT Training for Small Language Models
commented on a paper over 1 year ago: Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning
Organizations
None yet