mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL By driaforall and 1 other • 14 days ago • 21
SyGra: The One-Stop Framework for Building Data for LLMs and SLMs By ServiceNow-AI and 3 others • 3 days ago • 9
Qianfan-VL: A Milestone Achievement in Chinese Multimodal AI with Domestic Chips By baidu • about 15 hours ago • 7
AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models By imomayiz and 4 others • 9 days ago • 14
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 221
Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason! By Writer and 1 other • 13 days ago • 58
🌎 What kind of environmental impacts are AI companies disclosing? (And can we compare them?) 🌎 By sasha and 1 other • 8 days ago • 10
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 70
🥬 TinyLettuce: Efficient Hallucination Detection with 17–68M Encoders By adaamko and 1 other • 25 days ago • 13