Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 Paper • 2405.00664 • Published May 1 • 18
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 104
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Paper • 2405.08707 • Published 30 days ago • 27
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgrades versions trained using Universal sampling • 7 items • Updated 29 days ago • 14
Bee Models 🍯 Collection models fine-tuned to be knowledgeable about apiary practice • 6 items • Updated Apr 29 • 1
Arctic-embed Collection A collection of text embedding models optimized for retrieval accuracy and efficiency • 5 items • Updated Apr 17 • 11
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 583
ScreenAI: A Vision-Language Model for UI and Infographics Understanding Paper • 2402.04615 • Published Feb 7 • 32
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 7 days ago • 193
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 81
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Paper • 2401.01325 • Published Jan 2 • 25
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 55
Nougat ONNX Collection Faster Nougat in ONNX format (optimum onnxruntime) • 6 items • Updated Feb 24 • 1
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure Paper • 2311.07590 • Published Nov 9, 2023 • 15
smol llama Collection 🚧"raw" pretrained smol_llama checkpoints - WIP 🚧 • 4 items • Updated Apr 29 • 5