AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published 5 days ago • 8
Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining Paper • 2405.14908 • Published 4 days ago • 7
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner Paper • 2405.14979 • Published 4 days ago • 9
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published 3 days ago • 9
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published 4 days ago • 19
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Paper • 2405.15574 • Published 3 days ago • 28
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published 3 days ago • 32
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published 4 days ago • 11
Article GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing By NicoNico • 2 days ago • 8
Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages • 4 days ago • 8
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Paper • 2405.11273 • Published 9 days ago • 13
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 4 days ago • 29
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Paper • 2404.16019 • Published Apr 24 • 1
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 20 items • Updated 5 days ago • 286
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published 7 days ago • 25
Personalized Residuals for Concept-Driven Text-to-Image Generation Paper • 2405.12978 • Published 6 days ago • 8
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published 6 days ago • 20
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Paper • 2405.12979 • Published 6 days ago • 7
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published 6 days ago • 22
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published 8 days ago • 10
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching Paper • 2405.11252 • Published 9 days ago • 11
Towards Modular LLMs by Building and Reusing a Library of LoRAs Paper • 2405.11157 • Published 10 days ago • 22
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework Paper • 2405.11143 • Published 8 days ago • 31
Imp-v1.5 Collection A series of Imp models with different LLM backbones. • 5 items • Updated 6 days ago • 3
Imp: Highly Capable Large Multimodal Models for Mobile Devices Paper • 2405.12107 • Published 7 days ago • 23
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published 7 days ago • 38
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published 8 days ago • 48
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 48
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published 10 days ago • 19
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • 6 days ago • 18
Observational Scaling Laws and the Predictability of Language Model Performance Paper • 2405.10938 • Published 10 days ago • 10
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • 6 days ago • 22
HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach Paper • 2404.01094 • Published Apr 1 • 4
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Paper • 2405.09874 • Published 11 days ago • 14
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Paper • 2405.10300 • Published 11 days ago • 22
CAT3D: Create Anything in 3D with Multi-View Diffusion Models Paper • 2405.10314 • Published 11 days ago • 37
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published 12 days ago • 24
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published 12 days ago • 92
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 2 days ago • 18
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • 11 days ago • 16
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Paper • 2405.09546 • Published 12 days ago • 9
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published 12 days ago • 14
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published 12 days ago • 22
SpeechVerse: A Large-scale Generalizable Audio Language Model Paper • 2405.08295 • Published 14 days ago • 10
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Paper • 2405.08317 • Published 14 days ago • 8
Understanding the performance gap between online and offline alignment algorithms Paper • 2405.08448 • Published 13 days ago • 11
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Paper • 2405.08344 • Published 13 days ago • 10