Omar Sanseviero

osanseviero

https://osanseviero.github.io/hackerllama/

AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.🦙

Articles

Welcome Llama 3 - Meta's new open LLM

CodeGemma - an official Google release for code LLMs

🪆 Introduction to Matryoshka Embedding Models

Welcome Gemma - Google's new open LLM

Constitutional AI with Open LLMs

Preference Tuning LLMs with Direct Preference Optimization Methods

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Mixture of Experts Explained

Inference for PROs

Spread Your Wings: Falcon 180B is here

Code Llama: Llama 2 learns to code

Results of the Open Source AI Game Jam

Llama 2 is here - get it on Hugging Face

The Falcon has landed in the Hugging Face ecosystem

Hugging Face Machine Learning Demos on arXiv

What's new in Diffusers? 🎨

Announcing Evaluation on the Hub

An Introduction to Deep Reinforcement Learning

Welcome spaCy to the 🤗 Hub

Sentence Transformers in the 🤗 Hub

osanseviero's activity

upvoted 6 papers about 1 hour ago

AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct

Paper • 2405.14906 • Published 5 days ago • 8

Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining

Paper • 2405.14908 • Published 4 days ago • 7

CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

Paper • 2405.14979 • Published 4 days ago • 9

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

Paper • 2405.15319 • Published 3 days ago • 9

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

Paper • 2405.15071 • Published 4 days ago • 19

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published 3 days ago • 28

upvoted a collection about 1 hour ago

ConvLLaVA

A collection of ConvLLaVA models. • 9 items • Updated 2 days ago • 2

upvoted a paper about 1 hour ago

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published 3 days ago • 32

upvoted 2 papers about 9 hours ago

Aya 23: Open Weight Releases to Further Multilingual Progress

Paper • 2405.15032 • Published 4 days ago • 11

The Road Less Scheduled

Paper • 2405.15682 • Published 3 days ago • 10

upvoted an article about 10 hours ago

Article

GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing

2 days ago

• 8

upvoted an article about 11 hours ago

Article

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

4 days ago

• 8

upvoted a paper 1 day ago

Transformers Can Represent n-gram Language Models

Paper • 2404.14994 • Published Apr 23 • 18

upvoted a paper 3 days ago

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

Paper • 2405.11273 • Published 9 days ago • 13

upvoted an article 3 days ago

Article

AI has a problem with objectifying women

3 days ago

• 41

upvoted a collection 4 days ago

C4AI Aya 23

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 4 days ago • 29

upvoted a paper 5 days ago

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

Paper • 2404.16019 • Published Apr 24 • 1

upvoted an article 5 days ago

Article

Introducing Spaces Dev Mode for a seamless developer experience

7 days ago

• 7

upvoted a collection 5 days ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 20 items • Updated 5 days ago • 286

upvoted 8 papers 5 days ago

Diffusion for World Modeling: Visual Details Matter in Atari

Paper • 2405.12399 • Published 7 days ago • 25

Personalized Residuals for Concept-Driven Text-to-Image Generation

Paper • 2405.12978 • Published 6 days ago • 8

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Paper • 2405.12970 • Published 6 days ago • 20

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

Paper • 2405.12979 • Published 6 days ago • 7

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published 6 days ago • 22

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published 8 days ago • 116

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Paper • 2405.11582 • Published 8 days ago • 10

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

Paper • 2405.11252 • Published 9 days ago • 11

upvoted 3 papers 6 days ago

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Paper • 2405.11157 • Published 10 days ago • 22

Octo: An Open-Source Generalist Robot Policy

Paper • 2405.12213 • Published 7 days ago • 22

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published 8 days ago • 31

upvoted a collection 6 days ago

Imp-v1.5

A series of Imp models with different LLM backbone. • 5 items • Updated 6 days ago • 3

upvoted 3 papers 6 days ago

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published 7 days ago • 23

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published 7 days ago • 38

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published 8 days ago • 48

upvoted an article 6 days ago

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24

• 48

upvoted 2 papers 7 days ago

Grounded 3D-LLM with Referent Tokens

Paper • 2405.10370 • Published 11 days ago • 7

INDUS: Effective and Efficient Language Models for Scientific Applications

Paper • 2405.10725 • Published 10 days ago • 19

upvoted an article 7 days ago

Article

Enjoy the Power of Phi-3 with ONNX Runtime on your device

6 days ago

• 18

upvoted a paper 7 days ago

Observational Scaling Laws and the Predictability of Language Model Performance

Paper • 2405.10938 • Published 10 days ago • 10

upvoted a paper 8 days ago

Toon3D: Seeing Cartoons from a New Perspective

Paper • 2405.10320 • Published 11 days ago • 19

upvoted an article 8 days ago

Article

Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI.

6 days ago

• 22

upvoted an article 9 days ago

Article

What is going on with AlphaFold3?

6 days ago

• 8

upvoted 7 papers 10 days ago

HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

Paper • 2404.01094 • Published Apr 1 • 4

Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion

Paper • 2405.09874 • Published 11 days ago • 14

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published 11 days ago • 22

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Paper • 2405.10314 • Published 11 days ago • 37

Many-Shot In-Context Learning in Multimodal Foundation Models

Paper • 2405.09798 • Published 12 days ago • 24

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published 12 days ago • 70

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published 12 days ago • 92

upvoted 2 articles 11 days ago

Article

Train custom AI models with the trainer API and adapt them to 🤗

2 days ago

• 18

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

11 days ago

• 16

upvoted 3 papers 11 days ago

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Paper • 2405.09546 • Published 12 days ago • 9

Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Paper • 2405.09215 • Published 12 days ago • 14

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models

Paper • 2405.09220 • Published 12 days ago • 22

upvoted 2 articles 11 days ago

Article

2024-04-22 - Hub Incident Post Mortem

10 days ago

• 15

Article

Hugging Face + Google Visual Blocks

11 days ago

• 17

upvoted 4 papers 12 days ago

SpeechVerse: A Large-scale Generalizable Audio Language Model

Paper • 2405.08295 • Published 14 days ago • 10

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Paper • 2405.08317 • Published 14 days ago • 8

Understanding the performance gap between online and offline alignment algorithms

Paper • 2405.08448 • Published 13 days ago • 11

No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding

Paper • 2405.08344 • Published 13 days ago • 10