All HF Hub posts

PatrickHaller posted an update 3 minutes ago
How Robust Is Your Model in Complex Code Generation Tasks? 🤔

We've launched the PECC benchmark to challenge chat models in code generation, drawing on Advent of Code for programming tasks and Project Euler for math-heavy challenges. The benchmark presents problems in both detailed prose and concise "leetcode" styles, evaluating models' ability to understand and solve complex coding and math problems in chat-based interactions.

It seems that the Claude 3 models outperform ChatGPT:
Model / Avg. (pass@3)
Claude 3 Haiku / 27.67
GPT-3.5-Turbo / 23.75
Mixtral-8x22B-Instruct-v0.1 / 8.35
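
For context, pass@3 is the probability that at least one of three sampled solutions is correct. As a minimal sketch, here is the standard unbiased pass@k estimator from Chen et al. (2021); whether PECC computes it exactly this way is an assumption, see the paper for details:

from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # n generated samples, c of them correct: probability that at least
    # one of k randomly drawn samples is correct.
    if n - c < k:
        return 1.0  # every draw of k samples must contain a correct one
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=2, k=3))  # ~0.533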

Read our preprint 📃: PECC: Problem Extraction and Coding Challenges (2404.18766)
Look at the dataset 🔎: PatrickHaller/pecc
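
If you want to poke at the data yourself, a minimal sketch using the standard datasets API (the exact configs, splits, and field names are assumptions to check against the dataset card):

# pip install datasets
from datasets import load_dataset

# Load the benchmark from the Hub; a config name may be required
# depending on how the subsets are organized.
ds = load_dataset("PatrickHaller/pecc")
print(ds)  # shows available splits and columns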

We also got accepted at LREC-COLING '24 🎉
KingNish posted an update 10 minutes ago
Introducing JARVIS, Tony's voice assistant, for you.

JARVIS responds to all your questions in audio format.
Must try -> KingNish/JARVIS

Jarvis is currently equipped to accept text input and provide audio output.
In the future, it may also support audio input.

Demo video: embedded in the original post.
radames posted an update about 4 hours ago
I've built a custom component that integrates the Rerun web viewer with Gradio, making it easier to share your demos as Gradio apps.

Basic snippet:
# pip install gradio_rerun gradio
import gradio as gr
from gradio_rerun import Rerun

# Minimal app: upload one or more Rerun recording (.rrd) files and
# display them in the embedded Rerun web viewer.
gr.Interface(
    inputs=gr.File(file_count="multiple", type="filepath"),
    outputs=Rerun(height=900),
    fn=lambda file_paths: file_paths,  # pass the uploaded paths straight through
).launch()

More details here: radames/gradio_rerun
Source: https://github.com/radames/gradio-rerun-viewer

Follow Rerun here: https://huggingface.co/rerun
m-ric posted an update about 15 hours ago
💰❌ Research for the very GPU poor - Scaling laws replication

🎆 Good news: you can do cutting-edge research with a calculator and Microsoft Paint 2006!

The Chinchilla experiments (by Google DeepMind) ran hundreds of pre-trainings with models >1B parameters (I do not want to imagine how much that cost) to find the optimal ratio of model size vs training tokens. Why is this question so important?
Well, you only ever have access to a fixed compute budget, counted in FLOPs (floating point operations). So if your model is bigger, you will have less compute left to train on many tokens; and if you want to train on more tokens, your model will have to be smaller. When model trainings cost millions, you absolutely need to get this ratio right.

The new paper "Chinchilla Scaling: A replication attempt" by Epoch AI takes on the ambitious goal of reproducing this.

But since the authors do not have infinite money, they decided to work directly from DeepMind's own published results! They took the figure from the last experiment (cf. slide below), measured the point positions, matched the color codes, and reconstructed the underlying data.

💥 They then fit the scaling laws proposed by the Chinchilla authors, but arrived at wildly different results! They find that, as a rough rule of thumb, you should use 20 training tokens for each parameter in your model, instead of the 70 obtained in the original paper. They also point out inconsistencies in the paper and unrealistically narrow confidence intervals.
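
To make the rule of thumb concrete, here is a back-of-the-envelope sketch, assuming the standard approximation C ≈ 6·N·D training FLOPs for N parameters and D tokens (the budget below is just an illustrative number close to Chinchilla's):

import math

def compute_optimal(c_flops: float, tokens_per_param: float = 20.0):
    # With D = r * N and C = 6 * N * D, solve for N = sqrt(C / (6 * r)).
    n_params = math.sqrt(c_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

n, d = compute_optimal(5.9e23)  # roughly Chinchilla's training budget
print(f"model ≈ {n / 1e9:.0f}B params, data ≈ {d / 1e12:.1f}T tokens")
# -> model ≈ 70B params, data ≈ 1.4T tokens, close to the actual Chinchilla run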

➡️ This only contradicts the results from the last (out of 3) experiments in the Chinchilla paper. And the model trained at the end of the Chinchilla paper still seems properly scaled.

✅ But it does show that a tiny bit more theoretical work can go a long way, especially given the huge financial costs that such an error can have!
joaogante posted an update about 16 hours ago
Adding a long prompt can help you fight LLM hallucinations. However, if you know exactly how you want your LLM output constrained, there are much better strategies! 💪

Did you know you can force your LLM to ALWAYS generate a valid JSON file? Or to follow a well-defined answer template? You can do that and more with the 🤗 transformers-compatible outlines library.

It doesn't just let you master your LLM -- your text generation application will also become faster! 🔥 The more constrained your text generation is, the bigger the speedups you'll see!
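
For instance, here is a minimal sketch of JSON-constrained generation (API as of outlines 0.0.x; the model name and schema are illustrative only):

from pydantic import BaseModel
import outlines

class Character(BaseModel):
    name: str
    age: int

# Any transformers-compatible model can be wrapped this way.
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# During decoding, tokens that would break the Character JSON schema are
# masked out, so the output is ALWAYS valid, parseable JSON.
generator = outlines.generate.json(model, Character)
character = generator("Invent a fantasy character:")
print(character)  # a parsed Character instance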

Follow @remi and other outlines folks to stay on top of the constrained generation game 🧠
victor posted an update about 16 hours ago
The hype is real: a mysterious gpt2-chatbot model has appeared on the LLM Arena Leaderboard 👀.
It seems to be at least on par with the top-performing models (closed and open).

To try it out: https://chat.lmsys.org/ -> then click on the Direct Chat tab and select gpt2-chatbot.

Place your bet: what do you think it is?
fdaudens posted an update about 17 hours ago
Should media organizations strike deals with big tech companies? Here are two colliding news stories about licensing:

1. The Financial Times has struck a deal with OpenAI to license its material both for training and for queries on ChatGPT. It is the fifth such deal, following similar agreements with the Associated Press, Axel Springer, Le Monde and Prisa Media. "Financial terms were not disclosed."

"Apart from the benefits to the FT, there are broader implications for the industry. Itโ€™s right, of course, that AI platforms pay publishers for the use of their material. OpenAI understands the importance of transparency, attribution, and compensation โ€“ all essential for us."

2. Meanwhile, French media outlet Mediapart is refusing to take money from Google that it is entitled to under so-called "neighbouring rights" for the display of its news content online.

Why? Due to issues with disclosing financial terms: "The confidentiality clauses imposed by Google today prevent us from publicizing to our readers not only the total amount paid, but also the amount Mediapart is entitled to receive."

"In our view, financial dependence on platforms is incompatible with our public service mission, which is to make the powerful face up to their responsibilities. It also seems extremely dangerous economically."

Two positions at opposite ends of the spectrum.

- The Financial Times and OpenAI strike content licensing deal
https://www.ft.com/content/33328743-ba3b-470f-a2e3-f41c3a366613

- Droits voisins : Mediapart lance la bataille de la transparence contre Google (in French) https
victor posted an update about 19 hours ago
Am I the only one who thinks Command R+ is a better daily assistant than ChatGPT-4? (and it's not even close :D)
DavidVivancos posted an update about 20 hours ago
#ICLR 2024 is almost here 🔥🔥🔥 Counting the days until I'm back in the beautiful city of Vienna, participating in The Twelfth International Conference on Learning Representations. Hope to see many of the Hugging Face community there!

I would like to contribute 🎁 by releasing the second Knowledge Vault, with 100 lectures visualized from the last 10 years of ICLR (2014 to 2023), including knowledge graphs for all the invited lectures and some extras, with almost 3,000 topics represented. (Of course, using several AI tools, including Llama 3.)

You can explore it here:
🌐 https://theendofknowledge.com/Vaults/2/ICLR2014-2023.html

And you can learn more about the Vaults here:
๐Ÿ“https://www.linkedin.com/pulse/knowledge-vaults-david-vivancos-lbjef/

Hope you like the Knowledge Vault!
victor posted an update about 21 hours ago