Using Inference Providers as Backend
Lighteval allows you to use Hugging Face’s Inference Providers to evaluate LLMs on supported providers such as Black Forest Labs, Cerebras, Fireworks AI, Nebius, Together AI, and many more.
Do not forget to set your Hugging Face API key. You can set it using the HF_TOKEN environment variable or by using the huggingface-cli command.
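For example, either of the following works (the token value is a placeholder):

export HF_TOKEN=<your_token>
# or log in interactively, which stores the token for you:
huggingface-cli login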
Basic Usage
lighteval endpoint inference-providers \
"model_name=deepseek-ai/DeepSeek-R1,provider=hf-inference" \
"lighteval|gsm8k|0"
Using a Configuration File
You can use configuration files to define the model and the provider to use.
lighteval endpoint inference-providers \
examples/model_configs/inference_providers.yaml \
"lighteval|gsm8k|0"
With the following configuration file:
model_parameters:
  model_name: "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
  provider: "novita"
  timeout: null
  proxies: null
  parallel_calls_count: 10
  generation_parameters:
    temperature: 0.8
    top_k: 10
    max_new_tokens: 10000
By default, inference requests are billed to your personal account. Optionally, you can charge them to an organization by setting org_to_bill="<your_org_name>" (this requires being a member of that organization).
Supported Providers
Hugging Face Inference Providers supports a wide range of LLM providers; see the Inference Providers documentation for the complete list.
Billing and Costs
Personal Account Billing
By default, all inference requests are billed to your personal Hugging Face account. You can monitor your usage in the Hugging Face billing dashboard.
Organization Billing
To bill requests to an organization:
- Ensure you are a member of the organization
- Add org_to_bill="<organization_name>" to your configuration
- Make sure the organization has sufficient credits
model_parameters:
  model_name: "meta-llama/Llama-2-7b-chat-hf"
  provider: "together"
  org_to_bill: "my-organization"
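Assuming this file is saved as examples/model_configs/org_billing.yaml (a hypothetical path), you run it the same way as any other configuration file:

lighteval endpoint inference-providers \
examples/model_configs/org_billing.yaml \
"lighteval|gsm8k|0"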
For more detailed error handling and provider-specific information, refer to the Hugging Face Inference Providers documentation.