Using LiteLLM as Backend
Lighteval allows you to use LiteLLM as a backend, enabling you to call all LLM APIs using the OpenAI format. LiteLLM supports various providers including Bedrock, Hugging Face, Vertex AI, Together AI, Azure, OpenAI, Groq, and many others.
Documentation for available APIs and compatible endpoints can be found in the LiteLLM documentation.
Basic Usage
lighteval endpoint litellm \
"provider=openai,model_name=gpt-3.5-turbo" \
"lighteval|gsm8k|0"
Using a Configuration File
LiteLLM allows generation with any OpenAI-compatible endpoint. For example, you can evaluate a model running on a local VLLM server.
To do so, you will need to use a configuration file like this:
model_parameters:
  model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
  base_url: "URL_OF_THE_ENDPOINT_YOU_WANT_TO_USE"
  api_key: "" # Remove or keep empty as needed
  generation_parameters:
    temperature: 0.5
    max_new_tokens: 256
    stop_tokens: [""]
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
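Save the configuration to a YAML file (the name litellm_model.yaml below is just an example) and, assuming your lighteval version accepts a config file path in place of the inline model arguments, launch the evaluation with:

lighteval endpoint litellm \
    litellm_model.yaml \
    "lighteval|gsm8k|0"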
Supported Providers
LiteLLM supports a wide range of LLM providers:
Cloud Providers
All supported cloud providers are listed in the LiteLLM documentation.
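For most cloud providers, LiteLLM reads credentials from the provider's standard environment variables, so you usually only need to export the relevant key before launching lighteval. For example (the variable names below are the common defaults; check the LiteLLM documentation for your provider):

export OPENAI_API_KEY="<your-openai-key>"
export ANTHROPIC_API_KEY="<your-anthropic-key>"
export GROQ_API_KEY="<your-groq-key>"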
Local/On-Premise
- VLLM: Local VLLM servers
- Hugging Face: Local Hugging Face models
- Custom endpoints: Any OpenAI-compatible API
Using with Local Models
VLLM Server
To use with a local VLLM server:
- Start your VLLM server:
vllm serve HuggingFaceH4/zephyr-7b-beta --host 0.0.0.0 --port 8000
- Configure LiteLLM to use the local server:
model_parameters:
  provider: "openai"
  model_name: "HuggingFaceH4/zephyr-7b-beta"
  base_url: "http://localhost:8000/v1"
  api_key: ""
For more detailed error handling and debugging, refer to the LiteLLM documentation.