LiteLLM for Ollama
  1. Running It

Install dependencies:

pip install -r requirements.txt

Run LiteLLM:

python run_litellm.py

or

litellm --config litellm_config.yaml --host 0.0.0.0 --port 4000 proxy

  2. Accessing from Other Devices on the Network

Make sure:

  • Your firewall allows inbound connections on port 4000.
  • Clients access LiteLLM via your computer's local IP address, e.g., http://192.168.1.123:4000/v1/chat/completions (see the sketch below for one way to look this address up).
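If you are not sure which LAN address to share, one common Python trick is to open a UDP socket towards an external address and read back the local address the OS selects; connect() on a UDP socket sends no traffic. This is only an illustrative sketch, not part of the gist:

import socket

# Ask the OS which local interface/address it would use to reach an outside host.
# No packets are sent; connect() on a UDP socket only resolves the route.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    s.connect(("8.8.8.8", 80))
    local_ip = s.getsockname()[0]

print(f"Point clients at: http://{local_ip}:4000/v1/chat/completions")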

  3. Test API Call (Curl Example)

curl http://192.168.1.123:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-litellm-key" \
  -d '{ "model": "ollama-deepseek", "messages": [{"role": "user", "content": "Hello"}] }'

Make sure to pass a valid API key in the Bearer header, and use a model name that matches a model_name entry in the config below.
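Because LiteLLM exposes an OpenAI-compatible API, you can also call it from Python with the official openai client (v1+). This is a minimal sketch using the example host, key, and model name from this gist; install the client with pip install openai if you don't have it:

from openai import OpenAI

# Point the OpenAI client at the LiteLLM proxy instead of api.openai.com
client = OpenAI(
    base_url="http://192.168.1.123:4000/v1",
    api_key="my-secret-litellm-key",
)

response = client.chat.completions.create(
    model="ollama-deepseek",  # the model_name defined in litellm_config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)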

litellm_config.yaml:

model_list:
  - model_name: ollama-deepseek # the name to address when calling LiteLLM
    litellm_params:
      model: ollama/deepseek # must match the model name running on the Ollama server!
      api_base: http://localhost:11434
      api_key: "ollama" # Ollama doesn't need this, but LiteLLM expects something
      adapter: ollama

litellm_settings:

# This sets API key auth for LiteLLM's proxy
general_settings:
  api_keys: # List of valid API keys clients must use
    - "my-secret-litellm-key"
requirements.txt:

litellm[proxy]
uvicorn
run_litellm.py:

import subprocess

# Start the LiteLLM proxy with the specified config and listen on all interfaces
subprocess.run([
    "litellm",
    "--config", "litellm_config.yaml",
    "--host", "0.0.0.0",
    "--port", "4000",
    "proxy",
])
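If you want a script that launches the proxy in the background and waits until it is actually answering before you send traffic, here is a minimal sketch. It assumes the same config, port, and API key as above, and that the proxy serves the OpenAI-style /v1/models endpoint:

import subprocess
import time
import urllib.error
import urllib.request

# Launch the proxy in the background instead of blocking like subprocess.run
proc = subprocess.Popen([
    "litellm",
    "--config", "litellm_config.yaml",
    "--host", "0.0.0.0",
    "--port", "4000",
    "proxy",
])

# Poll until the proxy responds; any HTTP answer (even an error status) means it is up
req = urllib.request.Request(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer my-secret-litellm-key"},
)
for _ in range(30):
    try:
        with urllib.request.urlopen(req, timeout=2):
            pass
        print("LiteLLM proxy is up")
        break
    except urllib.error.HTTPError:
        print("LiteLLM proxy is responding (check the API key if you got an auth error)")
        break
    except urllib.error.URLError:
        time.sleep(1)
else:
    print("LiteLLM proxy did not come up in time")
    proc.terminate()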