This gist shows how to get started with Open WebUI and Ollama using Docker Compose, and how to interact with the models hosted by Ollama through both the web UI and the API.
Everything here runs on CPU only.
We define two services in our compose definition:
- Ollama: runs ollama inside a container and persists the downloaded models to disk.
- Open WebUI: points at the ollama service and allows cross-origin requests from any origin.
```yaml
services:
  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    restart: unless-stopped
    tty: true
    pull_policy: always
    volumes:
      - ollama:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - 3000:8080
    environment: # https://docs.openwebui.com/getting-started/env-configuration/
      - CORS_ALLOW_ORIGIN=*
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_API_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=
    depends_on:
      - ollama
    volumes:
      - open-webui:/app/backend/data
    extra_hosts:
      - host.docker.internal:host-gateway

volumes:
  ollama: {}
  open-webui: {}
```
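If you want to sanity-check the file before starting anything, `docker compose config` renders the fully resolved configuration and fails on syntax errors:

```bash
docker compose config
```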
From the directory containing docker-compose.yaml, bring the stack up:

```bash
docker compose up -d
```
The initial startup will take some time; you can follow the logs with:

```bash
docker compose logs -f
```
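You can also confirm that both containers are running:

```bash
docker compose ps
```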
Once you see `Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)` in the logs, you can access Open WebUI on http://localhost:3000.
You will be dropped into the main screen, but there are no models configured yet, so we will do that now.
From your profile, select the admin settings, then "Settings" and then "Connections".
You should see your "Ollama" API connection; click the manage icon on the right.
Now we would like to download the following models:
- llama3.2
- gemma:2b
- mistral:7b

First, download llama3.2 by typing its name into the field and selecting the download button. This will take some time; after it finishes, download gemma:2b and mistral:7b as well.
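If you prefer the terminal, the same models can be pulled with the ollama CLI inside the container; both the web UI and the CLI store models in the same ollama volume:

```bash
docker exec -it ollama ollama pull llama3.2
docker exec -it ollama ollama pull gemma:2b
docker exec -it ollama ollama pull mistral:7b
```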
You can view a list of available models and their descriptions in the Ollama library: https://ollama.com/library
Once all models have been downloaded, close that dialog and move to "Settings" -> "Models", where you should see your models.
Select a model at the top of the chat screen, then ask a question to test that everything is working.
To create an API key, go to the user account settings: select the user at the bottom left, then "Settings" and then "Account".
Create an API key and copy the value. For this demonstration I will set it as an environment variable in my terminal:

```bash
export OWU_APIKEY=sk-a1f7fxoxoxoxoxoxoxoxoxoad72
```
Then we can consult the Open WebUI documentation for the API endpoints: https://docs.openwebui.com/getting-started/api-endpoints
A quick test is the "get models" endpoint, which lists all the models we pulled:

```bash
curl -H "Authorization: Bearer $OWU_APIKEY" http://localhost:3000/api/models
```
And a filtered response will look like this:

```json
{
  "data": [
    {
      "id": "mistral:7b",
      "name": "mistral:7b",
      "object": "model",
      "created": 1739805581,
      "owned_by": "ollama",
      "ollama": {
        "name": "mistral:7b",
        "model": "mistral:7b",
        "modified_at": "2025-02-17T15:07:12.854653452Z",
        "size": 4113301824,
        "digest": "f974a74358d62a017b37c6f424fcdf2744ca02926c4f952513ddf474b2fa5091",
        "details": {
          "parent_model": "",
          "format": "gguf",
          "family": "llama",
          "families": [
            "llama"
          ],
          "parameter_size": "7.2B",
          "quantization_level": "Q4_0"
        },
        "urls": [
          0
        ]
      },
      "actions": []
    },
    {
      "id": "gemma:2b",
      "name": "gemma:2b",
      "object": "model",
      "...": ""
    },
    {
      "id": "llama3.2:latest",
      "name": "llama3.2:latest",
      "object": "model",
      "...": ""
    },
    {
      "id": "arena-model",
      "name": "Arena Model",
      "...": ""
    }
  ]
}
```
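To pull out just the model identifiers, you can pipe the response through jq (assuming you have it installed):

```bash
curl -s -H "Authorization: Bearer $OWU_APIKEY" http://localhost:3000/api/models | jq -r '.data[].id'
```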
Then we can also do a POST request to /api/chat/completions, the chat completion endpoint, against one of the downloaded models:

```bash
curl -s -XPOST \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OWU_APIKEY" \
  http://localhost:3000/api/chat/completions \
  -d '
  {
    "model": "gemma:2b",
    "messages": [
      {"role": "user", "content": "what is the capital of australia?"}
    ]
  }'
```
And the response:

```json
{
  "id": "gemma:2b-61b3aed1-77cf-4e62-86d4-b058bf9af5fd",
  "created": 1739805878,
  "model": "gemma:2b",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop",
      "message": {
        "content": "The capital of Australia is Canberra. It is the political, economic, and administrative center of Australia.",
        "role": "assistant"
      }
    }
  ],
  "object": "chat.completion",
  "usage": {
    "response_token/s": 10,
    "prompt_token/s": 39.94,
    "total_duration": 4722390078,
    "load_duration": 1892429681,
    "prompt_eval_count": 29,
    "prompt_eval_duration": 726000000,
    "eval_count": 21,
    "eval_duration": 2101000000,
    "approximate_total": "0h0m4s"
  }
}
```
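As before, jq can extract just the assistant's answer from the completion response:

```bash
curl -s -XPOST \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OWU_APIKEY" \
  http://localhost:3000/api/chat/completions \
  -d '{"model": "gemma:2b", "messages": [{"role": "user", "content": "what is the capital of australia?"}]}' \
  | jq -r '.choices[0].message.content'
```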
The same request can be made from Python with the requests library:

```python
>>> import requests
>>> headers = {"content-type": "application/json", "Authorization": "Bearer sk-a1f7fdxxxxxxxxxxxxxxxad72"}
>>> request_body = {"model": "gemma:2b", "messages": [{"role": "user", "content": "what is the capital of australia?"}]}
>>> response = requests.post("http://localhost:3000/api/chat/completions", headers=headers, json=request_body)
>>> response.status_code
200
>>> response.json()
{'id': 'gemma:2b-8aa600a9-4de5-423d-b906-25ce41693324', 'created': 1739823542, 'model': 'gemma:2b', 'choices': [{'index': 0, 'logprobs': None, 'finish_reason': 'stop', 'message': {'content': 'The capital of Australia is Canberra. It is a city in the Australian Capital Territory, which is a self-governing territory within the Commonwealth of Australia.', 'role': 'assistant'}}], 'object': 'chat.completion', 'usage': {'response_token/s': 9.62, 'prompt_token/s': 40.45, 'total_duration': 5660808687, 'load_duration': 1615716699, 'prompt_eval_count': 29, 'prompt_eval_duration': 717000000, 'eval_count': 32, 'eval_duration': 3325000000, 'approximate_total': '0h0m5s'}}
>>> response.json().get('choices')[0].get('message').get('content')
'The capital of Australia is Canberra. It is a city in the Australian Capital Territory, which is a self-governing territory within the Commonwealth of Australia.'
```
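The endpoint follows the OpenAI chat completions shape, so it should also accept a stream flag and return server-sent events. Here is a minimal reading sketch, assuming the usual `data: {json}` framing terminated by `data: [DONE]` (that framing is an assumption on my part, not something confirmed by the docs above):

```python
import json
import requests

headers = {"Authorization": "Bearer sk-a1f7fdxxxxxxxxxxxxxxxad72"}
body = {
    "model": "gemma:2b",
    "stream": True,  # assumption: the endpoint honors the OpenAI-style stream flag
    "messages": [{"role": "user", "content": "what is the capital of australia?"}],
}

with requests.post("http://localhost:3000/api/chat/completions",
                   headers=headers, json=body, stream=True) as response:
    for line in response.iter_lines():
        # SSE events arrive as lines prefixed with "data: "
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        # in the OpenAI streaming format, partial text lives in choices[0].delta
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
print()
```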
We can also run ollama CLI commands inside the container:

```bash
docker exec -it ollama sh -c 'ollama list'
```

Which will show:

```
NAME               ID              SIZE      MODIFIED
mistral:7b         f974a74358d6    4.1 GB    30 minutes ago
gemma:2b           b50d6c999e59    1.7 GB    43 minutes ago
llama3.2:latest    a80c4f17acd5    2.0 GB    47 minutes ago
```
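Other subcommands work the same way, for example inspecting a model or checking what is currently loaded in memory:

```bash
docker exec -it ollama sh -c 'ollama show llama3.2'   # model details: family, parameters, quantization
docker exec -it ollama sh -c 'ollama ps'              # models currently loaded in memory
```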