Getting Started with OpenWeb-UI and Ollama using Docker Compose.

This gist shows how to get started with OpenWeb-UI and Ollama using Docker Compose, and how to interact with the models hosted by Ollama through both the web UI and the API.

This setup runs on CPU only.

Docker Compose

We define two services in our compose definition:

  1. Ollama: runs ollama inside a container and persists the downloaded models to a named volume.
  2. OpenWebUI: points to the ollama service and allows all CORS origins (CORS_ALLOW_ORIGIN=*).
services:
  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    restart: unless-stopped
    tty: true
    pull_policy: always
    volumes:
      - ollama:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - 3000:8080
    environment: # https://docs.openwebui.com/getting-started/env-configuration/
      - CORS_ALLOW_ORIGIN=*
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_API_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=
    depends_on:
      - ollama
    volumes:
      - open-webui:/app/backend/data
    extra_hosts:
      - host.docker.internal:host-gateway

volumes:
  ollama: {}
  open-webui: {}

From the directory containing the docker-compose.yaml file, run:

docker compose up -d

The initial startup will take some time; you can follow the logs with:

docker compose logs -f

Once you see Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit) in the logs, you can access OpenWeb-UI on port 3000.

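If you would rather script the wait than watch the logs, here is a minimal sketch that polls the UI until it answers on the mapped port (assuming port 3000 from the compose file above):

import time
import requests

url = "http://localhost:3000"

# Poll OpenWeb-UI for up to ~2 minutes (60 attempts, 2 seconds apart)
for attempt in range(60):
    try:
        if requests.get(url, timeout=2).status_code == 200:
            print("OpenWeb-UI is up")
            break
    except requests.exceptions.ConnectionError:
        pass
    time.sleep(2)
else:
    print("OpenWeb-UI did not come up in time")
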
You will be dropped into the main screen.

There are no models configured yet, so we will do that next.

Download Models from Ollama

From your profile, select "Admin Settings".

Then select "Settings" and then "Connections".

You should see your "Ollama" API connection. Click the manage icon on the right of it.

Now we would like to download the following models:

  • llama3.2
  • gemma:2b
  • mistral:7b

First, type in llama3.2 and download it.

This will take some time, but after it finishes, download gemma:2b and mistral:7b as well.
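
If you would rather script the downloads than click through the UI, here is a minimal sketch (assuming the container is named ollama, as in the compose file above) that pulls the same models with the Ollama CLI inside the container:

import subprocess

# The models used in this gist
models = ["llama3.2", "gemma:2b", "mistral:7b"]

for model in models:
    # Equivalent to: docker exec ollama ollama pull <model>
    subprocess.run(["docker", "exec", "ollama", "ollama", "pull", model], check=True)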

You can view a list of available models and their descriptions in the Ollama model library at https://ollama.com/library.

Once all models have been downloaded, you can close that dialog and move to "Settings" -> "Models", where you should see your models.

Test out OpenWeb-UI

Select your model at the top, then ask a question to test that everything is working.

API Access

To create an API key, we need to go into the user account settings: select the user at the bottom left, then select "Settings" and then "Account".

You can create an API key and then copy its value. For this demonstration I will set it as an environment variable in my terminal:

export OWU_APIKEY=sk-a1f7fxoxoxoxoxoxoxoxoxoad72

Then we can consult the OpenWeb-UI documentation for the available API endpoints: https://docs.openwebui.com/getting-started/api-endpoints

Interact with OpenWeb-UI API

A quick test is to use the "get models" API endpoint to view all the models that we downloaded:

curl -H "Authorization: Bearer $OWU_APIKEY" http://localhost:3000/api/models

And a filtered response will look like this:

{
  "data": [
    {
      "id": "mistral:7b",
      "name": "mistral:7b",
      "object": "model",
      "created": 1739805581,
      "owned_by": "ollama",
      "ollama": {
        "name": "mistral:7b",
        "model": "mistral:7b",
        "modified_at": "2025-02-17T15:07:12.854653452Z",
        "size": 4113301824,
        "digest": "f974a74358d62a017b37c6f424fcdf2744ca02926c4f952513ddf474b2fa5091",
        "details": {
          "parent_model": "",
          "format": "gguf",
          "family": "llama",
          "families": [
            "llama"
          ],
          "parameter_size": "7.2B",
          "quantization_level": "Q4_0"
        },
        "urls": [
          0
        ]
      },
      "actions": []
    },
    {
      "id": "gemma:2b",
      "name": "gemma:2b",
      "object": "model",
      "...": ""
    },
    {
      "id": "llama3.2:latest",
      "name": "llama3.2:latest",
      "object": "model",
      "...": ""
    },
    {
      "id": "arena-model",
      "name": "Arena Model",
      "...": ""
    }
  ]
}
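
If you only need the model IDs out of that response, here is a quick sketch against the same endpoint (it reads the OWU_APIKEY environment variable exported above):

import os
import requests

headers = {"Authorization": f"Bearer {os.environ['OWU_APIKEY']}"}

response = requests.get("http://localhost:3000/api/models", headers=headers)
response.raise_for_status()

# Print just the IDs from the "data" list
for model in response.json().get("data", []):
    print(model["id"])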

Then we can also do a POST request to /api/chat/completions, the chat completion endpoint, specifying one of the downloaded models:

curl -s -XPOST \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OWU_APIKEY" \
  http://localhost:3000/api/chat/completions \
  -d '
  {
    "model": "gemma:2b", 
    "messages": [
      {"role": "user", "content": "what is the capital of australia?"}
    ]
  }'

And the response:

{
  "id": "gemma:2b-61b3aed1-77cf-4e62-86d4-b058bf9af5fd",
  "created": 1739805878,
  "model": "gemma:2b",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop",
      "message": {
        "content": "The capital of Australia is Canberra. It is the political, economic, and administrative center of Australia.",
        "role": "assistant"
      }
    }
  ],
  "object": "chat.completion",
  "usage": {
    "response_token/s": 10,
    "prompt_token/s": 39.94,
    "total_duration": 4722390078,
    "load_duration": 1892429681,
    "prompt_eval_count": 29,
    "prompt_eval_duration": 726000000,
    "eval_count": 21,
    "eval_duration": 2101000000,
    "approximate_total": "0h0m4s"
  }
}

Python Requests with API

The same chat completion request can be made from Python using the requests library:

>>> import requests
>>> headers = {"content-type": "application/json", "Authorization": "Bearer sk-a1f7fdxxxxxxxxxxxxxxxad72"}
>>> request_body = {"model":"gemma:2b", "messages": [{"role": "user", "content": "what is the capital of australia?"}]}
>>> response = requests.post("http://localhost:3000/api/chat/completions", headers=headers, json=request_body)

>>> response.status_code
200
>>> response.json()
{'id': 'gemma:2b-8aa600a9-4de5-423d-b906-25ce41693324', 'created': 1739823542, 'model': 'gemma:2b', 'choices': [{'index': 0, 'logprobs': None, 'finish_reason': 'stop', 'message': {'content': 'The capital of Australia is Canberra. It is a city in the Australian Capital Territory, which is a self-governing territory within the Commonwealth of Australia.', 'role': 'assistant'}}], 'object': 'chat.completion', 'usage': {'response_token/s': 9.62, 'prompt_token/s': 40.45, 'total_duration': 5660808687, 'load_duration': 1615716699, 'prompt_eval_count': 29, 'prompt_eval_duration': 717000000, 'eval_count': 32, 'eval_duration': 3325000000, 'approximate_total': '0h0m5s'}}

>>> response.json().get('choices')[0].get('message').get('content')
'The capital of Australia is Canberra. It is a city in the Australian Capital Territory, which is a self-governing territory within the Commonwealth of Australia.'
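
For longer answers you may want to stream tokens as they arrive instead of waiting for the full response. Here is a sketch of that, assuming the endpoint honours the OpenAI-style "stream": true parameter and returns server-sent events (verify against the OpenWeb-UI API docs linked above before relying on it):

import json
import os
import requests

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['OWU_APIKEY']}",
}
body = {
    "model": "gemma:2b",
    "stream": True,  # assumption: OpenAI-style streaming flag
    "messages": [{"role": "user", "content": "what is the capital of australia?"}],
}

with requests.post("http://localhost:3000/api/chat/completions",
                   headers=headers, json=body, stream=True) as response:
    for raw_line in response.iter_lines():
        line = raw_line.decode()
        if not line.startswith("data: "):
            continue  # skip keep-alives and empty lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
print()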

Ollama Container

We can also run ollama CLI commands inside the container:

docker exec -it ollama sh -c 'ollama list'

Which will show:

NAME               ID              SIZE      MODIFIED
mistral:7b         f974a74358d6    4.1 GB    30 minutes ago
gemma:2b           b50d6c999e59    1.7 GB    43 minutes ago
llama3.2:latest    a80c4f17acd5    2.0 GB    47 minutes ago
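
Other ollama subcommands can be run the same way. For example, a small sketch that prints the details of each downloaded model with ollama show, again through docker exec:

import subprocess

models = ["llama3.2", "gemma:2b", "mistral:7b"]

for model in models:
    print(f"--- {model} ---")
    # Equivalent to: docker exec ollama ollama show <model>
    subprocess.run(["docker", "exec", "ollama", "ollama", "show", model], check=True)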