@EvilFreelancer
Created June 22, 2025 18:49
Docker Compose with Ollama
x-shared-logs: &shared-logs
  logging:
    driver: "json-file"
    options:
      max-size: "10k"

services:
  ollama:
    image: ollama/ollama:0.9.2
    restart: unless-stopped
    volumes:
      - ./ollama_data:/root
    environment:
      OLLAMA_ORIGINS: "*"
      OLLAMA_KEEP_ALIVE: 60m
      OLLAMA_FLASH_ATTENTION: 1
      OLLAMA_MAX_LOADED_MODELS: 1
      OLLAMA_MAX_QUEUE: 1
      OLLAMA_NUM_PARALLEL: 10
      OLLAMA_GPU_OVERHEAD: 0
      OLLAMA_SCHED_SPREAD: 1
      OLLAMA_KV_CACHE_TYPE: q4_0
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1']
              capabilities: [gpu]
    <<: *shared-logs
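Once the stack is up (`docker compose up -d`), the Ollama HTTP API is reachable on the host port mapped above. Below is a minimal Python sketch of a non-streaming call to the `/api/generate` endpoint, using only the standard library; the model name `llama3` is a placeholder and assumes a model has already been pulled into the container (e.g. via `docker compose exec ollama ollama pull llama3`).

```python
import json
import urllib.request

# Host/port from the compose file's "11434:11434" port mapping.
OLLAMA_URL = "http://localhost:11434"

def build_generate_payload(model: str, prompt: str, stream: bool = False) -> bytes:
    """Serialize a request body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode()

def generate(model: str, prompt: str) -> str:
    """Send a non-streaming generate request and return the response text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example usage (requires the container to be running with a pulled model):
#   generate("llama3", "Why is the sky blue?")
```

Note that `OLLAMA_NUM_PARALLEL: 10` allows up to ten of these requests to run concurrently against the single loaded model, while `OLLAMA_MAX_QUEUE: 1` rejects further requests rather than queueing them.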