Part 4: Containerizing AgentOS — Build, Customize, and Deploy Your Self‑Learning Agent

Draft version

"Make it reproducible, or it never happened."

Prerequisites:

Introduction
Why Docker Matters for AgentOS
Prerequisites
Project Layout: The Files You'll Own
Step 1 — Define Your Agent(s) in agents.yaml
Step 2 — Write the Dockerfile
Step 3 — Build the Image Locally
Step 4 — Prepare Your Data Directory
Step 5 — Run Your Agent Container
Step 6 — Feed Tasks and Watch Your Agent Learn
Model‑Agnostic Design
Customization Patterns
Observing Your Agent's Growth
Next Steps
Complete Examples
Closing Thoughts

Introduction

In Parts 0–3 we built a complete blueprint for an agent that earns its skills, prunes its memory, and records its own history — all through plain files and git commits. But a blueprint is only half the story. To run an agent reliably, across machines, without hidden dependencies or "works on my laptop" surprises, we need a container.

This article, Part 4, turns the blueprint into a concrete, runnable system. By the end you'll have:

A Dockerfile that packages the AgentOS runtime — model‑agnostic and vendor‑neutral.
A declarative way to define your own agents (personas, domains, goals, LLM provider).
A local image build workflow you control — no external registry required.
A single command to launch an agent that learns from real tasks, with proven end‑to‑end flow.
Three complete, diverse examples: a code reviewer, a legal‑document analyst, and a creative writing coach.

No vendor lock‑in. No assumptions about which LLM you use. Just files, Python, git, and a Dockerfile.

Why Docker Matters for AgentOS

Concern	Without Docker	With Docker
Reproducibility	"But it worked in my venv…"	Image built from a locked‑in definition
Isolation	Agents share host filesystem, tools, secrets	Each container has its own filesystem and network
Portability	pip, Python version, OS quirks	`docker run` anywhere
Observability	Logs in disparate places	`docker logs`, mounted volumes
LLM provider swap	Rewrite integration code per machine	Provider config in environment variables — image unchanged

AgentOS already treats agent state as a filesystem contract. Docker adds an execution contract — the runtime environment is as deterministic as the state layout.

Prerequisites

Docker installed (≥ 20.10 recommended). Verify with docker --version.
An LLM API key — any provider. AgentOS works with OpenAI, Anthropic, local models via Ollama, or any HTTP‑accessible LLM.
git installed on your host (used to inspect agent history from outside the container, optional but useful).
The source files listed below.

Project Layout: The Files You'll Own

Create a project directory. You'll place six files inside:

my-agent-project/
├── Dockerfile
├── agents.yaml           # ← you define your agents here
├── agent.py              # model‑agnostic agent runtime
├── bootstrap.py          # filesystem initializer (idempotent)
├── main.py               # container entrypoint
└── requirements.txt      # httpx, pyyaml

The four Python files implement the entire runtime. Their full source was shown in the previous section — here's a quick summary of what each does:

File	Purpose
`requirements.txt`	Only two dependencies: `httpx>=0.28.1` and `pyyaml>=6.0.3`. No vendor SDKs.
`bootstrap.py`	Creates the standard file tree (`persona.md`, `constraints.md`, `skills.md`, `goals.md`, `rewards.md`, `reflections.md`, `queue.md`, `system_prompt.md`) and initializes a git repo. Idempotent — safe to call on every start.
`agent.py`	Reads tasks from `queue.md`, builds prompts from agent state files, calls any OpenAI‑compatible endpoint via `httpx`, parses structured responses, updates skills/goals/rewards, commits to git.
`main.py`	Entrypoint. Reads `agents.yaml`, bootstraps each agent, then loops forever checking each agent's queue for new tasks.

Step 1 — Define Your Agent(s) in `agents.yaml`

AgentOS discovers agents through a single YAML file mounted into the container. Every agent you want to run gets an entry here.

agents.yaml:

agents:
  - name: orion
    persona: "Backend engineer specializing in Python, PostgreSQL, and API design. Terse, technical, shows its work. Prioritizes correctness over cleverness."
    domain: "Backend engineering"
    tone: "Terse, technical, no preamble."
    hints:
      - "Tasks often involve parsing structured API data with pagination."
      - "Common failure: missing auth context or rate limiting."
      - "Quick win: reusable retry wrapper with exponential backoff."
    goals:
      - "Handle streaming SSE endpoints robustly"
      - "Build a query‑parameterisation skill for all SQL tasks"

What each field does:

Field	Purpose	Example
`name`	Unique agent identifier (used for its subdirectory)	`orion`
`persona`	Identity and mandate, written into `persona.md`	`Backend engineer…`
`domain`	Area of expertise	`Backend engineering`
`tone`	Communication style	`Terse, technical`
`hints`	Seed pointers — not skills, just areas to explore	`Reusable retry wrapper…`
`goals`	Optional; seed goals. Real goals emerge from failures.	`Handle streaming SSE endpoints`

Provider configuration goes in environment variables, not in YAML. This keeps the agent definition portable across providers.

Step 2 — Write the Dockerfile

Dockerfile:

FROM python:3.12-slim

LABEL org.opencontainers.image.title="AgentOS"
LABEL org.opencontainers.image.description="Self-improving LLM agent runtime — model‑agnostic"

# ── System dependencies ─────────────────────────────────────────
RUN apt-get update && \
    apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*

# ── Create non‑root user ─────────────────────────────────────────
RUN useradd --create-home --shell /bin/bash agentos

# ── Application directory ────────────────────────────────────────
WORKDIR /app

# ── Python dependencies ──────────────────────────────────────────
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# ── Copy runtime files ───────────────────────────────────────────
COPY main.py bootstrap.py agent.py ./

# ── Data volume mount point ──────────────────────────────────────
RUN mkdir -p /data && chown -R agentos:agentos /data /app
VOLUME /data

# ── Runtime configuration ────────────────────────────────────────
USER agentos
ENV DATA_DIR=/data
ENV POLL_INTERVAL=5

# LLM defaults (override at runtime) — OpenAI‑compatible
ENV LLM_BASE_URL=https://api.openai.com/v1
ENV LLM_MODEL=gpt-4o
ENV LLM_MAX_TOKENS=2048
ENV LLM_TIMEOUT=60

ENTRYPOINT ["python", "-u", "main.py"]

Step 3 — Build the Image Locally

You build the image yourself — no remote registry needed. This keeps you in full control.

# From your project directory (where the Dockerfile lives)
cd my-agent-project
docker build -t agentos:latest .

First build takes 30–60 seconds. Subsequent builds are fast — Docker caches layers unless you change the Python source files.

Verify the image exists:

docker images agentos

Output:

REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
agentos      latest    abc123def456   2 minutes ago   180MB

Step 4 — Prepare Your Data Directory

AgentOS stores all persistent state in a host directory mounted at /data. Create it before running:

mkdir -p my-agent-data
cp agents.yaml my-agent-data/

The directory must contain agents.yaml at its root — the runtime reads this on startup.

What goes where:

my-agent-data/               ← mounted to /data in container
├── agents.yaml              ← agent definitions (required)
└── agents/                  ← created automatically by bootstrap
    └── orion/               ← one subdirectory per agent
        ├── persona.md
        ├── constraints.md
        ├── skills.md
        ├── goals.md
        ├── rewards.md
        ├── reflections.md
        ├── queue.md
        ├── system_prompt.md
        ├── skills/
        ├── iac/
        └── .git/            ← agent's git history

The agent subdirectories are created automatically on first run — you only need to provide agents.yaml.

Step 5 — Run Your Agent Container

Supply your LLM API key and mount the data directory. The provider is configured via environment variables.

With OpenAI:

docker run -d \
  --name agentos-orion \
  -e LLM_API_KEY="sk-..." \
  -e LLM_BASE_URL="https://api.openai.com/v1" \
  -e LLM_MODEL="gpt-4o" \
  -v $(pwd)/my-agent-data:/data \
  agentos:latest

With Anthropic (via compatible endpoint):

docker run -d \
  --name agentos-orion \
  -e LLM_API_KEY="sk-ant-..." \
  -e LLM_BASE_URL="https://api.anthropic.com/v1" \
  -e LLM_MODEL="claude-sonnet-4-20250514" \
  -v $(pwd)/my-agent-data:/data \
  agentos:latest

With Ollama (local, no cloud costs):

docker run -d \
  --name agentos-local \
  -e LLM_API_KEY="ollama" \
  -e LLM_BASE_URL="http://host.docker.internal:11434/v1" \
  -e LLM_MODEL="llama3.1:8b" \
  -e LLM_MAX_TOKENS=4096 \
  -v $(pwd)/my-agent-data:/data \
  agentos:latest

With Groq:

docker run -d \
  --name agentos-groq \
  -e LLM_API_KEY="gsk_..." \
  -e LLM_BASE_URL="https://api.groq.com/openai/v1" \
  -e LLM_MODEL="llama-3.1-70b-versatile" \
  -v $(pwd)/my-agent-data:/data \
  agentos:latest

Verify it started:

docker ps --filter name=agentos
docker logs agentos-orion

You should see:

2026-04-29T10:00:00 INFO [bootstrap] Initialised git repo at /data/agents/orion
2026-04-29T10:00:00 INFO [agentos] Agent 'orion' ready at /data/agents/orion
2026-04-29T10:00:00 INFO [agentos] AgentOS running with 1 agent(s). Polling every 5s.

Step 6 — Feed Tasks and Watch Your Agent Learn

Tasks arrive by writing a task block into the agent's queue.md file. The agent polls every few seconds, picks up the first task, processes it, and removes it from the queue.

Enqueue a task

Write a task block into the queue file in your mounted data directory:

cat >> my-agent-data/agents/orion/queue.md << 'EOF'

## first-task
task: List your active constraints and explain how the reward system works.
priority: normal
created: 2026-04-29T10:05:00Z
EOF

Important: The leading blank line before ## first-task ensures the block is properly separated from any existing content.

Task block format

Each task block in queue.md must follow this structure:

## <unique-task-id>
task: <description of what the agent should do>
priority: normal|high|low
created: <ISO 8601 timestamp>

The ## header marks the start of a task block. The runtime splits the file on ## headers and processes the first complete block it finds. After processing, it removes the block from the file.

Watch the agent process the task

docker logs -f agentos-orion

You'll see the agent spring into action:

2026-04-29T10:05:05 INFO [agent] [orion] Processing task: first-task
2026-04-29T10:05:12 INFO [agent] [orion] Response:
The active constraints are: skills.md capped at 20 entries,
goals.md capped at 5 active goals, rewards.md keeps last 30 entries,
and reflections.md keeps last 15 entries. The reward system uses
+1 for successful reusable outcomes, 0 for partial, -1 for failures.
Skills are only earned from +1 outcomes...

Verify the agent updated its state

After processing, inspect the files:

# Check the reward was recorded
cat my-agent-data/agents/orion/rewards.md

# Check if a skill was earned (if the task was a +1)
cat my-agent-data/agents/orion/skills.md

# View the agent's git history
git -C my-agent-data/agents/orion log --oneline

Example git log:

* a3f9c12 orion(task): first-task
* 8b2e01a init: agent filesystem

Feed a series of tasks

The agent grows through repeated task cycles. Queue several tasks to watch it build skills:

cat >> my-agent-data/agents/orion/queue.md << 'EOF'

## paginated-api
task: Write a Python function that fetches all pages from a paginated REST API endpoint. The API returns a 'next' field in the response body. Handle rate limits gracefully.
priority: high
created: 2026-04-29T10:10:00Z
EOF

cat >> my-agent-data/agents/orion/queue.md << 'EOF'

## sql-parameterization
task: Write a function that builds a parameterized SQL SELECT query from a table name, a list of column names, and a dict of WHERE conditions. Prevent SQL injection.
priority: high
created: 2026-04-29T10:15:00Z
EOF

After each task completes, check the agent's growing skill library:

cat my-agent-data/agents/orion/skills.md

Over multiple tasks, you'll see skills accumulate (and old ones pruned when the budget fills) — the agent evolving exactly as designed in Part 0.

Stopping and restarting

Stop the container:

docker stop agentos-orion

All state is on the host in my-agent-data/. Restart with the same volume mount:

docker start agentos-orion

The agent resumes polling its queue. Any tasks you wrote to queue.md while stopped will be picked up immediately.

Rebuilding with a new image version

When you modify the Python source files, rebuild and restart:

docker build -t agentos:v2 .
docker stop agentos-orion
docker rm agentos-orion
docker run -d \
  --name agentos-orion \
  -e LLM_API_KEY="sk-..." \
  -v $(pwd)/my-agent-data:/data \
  agentos:v2

Your agent's skills, goals, rewards, and git history survive because they live on the mounted volume — not inside the container.

Model‑Agnostic Design

The runtime makes zero assumptions about which LLM you use. The contract is simple:

Endpoint: Any URL that accepts POST /chat/completions with OpenAI‑compatible JSON.
Auth: Bearer <token> in the Authorization header.
Response: Standard {"choices": [{"message": {"content": "..."}}]} format.

This covers: OpenAI, Anthropic (via adapter), Ollama, Groq, Together AI, Fireworks, DeepInfra, vLLM, llama.cpp, LiteLLM proxy, and any self‑hosted model behind an OpenAI‑compatible wrapper.

To switch providers, change three environment variables. The image and config stay the same.

Customization Patterns

Pattern A — Specialize an Agent for a Domain

agents:
  - name: iris
    persona: "Data analyst specialized in anomaly detection on time‑series data. Communicates with concise summaries."
    domain: "Data analytics"
    tone: "Query‑oriented, evidence‑first"
    hints:
      - "Most tasks involve SQL with outlier detection."
      - "Always validate data completeness before analysis."
    goals:
      - "Automatically validate data completeness before analysis"

Pattern B — Multiple Agents in One Container

agents:
  - name: orion
    persona: ...
    domain: "Backend engineering"
    hints: [...]
  - name: iris
    persona: ...
    domain: "Data analytics"
    hints: [...]
  - name: nova
    persona: ...
    domain: "Technical writing"
    hints: [...]

All share the container but have independent file trees, git histories, and queues. Coordination via files (handoffs, shared segments) is covered in Part 2.

Pattern C — Persistent State Across Upgrades

The /data volume is your agent's long‑term memory. Build a new image locally and restart:

docker build -t agentos:v2 .
docker stop agentos-v1 && docker rm agentos-v1
docker run -d --name agentos-v2 -v $(pwd)/my-agent-data:/data agentos:v2

The agent wakes up with all earned skills, reward logs, and git history intact.

Observing Your Agent's Growth

Everything is plain files and git. No special tools required.

# Current skills
docker exec agentos-orion cat /data/agents/orion/skills.md

# Reward history
docker exec agentos-orion cat /data/agents/orion/rewards.md

# Learning timeline (from host)
git -C my-agent-data/agents/orion log --oneline --graph

# Live activity stream
docker logs -f agentos-orion

# Check queue for pending tasks
cat my-agent-data/agents/orion/queue.md

# Count skills earned
docker exec agentos-orion grep -c "^### " /data/agents/orion/skills.md

# See the last 5 reward outcomes
docker exec agentos-orion grep "^-\s*reward:" /data/agents/orion/rewards.md | tail -5

Next Steps

Add a REST API to push tasks programmatically instead of writing to queue files.
Implement IAC primitives (Part 2) — handoff files, shared segments, queues for multi‑agent collaboration.
Scale with Docker Compose — per‑agent containers, shared network, optional shared git remote.
Harden for production — healthchecks, resource limits, read‑only root filesystem, secret management.
Add a watchdog — script that monitors rewards.md and alerts on sustained -1 streaks.

Complete Examples

Example A — Code Review Agent (with Ollama, fully local)

Goal: An agent that reviews code snippets, identifies bugs and anti‑patterns, and builds a library of recurring issues with detection heuristics.

agents.yaml:

agents:
  - name: revu
    persona: "Senior code reviewer. Knows Python, TypeScript, and Go. Identifies logic errors, security vulnerabilities, and style violations. Cites specific lines. Suggests concrete fixes with before/after diffs. Never approves without evidence."
    domain: "Code review"
    tone: "Direct, evidence‑based, line‑specific"
    hints:
      - "Common issues: missing input validation, race conditions, SQL injection vectors, resource leaks."
      - "Suggestions should include before/after code blocks."
      - "Track recurring anti‑patterns across reviews."
    goals:
      - "Build a library of recurring anti‑patterns with detection heuristics"
      - "Learn framework‑specific security pitfalls (Django, Express, Gin)"

Run (Ollama must be running locally):

mkdir -p revu-data
cp agents.yaml revu-data/

docker run -d \
  --name revu \
  -e LLM_API_KEY="ollama" \
  -e LLM_BASE_URL="http://host.docker.internal:11434/v1" \
  -e LLM_MODEL="llama3.1:8b" \
  -e LLM_MAX_TOKENS=4096 \
  -v $(pwd)/revu-data:/data \
  agentos:latest

Feed a review task:

cat >> revu-data/agents/revu/queue.md << 'EOF'

## review-flask-endpoint
task: |
  Review the following Python code for security issues and bugs.
  
  ```python
  @app.route('/user/<username>')
  def get_user(username):
      query = f"SELECT * FROM users WHERE name = '{username}'"
      result = db.execute(query)
      return jsonify(result.fetchone())

Provide specific line-by-line findings with fixes. priority: high created: 2026-04-29T10:30:00Z EOF


**Watch it learn:**

```bash
docker logs -f revu

Example B — Legal Document Analyst (with OpenAI)

Goal: An agent that reviews contract clauses, flags risky language, and builds a knowledge base of problematic patterns across documents.

agents.yaml:

agents:
  - name: lex
    persona: "Contract analyst specializing in software licensing agreements, NDAs, and service level agreements. Identifies ambiguous language, missing clauses, and unfavorable terms. Cites specific sections. Suggests neutral alternative language. Conservative by default — flags anything uncertain."
    domain: "Legal document review"
    tone: "Precise, conservative, section‑cited"
    hints:
      - "Common risks: broad indemnification, missing termination clauses, vague SLAs."
      - "Always quote the original clause before suggesting alternatives."
      - "Track clause patterns that recur across documents."
    goals:
      - "Build a catalog of high‑risk clause patterns"
      - "Learn to flag jurisdiction‑specific issues (GDPR, CCPA, EU AI Act)"

Run (with OpenAI):

mkdir -p lex-data
cp agents.yaml lex-data/

docker run -d \
  --name lex \
  -e LLM_API_KEY="sk-..." \
  -e LLM_BASE_URL="https://api.openai.com/v1" \
  -e LLM_MODEL="gpt-4o" \
  -v $(pwd)/lex-data:/data \
  agentos:latest

Feed a contract review:

cat >> lex-data/agents/lex/queue.md << 'EOF'

## review-nda-clause
task: |
  Review this NDA clause and identify risks:
  
  "The Receiving Party agrees to hold all Confidential Information
  in strict confidence for a period of three (3) years from the
  date of disclosure, except for information that is independently
  developed, which shall remain confidential indefinitely."
  
  Flag any ambiguities or one-sided terms.
priority: high
created: 2026-04-29T11:00:00Z
EOF

Example C — Creative Writing Coach (with Groq)

Goal: An agent that provides developmental feedback on fiction drafts, learns an author's voice over time, and builds skill modules for specific craft elements (dialogue, pacing, description).

agents.yaml:

agents:
  - name: scribe
    persona: "Developmental editor and writing coach. Specializes in fiction — novels and short stories. Focuses on structure, pacing, character voice, and emotional resonance. Gives specific, actionable feedback with examples. Encouraging but honest. Remembers the author's recurring patterns."
    domain: "Creative writing"
    tone: "Encouraging, specific, craft‑focused"
    hints:
      - "Common feedback areas: show‑don't‑tell, dialogue tags, pacing in action scenes."
      - "Always cite specific passages with line references."
      - "Build a profile of the author's voice and recurring habits."
    goals:
      - "Track an author's voice patterns across submissions"
      - "Build skill modules for: dialogue mechanics, scene pacing, sensory description"

Run (with Groq — fast and cost‑effective):

mkdir -p scribe-data
cp agents.yaml scribe-data/

docker run -d \
  --name scribe \
  -e LLM_API_KEY="gsk_..." \
  -e LLM_BASE_URL="https://api.groq.com/openai/v1" \
  -e LLM_MODEL="llama-3.1-70b-versatile" \
  -v $(pwd)/scribe-data:/data \
  agentos:latest

Feed a manuscript excerpt:

cat >> scribe-data/agents/scribe/queue.md << 'EOF'

## feedback-chapter1
task: |
  Provide developmental feedback on this opening passage:
  
  "The door creaked open. Sarah walked into the room. She was
  scared. The room was dark and cold. She saw a shadow move in
  the corner. She screamed and ran away."
  
  Focus on: show-don't-tell, sensory detail, pacing.
priority: normal
created: 2026-04-29T12:00:00Z
EOF

Closing Thoughts

You've now turned the AgentOS concept into a physical system:

The filesystem stores its mind.
The git repository remembers its history.
The Docker container gives it a home — built locally, under your control.
The agents.yaml defines its purpose.
The environment variables decouple the LLM provider from the agent's identity.

I'm interested in benchmarking before the series continues with advanced patterns

MuhammadYossry/build_agents_container.md

Select an option

No results found

Select an option

No results found

Part 4: Containerizing AgentOS — Build, Customize, and Deploy Your Self‑Learning Agent

Table of Contents

Introduction

Why Docker Matters for AgentOS

Prerequisites

Project Layout: The Files You'll Own

Step 1 — Define Your Agent(s) in `agents.yaml`

Step 2 — Write the Dockerfile

Step 3 — Build the Image Locally

Step 4 — Prepare Your Data Directory

Step 5 — Run Your Agent Container

Step 6 — Feed Tasks and Watch Your Agent Learn

Enqueue a task

Task block format

Watch the agent process the task

Verify the agent updated its state

Feed a series of tasks

Stopping and restarting

Rebuilding with a new image version

Model‑Agnostic Design

Customization Patterns

Pattern A — Specialize an Agent for a Domain

Pattern B — Multiple Agents in One Container

Pattern C — Persistent State Across Upgrades

Observing Your Agent's Growth

Next Steps

Complete Examples

Example A — Code Review Agent (with Ollama, fully local)

Example B — Legal Document Analyst (with OpenAI)

Example C — Creative Writing Coach (with Groq)

Closing Thoughts

MuhammadYossry commented Apr 29, 2026 •

edited

Loading

Uh oh!

MuhammadYossry/build_agents_container.md

Part 4: Containerizing AgentOS — Build, Customize, and Deploy Your Self‑Learning Agent

Table of Contents

Introduction

Why Docker Matters for AgentOS

Prerequisites

Project Layout: The Files You'll Own

Step 1 — Define Your Agent(s) in agents.yaml

Step 2 — Write the Dockerfile

Step 3 — Build the Image Locally

Step 4 — Prepare Your Data Directory

Step 5 — Run Your Agent Container

Step 6 — Feed Tasks and Watch Your Agent Learn

Enqueue a task

Task block format

Watch the agent process the task

Verify the agent updated its state

Feed a series of tasks

Stopping and restarting

Rebuilding with a new image version

Model‑Agnostic Design

Customization Patterns

Pattern A — Specialize an Agent for a Domain

Pattern B — Multiple Agents in One Container

Pattern C — Persistent State Across Upgrades

Observing Your Agent's Growth

Next Steps

Complete Examples

Example A — Code Review Agent (with Ollama, fully local)

Example B — Legal Document Analyst (with OpenAI)

Example C — Creative Writing Coach (with Groq)

Closing Thoughts

MuhammadYossry commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Solving the Tensions

Reward Honesty – The Audit Hook

IAC Backpressure – Throughput Classes & TTL

Skill Decay – last_success & Recency Pruning

Uh oh!

Step 1 — Define Your Agent(s) in `agents.yaml`

MuhammadYossry commented Apr 29, 2026 •

edited

Loading

Skill Decay – `last_success` & Recency Pruning