@dabit3
Last active March 27, 2026 21:41
Ramp Built Inspect in Months. Here's How to Ship It in a Weekend.

Ramp built Inspect, an incredible internal background coding agent responsible for ~30% of their merged PRs. Building it required months of infrastructure work: managed sandboxes, agent orchestration, session state, and custom tooling.

Cloud providers like AWS and Cloudflare removed the need to manage physical servers, patch operating systems, and maintain uptime yourself. Instead of managing infrastructure, developers could focus on their application logic while the provider handled the rest.

Devin applies the same principle to coding agents: the sandbox, dev environment, agent runtime, orchestration, integrations, and tooling are all fully managed. You bring the prompt; Devin handles everything behind it.

Here's how to replicate Inspect's functionality on top of the Devin API. You skip the months of infrastructure work and go straight to building the parts unique to your team.

Let's dive right in!

Authentication

Create a service user in your Devin org (Settings > Service Users). You get an API key prefixed with cog_. All examples below load credentials from a .env file:

# .env
DEVIN_API_KEY=cog_your_key_here
DEVIN_ORG_ID=your_org_id

Base URL: https://api.devin.ai/v3
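
The snippets below all repeat the same environment lookup. As a minimal sketch, that setup can be pulled into one helper that fails fast on a bad key. The names load_devin_config and org_url are hypothetical helpers for this article, not part of the Devin API; the cog_ prefix check only reflects the key format described above.

```python
import os

BASE_URL = "https://api.devin.ai/v3"


def load_devin_config(env=None):
    """Return (org_id, headers) from an environment-style mapping.

    Hypothetical helper: raises early instead of sending a request
    with a missing or malformed key.
    """
    env = os.environ if env is None else env
    api_key = env.get("DEVIN_API_KEY", "")
    org_id = env.get("DEVIN_ORG_ID", "")
    if not api_key.startswith("cog_"):
        raise ValueError("DEVIN_API_KEY is missing or does not start with 'cog_'")
    if not org_id:
        raise ValueError("DEVIN_ORG_ID is missing")
    return org_id, {"Authorization": f"Bearer {api_key}"}


def org_url(org_id, path=""):
    """Build an organization-scoped URL such as .../organizations/<org>/sessions."""
    return f"{BASE_URL}/organizations/{org_id}/{path.lstrip('/')}".rstrip("/")
```

Passing a plain dict as env keeps the helper easy to exercise without touching real credentials.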

The Sandbox Problem (Solved)

Ramp's spec spends significant effort on sandbox infrastructure: building images every 30 minutes, snapshotting file systems on Modal, syncing git state, warming sandbox pools, and managing lifecycle. This is the hardest part of their build.

Devin sessions already run in isolated Linux VMs with full dev environments. Each session boots from a saved snapshot with your repos cloned, dependencies installed, and environment configured. The machine has shell, IDE, browser, Docker, and any runtime you need.

You configure this once in Devin's repo setup. After that, every API-created session starts from that snapshot automatically. No image registry, no cron job rebuilding images, no pool management.

Core Loop: Create, Monitor, Act

Start a session

curl -X POST "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/sessions" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Fix the bug described in issue #42 in the backend repo"}'

You can also pass max_acu_limit to cap compute usage per session. This is important for automated pipelines where a runaway session could burn credits.

Response:

{
  "session_id": "abc123",
  "url": "https://app.devin.ai/sessions/devin-abc123",
  "status": "running"
}

Note: The API returns a session_id (e.g., abc123), but the session endpoints expect a devin_id, which is the session ID prefixed with devin- (e.g., devin-abc123). Build the devin_id by prepending devin- to the session_id.
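
A small sketch of the two details above: adding max_acu_limit to the request body and deriving the devin_id from the returned session_id. The function names are hypothetical, and treating max_acu_limit as a positive integer is an assumption; check the API reference for the exact type and units.

```python
def build_session_payload(prompt, max_acu_limit=None):
    """Build the JSON body for session creation.

    Assumption: max_acu_limit is a positive integer; it is omitted
    entirely when not set so the API default applies.
    """
    payload = {"prompt": prompt}
    if max_acu_limit is not None:
        if max_acu_limit <= 0:
            raise ValueError("max_acu_limit must be positive")
        payload["max_acu_limit"] = max_acu_limit
    return payload


def to_devin_id(session_id):
    """Prefix a session_id with 'devin-' unless it already carries the prefix."""
    if session_id.startswith("devin-"):
        return session_id
    return f"devin-{session_id}"
```

Making to_devin_id idempotent means it is safe to call on IDs copied from a session URL as well as on raw API responses.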

Poll for completion

import os, time, requests
from dotenv import load_dotenv

load_dotenv()

API_KEY = os.environ["DEVIN_API_KEY"]
ORG_ID = os.environ["DEVIN_ORG_ID"]
BASE = "https://api.devin.ai/v3"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={"prompt": "Fix the bug described in issue #42"}
).json()

devin_id = f"devin-{session['session_id']}"
print(f"Session created: {session.get('url')}")

while True:
    details = requests.get(
        f"{BASE}/organizations/{ORG_ID}/sessions/{devin_id}",
        headers=HEADERS
    ).json()
    status = details["status"]
    status_detail = details.get("status_detail")  # e.g. "working", "waiting_for_user", "finished"
    print(f"Status: {status} ({status_detail})")
    if status in ("exit", "error", "suspended"):
        break
    time.sleep(10)

print(f"Session ended with status: {status}")

The status_detail field gives you finer-grained state: working, waiting_for_user, waiting_for_approval, or finished while a session is running, and reasons like inactivity or usage_limit_exceeded when suspended.
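
Because status_detail can read finished or waiting_for_user while status is still running, a loop that checks only status may keep polling long after Devin has stopped working. The sketch below treats those details as "settled" and adds a timeout. This is an assumption about how you want to react to those states, not documented API behavior, and poll_until_settled is a hypothetical helper that takes any zero-argument fetch function returning the session details dict.

```python
import time

TERMINAL_STATUSES = {"exit", "error", "suspended"}
# Assumption: in these states the session is no longer making progress by itself.
IDLE_DETAILS = {"finished", "waiting_for_user", "waiting_for_approval"}


def is_settled(details):
    """True when the session has ended or is idle and waiting on a human."""
    return (
        details.get("status") in TERMINAL_STATUSES
        or details.get("status_detail") in IDLE_DETAILS
    )


def poll_until_settled(fetch, interval=10.0, timeout=3600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the session settles; raise TimeoutError otherwise."""
    deadline = clock() + timeout
    while True:
        details = fetch()
        if is_settled(details):
            return details
        if clock() >= deadline:
            raise TimeoutError("session did not settle before the timeout")
        sleep(interval)
```

In practice fetch would wrap the GET request from the polling example; sleep and clock are injectable only so the loop can be tested without waiting.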

Read messages back

messages = requests.get(
    f"{BASE}/organizations/{ORG_ID}/sessions/{devin_id}/messages",
    headers=HEADERS
).json()["items"]

for msg in messages:
    print(f"[{msg['source']}] {msg['message']}")

Send follow-up messages

Ramp queues follow-up prompts sent during execution. You can do the same:

curl -X POST "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/sessions/devin-abc123/messages" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Also add unit tests for the fix"}'

Terminate a session

Kill a runaway session or archive a completed one:

# Terminate
curl -X DELETE "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/sessions/devin-abc123" \
  -H "Authorization: Bearer $DEVIN_API_KEY"

# Archive instead of terminating
curl -X DELETE "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/sessions/devin-abc123?archive=true" \
  -H "Authorization: Bearer $DEVIN_API_KEY"

Session Attribution (Multiplayer)

Ramp calls multiplayer "mission-critical." They want multiple people contributing to a session, with commits attributed to the right author.

Devin's API supports this with create_as_user_id. A service user with the ImpersonateOrgSessions permission can create sessions on behalf of any org member:

session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "prompt": "Implement the settings page redesign",
        "create_as_user_id": "user-id-of-designer"
    }
).json()

The session appears in that user's session list. PRs are attributed to them. Any team member can open the session URL and send messages.

Structured Output (Machine-Readable Results)

Ramp streams tokens in real time to their clients. With Devin, you get structured output: pass a JSON Schema via the structured_output_schema parameter, and Devin validates and updates it as it works. You poll the session to read it.

session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "prompt": "Review PR #249. Check for bugs, security issues, and style violations.",
        "structured_output_schema": {
            "type": "object",
            "properties": {
                "issues": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "file": {"type": "string"},
                            "line": {"type": "integer"},
                            "type": {"type": "string"},
                            "description": {"type": "string"}
                        }
                    }
                },
                "suggestions": {"type": "array", "items": {"type": "string"}},
                "approved": {"type": "boolean"}
            }
        }
    }
).json()

devin_id = f"devin-{session['session_id']}"

# Fetch the latest structured output (repeat this on an interval until the session finishes)
details = requests.get(
    f"{BASE}/organizations/{ORG_ID}/sessions/{devin_id}",
    headers=HEADERS
).json()

review = details.get("structured_output")  # Your JSON, validated and updated by Devin as it works

This is how you pipe Devin's work into your own UI, Slack bot, or dashboard.
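
As an illustration, here is one way to turn the review object from the schema above into a short message for a bot or dashboard. summarize_review is a hypothetical helper; it assumes only the field names defined in that schema and tolerates output that is missing or still partial.

```python
def summarize_review(review):
    """Render the structured_output dict as a short plain-text summary."""
    if not review:
        return "Review not available yet."
    issues = review.get("issues") or []
    suggestions = review.get("suggestions") or []
    approved = review.get("approved")
    if approved is None:
        verdict = "pending"  # Devin has not filled in the verdict yet
    else:
        verdict = "approved" if approved else "changes requested"
    lines = [
        f"Verdict: {verdict} "
        f"({len(issues)} issue(s), {len(suggestions)} suggestion(s))"
    ]
    for issue in issues:
        location = f"{issue.get('file', '?')}:{issue.get('line', '?')}"
        lines.append(
            f"- [{issue.get('type', 'issue')}] {location} "
            f"{issue.get('description', '')}".rstrip()
        )
    return "\n".join(lines)
```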

Spawning Child Sessions (Parallel Agents)

Ramp's spec recommends a tool that lets the agent spawn sub-sessions for research or splitting a large task into smaller PRs. Devin has this built in with Managed Devins. You don't need to write orchestration code; just ask Devin in natural language to parallelize the work.

Devin acts as a coordinator: it scopes the work, spins up child sessions (each running in its own isolated VM), monitors progress, resolves conflicts, and compiles results.

For example, to parallelize a migration across the codebase:

Analyze our codebase for all files using the legacy REST client.
Group them into independent work packages that won't conflict,
then start a parallel Devin session for each package to migrate
to the new GraphQL client. Use the "REST to GraphQL Migration"
playbook for each session.

Or to run the same task across multiple modules:

Run the test coverage report, find the 8 modules below 50%
coverage, and start a parallel Devin session for each module
using our test-writing playbook. Open a separate PR for each.

Devin analyzes your request and proposes the sessions for your approval before launching them. The coordinator can also message child sessions, monitor their ACU consumption, and terminate stuck sessions.

Knowledge and Playbooks (Encoding How You Ship)

Ramp uses "skills that encode how we ship at Ramp." Devin has three mechanisms for this:

Knowledge: persistent context

Tips, standards, and instructions that Devin recalls automatically based on trigger descriptions.

curl -X POST "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/knowledge/notes" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Test requirements",
    "trigger": "When opening a pull request or fixing a bug",
    "body": "Always run the full test suite before opening a PR. Use pytest. Minimum 80% coverage on changed files."
  }'

Playbooks: reusable task templates

System prompts for repeated workflows. Attach them to sessions by ID.

# Create a playbook
playbook = requests.post(
    f"{BASE}/organizations/{ORG_ID}/playbooks",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "title": "PR Review",
        "body": "Review the PR for bugs, security issues, and style violations. Leave inline comments. Run tests. Check for breaking changes."
    }
).json()

# Use it in a session
session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "prompt": "Review PR #312",
        "playbook_id": playbook["playbook_id"]
    }
).json()

Skills: repo-committed procedures

SKILL.md files committed to your repo at .agents/skills/<name>/SKILL.md. Devin discovers them automatically. These follow the open Agent Skills standard, so they work across multiple AI tools.

Secrets Management

Ramp wires Inspect into Sentry, Datadog, LaunchDarkly, and more. Each integration needs credentials. Devin handles this two ways:

Organization secrets (persistent, shared):

curl -X POST "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/secrets" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type": "key-value", "key": "DATADOG_API_KEY", "value": "dd-key-here"}'

Session secrets (temporary, single session):

session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "prompt": "Deploy to staging",
        "session_secrets": [
            {"key": "DEPLOY_TOKEN", "value": "temp-token-xyz", "sensitive": True}
        ]
    }
).json()
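
Hardcoding a token in the request body, as in the example above, is fine for a demo but not for a pipeline. A minimal sketch that builds the session_secrets list from environment variables instead; session_secrets_from_env is a hypothetical helper, and the key/value/sensitive shape is copied from the example above.

```python
import os


def session_secrets_from_env(names, env=None):
    """Build a session_secrets list from environment variables.

    Raises KeyError listing every missing variable, so a misconfigured
    pipeline fails before a session is created.
    """
    env = os.environ if env is None else env
    missing = [name for name in names if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return [
        {"key": name, "value": env[name], "sensitive": True}
        for name in names
    ]
```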

MCP Servers (External Tool Integration)

Ramp connects Inspect to Sentry, Datadog, LaunchDarkly, Braintrust, GitHub, Slack, and Buildkite. In Devin, you configure these as MCP (Model Context Protocol) servers. Devin's marketplace has 40+ pre-configured integrations: Datadog, Sentry, Linear, Slack, Figma, PostgreSQL, BigQuery, and more.

For custom internal tools, add your own MCP server using STDIO, SSE, or HTTP transport. Configuration is done in the Devin UI under Settings.

Scheduled Sessions (Self-Maintaining Systems)

Ramp's second blog post describes a system that monitors production and auto-triages alerts. The scheduled sessions API lets you set this up directly:

# Run a health check every weekday at 9 AM (the payload sets no time zone, so confirm which zone your schedules use)
curl -X POST "https://api.devin.ai/v3/organizations/$DEVIN_ORG_ID/schedules" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Weekday health check",
    "prompt": "Run the full test suite against main. If any tests fail, investigate the failure, fix it, and open a PR.",
    "frequency": "0 9 * * 1-5"
  }'
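
The frequency value above looks like a standard five-field cron expression. Assuming that is what the API expects, a quick sketch for labeling the fields before you submit a schedule (split_cron is a hypothetical helper and does not validate ranges):

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression):
    """Label the five fields of a standard cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))
```

For "0 9 * * 1-5" this gives minute 0, hour 9, any day of the month, any month, and days 1 through 5 (Monday to Friday in standard cron).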

For Ramp's monitor-driven pattern (Datadog alert fires, agent investigates), use a webhook that calls the Devin API:

# Webhook handler (Flask example)
import os
from flask import Flask, request
import requests as http
from dotenv import load_dotenv

load_dotenv()

app = Flask(__name__)

API_KEY = os.environ["DEVIN_API_KEY"]
ORG_ID = os.environ["DEVIN_ORG_ID"]

@app.route("/datadog-webhook", methods=["POST"])
def handle_alert():
    # Tolerate a missing or non-JSON body instead of failing on alert.get(...)
    alert = request.get_json(silent=True) or {}
    monitor_name = alert.get("monitor_name", "Unknown")
    alert_body = alert.get("body", "")

    response = http.post(
        f"https://api.devin.ai/v3/organizations/{ORG_ID}/sessions",
        headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
        json={
            "prompt": f"Datadog alert fired: {monitor_name}\n\n{alert_body}\n\nInvestigate, reproduce in the test suite, and fix if possible. Open a PR with the fix.",
            "tags": ["auto-triage", "datadog"]
        },
        timeout=30,  # don't let a slow API call hang the webhook worker
    )
    response.raise_for_status()  # surface API errors instead of a KeyError below
    session = response.json()

    return {"session_id": session["session_id"], "url": session["url"]}
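
Monitors often fire repeatedly while an incident is ongoing, and each webhook call above would start a new session. One hedged way to avoid that is a per-monitor cooldown checked before creating the session. AlertCooldown is a hypothetical in-memory helper; it resets on restart and is not shared across processes, so a real deployment would likely back it with a datastore.

```python
import time


class AlertCooldown:
    """Allow at most one trigger per monitor within a time window."""

    def __init__(self, window_seconds=900.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._last_seen = {}  # monitor name -> time of last accepted trigger

    def should_trigger(self, monitor_name):
        """Return True and record the time if the monitor is outside its window."""
        now = self._clock()
        last = self._last_seen.get(monitor_name)
        if last is not None and now - last < self.window_seconds:
            return False
        self._last_seen[monitor_name] = now
        return True
```

In the handler, you would call should_trigger(monitor_name) first and return early when it is False.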

File Attachments

Upload files for Devin to work with:

import os
import requests
from dotenv import load_dotenv

load_dotenv()

API_KEY = os.environ["DEVIN_API_KEY"]
ORG_ID = os.environ["DEVIN_ORG_ID"]
BASE = "https://api.devin.ai/v3"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Upload a file
with open("design-spec.pdf", "rb") as f:
    response = requests.post(
        f"{BASE}/organizations/{ORG_ID}/attachments",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f}
    )
file_url = response.json()["url"]

# Reference it in a session
session = requests.post(
    f"{BASE}/organizations/{ORG_ID}/sessions",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={
        "prompt": "Implement the design in the attached spec.",
        "attachment_urls": [file_url]
    }
).json()

Download files produced by a session (screenshots, generated code, logs):

devin_id = f"devin-{session['session_id']}"

attachments = requests.get(
    f"{BASE}/organizations/{ORG_ID}/sessions/{devin_id}/attachments",
    headers=HEADERS
).json()["items"]

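for att in attachments:
    content = requests.get(att["url"]).content
    # Keep only the base name so an unexpected attachment name cannot write outside the current directory
    with open(os.path.basename(att["name"]), "wb") as f:
        f.write(content)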

Building a Slack Bot

Devin already has a native Slack integration that lets your team start sessions, send follow-ups, and get updates directly from Slack. For most teams, this is all you need.

If you want custom behavior (routing messages to specific repos, enriching prompts with context from other systems, or wiring into your own internal tools), you can build your own bot on top of the API. Here's a minimal version:

import os
from slack_bolt import App
import requests as http
from dotenv import load_dotenv

load_dotenv()

app = App(token=os.environ["SLACK_BOT_TOKEN"], signing_secret=os.environ["SLACK_SIGNING_SECRET"])

API_KEY = os.environ["DEVIN_API_KEY"]
ORG_ID = os.environ["DEVIN_ORG_ID"]
BASE = "https://api.devin.ai/v3"

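@app.message("")
def handle_message(message, say):
    # Some message events (edits, joins, file shares) carry no text; skip them
    text = message.get("text", "").strip()
    if not text:
        return
    channel = message["channel"]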

    # Create a Devin session
    session = http.post(
        f"{BASE}/organizations/{ORG_ID}/sessions",
        headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
        json={
            "prompt": text,
            "tags": ["slack", f"channel-{channel}"]
        }
    ).json()

    say(f"On it. Session: {session['url']}")

if __name__ == "__main__":
    app.start(port=3000)

Ramp recommends building a classifier to determine which repo to work in. You can do this with a fast model call before creating the session, or let Devin figure it out from the prompt and your connected repos.
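
To show where such a classifier would sit, here is a deliberately simple stand-in: a keyword heuristic rather than a model call. The repo names and keyword lists are made up for illustration, and pick_repo is a hypothetical helper; swap its body for a model call once the routing needs outgrow keywords.

```python
import re

# Hypothetical repo names and keywords; replace with your own.
REPO_KEYWORDS = {
    "backend": ("api", "endpoint", "database", "migration"),
    "web": ("button", "css", "page", "layout"),
}


def pick_repo(text, default=None):
    """Return the repo whose keywords overlap most with the message, or default."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {
        repo: len(words & set(keywords))
        for repo, keywords in REPO_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

When pick_repo returns a name, you can prepend it to the prompt (for example "In the web repo: ..."); when it returns the default, fall back to letting Devin choose.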

Building a GitHub Actions Integration

Trigger Devin on PR events:

# .github/workflows/devin-review.yml
name: Devin PR Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger Devin Review
        run: |
          curl -X POST "https://api.devin.ai/v3/organizations/${{ secrets.DEVIN_ORG_ID }}/sessions" \
            -H "Authorization: Bearer ${{ secrets.DEVIN_API_KEY }}" \
            -H "Content-Type: application/json" \
            -d '{
              "prompt": "Review PR ${{ github.event.pull_request.html_url }}. Check for bugs, security issues, test coverage, and style. Leave comments on the PR.",
              "tags": ["pr-review", "automated"]
            }'

What You Skip

Here's what Ramp built that you don't need to, with the Devin equivalent after each arrow:

  • Modal sandbox infrastructure, image registry, 30-min rebuild cron → Managed VMs with snapshots, repo setup
  • OpenCode agent integration → Devin agent (IDE, shell, browser, computer use)
  • Cloudflare Durable Objects for session state → Session API with messages, status, attachments
  • WebSocket streaming infrastructure → Polling API + structured output (your client polls; no push notifications)
  • Sandbox snapshotting and restore → Built-in session snapshots
  • Git config management per user → Session attribution via create_as_user_id
  • Custom tool system → MCP servers (40+ integrations + custom)
  • Warm sandbox pool management → Automatic environment provisioning

What you still build: your custom clients. A Chrome extension, a web dashboard, a GitHub webhook handler, or any integration beyond what Devin provides out of the box. These are the parts unique to how your team works. The Devin API gives you the backend for all of them.

Full Example: Webhook-Driven Bug Fix Pipeline

Putting it all together: a service that receives bug reports from any source and drives each one to an open PR:

import os
import time
import requests
from dotenv import load_dotenv

load_dotenv()

API_KEY = os.environ["DEVIN_API_KEY"]
ORG_ID = os.environ["DEVIN_ORG_ID"]
BASE = "https://api.devin.ai/v3"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def handle_bug_report(title, description, source="manual"):
    # 1. Create session with structured output
    session = requests.post(
        f"{BASE}/organizations/{ORG_ID}/sessions",
        headers=HEADERS,
        json={
            "prompt": f"Bug report: {title}\n\n{description}\n\nInvestigate this bug. Reproduce it with a test. Fix it. Open a PR.",
            "structured_output_schema": {
                "type": "object",
                "properties": {
                    "status": {"type": "string", "enum": ["investigating", "reproducing", "fixing", "pr_opened"]},
                    "root_cause": {"type": ["string", "null"]},
                    "test_added": {"type": "boolean"},
                    "pr_url": {"type": ["string", "null"]}
                }
            },
            "tags": ["bug-fix", f"source-{source}"]
        }
    ).json()

    session_id = session["session_id"]
    devin_id = f"devin-{session_id}"
    print(f"Session started: {session['url']}")

    # 2. Poll until done
    pr_reported = False
    while True:
        details = requests.get(
            f"{BASE}/organizations/{ORG_ID}/sessions/{devin_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        ).json()

        status = details["status"]
        # structured_output may be present but null before Devin first writes it
        output = details.get("structured_output") or {}

        # Report the PR once, not on every poll
        if output.get("pr_url") and not pr_reported:
            print(f"PR opened: {output['pr_url']}")
            pr_reported = True

        if status in ("exit", "error", "suspended"):
            break

        time.sleep(15)

    return details.get("structured_output")

# Usage
result = handle_bug_report(
    title="Login button unresponsive on mobile",
    description="Users on iOS Safari report the login button does nothing on tap. No console errors. Started after last deploy."
)

Summary

Building infrastructure in-house makes sense in certain situations and with certain constraints, but for most teams, the fastest path to value is using what already exists and focusing your engineering effort on the problems only your team can solve.

Ramp's Inspect is a well-engineered tool. Building it required Modal, OpenCode, Cloudflare Durable Objects, a custom sandbox lifecycle, git integration plumbing, and months of infrastructure work.

The Devin API compresses all of this into a set of REST endpoints. The sessions run in fully provisioned VMs with your repos, dependencies, and tools already set up.

You just build the client. The API handles everything behind it.

And if you don't want to build anything at all, you can just use Devin.

Start here:
