Delivery Twin · Enterprise OpenClaw Blueprint

Executive Summary

Delivery Twin should be packaged as a controlled enterprise distribution on top of OpenClaw, not as a one-off local setup. The goal is to give consultants a repeatable workstation capability: install once, authenticate against approved systems, attach a client repository and board, ingest backlog and code context, and start assisting delivery work within the first hour.

Outcome

A consultant can connect a client engagement and generate a first delivery brief, technical map, risk radar, and PR review workflow without hand-assembling tools.

Boundary

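Client information remains on the company laptop. The only AI destination for client context is the company-approved enterprise LLM endpoint; all other traffic is limited to the approved Azure DevOps and Slack routes described under Security Model.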

Fastest path

Use pinned upstream OpenClaw, a corporate configuration profile, a Delivery Twin skill/plugin pack, a wrapper CLI, and an installer.

Recommended starting point: a dedicated WSL2 environment on Windows 11 for fast adoption, with a Hyper-V option for stricter isolation requirements.

Design Principles

Separate generic capability from client data. The Delivery Twin repository contains reusable instructions, skills, templates, policies, and connectors. Client repositories and extracted client context live in separate workspaces.
Make installation repeatable. The setup must behave like an internal product, not tribal knowledge. The installer performs preflight checks, prepares the environment, and validates connectivity.
Use one approved LLM route. Personal LLM providers must not be configured as fallback. The enterprise LLM endpoint is the only allowed destination for prompts containing client context.
Send the smallest useful context. Local indexing, graph extraction, repo maps, diff filtering, and retrieval should run before any LLM call.
Prefer draft mode before write mode. The first versions should generate local reports or draft comments. Writing to Slack, Azure DevOps comments, or backlog items should require explicit confirmation.
Audit usage without hoarding sensitive prompts. Track model, cost, repository, PR, command, and result summary. Avoid storing full prompts or raw client data unless policy explicitly allows it.
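
The audit principle above could be sketched as a metadata-only record. This is a minimal Python sketch under stated assumptions: `AuditRecord`, its field names, and the idea of storing a SHA-256 digest in place of the prompt are illustrative and are not part of OpenClaw.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry: metadata only, never the raw prompt."""
    model: str
    cost_usd: float
    repository: str
    pr_id: str
    command: str
    result_summary: str
    prompt_sha256: str  # digest allows correlating calls without storing content


def make_audit_record(prompt: str, **meta) -> AuditRecord:
    """Hash the prompt and keep only the digest alongside the metadata."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return AuditRecord(prompt_sha256=digest, **meta)


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

The result summary is still free text, so it needs the same care as any other derived client content.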

What “Custom OpenClaw” Means

A custom OpenClaw for Delivery Twin should be a corporate distribution built on top of upstream OpenClaw. It should not start as a fork. Forking should be reserved for cases where core OpenClaw cannot enforce a mandatory security, identity, runtime, or UI requirement.

Customization levels

Level	Meaning	Value	Recommendation
Custom configuration	A corporate profile: approved provider, model defaults, workspace paths, tool allow/deny lists, agent definitions, and logging policy.	Prevents accidental use of personal providers and establishes a consistent operating boundary.	Always
Skill/plugin pack	Delivery Twin capabilities for Azure DevOps, Slack, backlog ingestion, PR review, discovery, reporting, and delivery rituals.	Turns generic OpenClaw into a delivery assistant that understands the company workflow.	Core value
Wrapper CLI	A product-facing command such as `delivery-twin` that installs, validates, connects clients, and runs workflows.	Gives consultants a simple contract instead of exposing internal setup details.	Recommended
Installer	A signed PowerShell script, MSIX package, internal winget package, or prebuilt WSL image.	Makes rollout repeatable across corporate laptops.	Needed for rollout
Fork	A modified copy of OpenClaw core.	Allows deep changes to Gateway, runtime, UI, or enforcement when configuration and plugins are insufficient.	Last resort

Corporate configuration

This layer should be declarative and easy to inspect. It establishes the operating rules but does not contain client-specific information.

{
  "deliveryTwin": {
    "mode": "enterprise",
    "allowedProviders": ["company-llm"],
    "defaultModel": "enterprise-mini",
    "escalationModel": "enterprise-frontier",
    "workspaceRoot": "~/.delivery-twin",
    "clientsRoot": "~/.delivery-twin/clients",
    "requireHumanConfirmationForSlack": true,
    "disablePersonalProviderFallbacks": true
  }
}
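
A profile like this is only useful if something checks it. The sketch below shows one way a wrapper could validate the profile before running anything. It assumes the key names from the example above; `policy_violations` and `APPROVED_PROVIDERS` are hypothetical names, not an OpenClaw API.

```python
import json

APPROVED_PROVIDERS = {"company-llm"}  # taken from the example profile above


def policy_violations(profile: dict) -> list[str]:
    """Return human-readable violations; an empty list means the profile passes."""
    cfg = profile.get("deliveryTwin", {})
    problems = []
    if cfg.get("mode") != "enterprise":
        problems.append("mode must be 'enterprise'")
    providers = cfg.get("allowedProviders", [])
    if not providers:
        problems.append("allowedProviders must not be empty")
    extra = set(providers) - APPROVED_PROVIDERS
    if extra:
        problems.append(f"unapproved providers: {sorted(extra)}")
    # "is not True" rejects missing keys and truthy non-boolean values alike.
    if cfg.get("disablePersonalProviderFallbacks") is not True:
        problems.append("personal provider fallbacks must be disabled")
    if cfg.get("requireHumanConfirmationForSlack") is not True:
        problems.append("Slack posts must require human confirmation")
    return problems
```

Treating a missing key as a violation keeps the check fail-closed: an incomplete profile never passes by accident.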

Wrapper CLI

The wrapper is the user experience. It can call OpenClaw internally, write configuration, install skills, validate policies, invoke Azure DevOps APIs, and orchestrate local analysis tools.

delivery-twin install
delivery-twin doctor
delivery-twin login devops
delivery-twin login slack
delivery-twin login llm
delivery-twin attach --client CLIENT --repo URL --board URL
delivery-twin ingest backlog --client CLIENT
delivery-twin pr review --client CLIENT --id 1234 --mode draft

When a fork is justified

Security enforcement must happen inside the Gateway and cannot be expressed through configuration.
Corporate identity integration requires core runtime changes.
The control plane or UI must become a branded internal product and cannot be layered externally.
Tool surfaces must be removed or constrained at a level plugins cannot control.
Legal or compliance requirements demand a fully controlled build of the runtime.

Forking has a real carrying cost: rebasing on upstream, validating every security update, maintaining release pipelines, and owning bugs that upstream may already have fixed. The safer first move is a distribution, not a fork.

Target Architecture

The architecture has three layers: the corporate distribution, the client workspace, and approved external systems. The company laptop is the operational perimeter. The approved enterprise LLM is the only external AI destination for client-sensitive context.

Agent Configuration

The recommended Delivery Twin setup uses two agents with different responsibilities: a lightweight Slack-facing router and a stronger delivery worker. The router keeps Slack cheap, controlled, and concise. The worker performs deeper delivery reasoning, backlog analysis, repository inspection, planning, and PR review.

Roles

Agent	Model tier	Primary job	Allowed behavior
`slack-router`	Mini	Slack triage, short answers, clarification, routing, and response relay.	Answer trivial questions, ask one concise clarification, escalate delivery/project/software questions, and keep Slack messages compact.
`delivery-twin`	Strong	Delivery reasoning, planning, backlog ingestion, repo analysis, PR review, and risk assessment.	Work from the desired outcome, reduce scope to verifiable slices, track risks and blockers, and return concise Slack-ready answers when invoked by the router.

Why split the agents

Cost control: Slack receives many low-value messages. A mini router avoids spending the strong model on greetings, vague prompts, or simple status checks.
Noise control: Slack replies should be short. The router can compress worker output into a format that fits the channel.
Blast-radius control: The Slack-facing agent does not need broad repository inspection by default.
Clear escalation: The worker receives a precise task: the original wording, relevant Slack context, requested output, and any constraints.
Future governance: The pattern allows stricter tool and credential boundaries per agent as OpenClaw configuration matures.

Agent list

The corporate distribution should generate an agent configuration like this, using enterprise model names and company workspace paths.

{
  "agents": {
    "list": [
      {
        "id": "delivery-twin",
        "name": "delivery-twin",
        "workspace": "~/.delivery-twin/workspace-delivery-twin",
        "agentDir": "~/.delivery-twin/agents/delivery-twin",
        "model": "company/enterprise-frontier",
        "identity": {
          "name": "Delivery Twin",
          "emoji": "🚚"
        }
      },
      {
        "id": "slack-router",
        "name": "slack-router",
        "workspace": "~/.delivery-twin/workspace-slack-router",
        "agentDir": "~/.delivery-twin/agents/slack-router",
        "model": "company/enterprise-mini",
        "identity": {
          "name": "Slack Router",
          "emoji": "🚦"
        }
      }
    ]
  }
}

Slack account binding

The Slack app should bind to the router, not directly to the worker. Tokens must be stored as secrets, never committed to the Delivery Twin repository.

{
  "channels": {
    "slack": {
      "enabled": true,
      "mode": "socket",
      "accounts": {
        "delivery-twin": {
          "name": "Delivery Twin",
          "enabled": true,
          "botToken": "${SECRET:SLACK_BOT_TOKEN}",
          "appToken": "${SECRET:SLACK_APP_TOKEN}",
          "groupPolicy": "allowlist",
          "allowFrom": ["${USER_OR_GROUP_ID}"],
          "channels": {
            "${CLIENT_CHANNEL_ID}": {
              "enabled": true,
              "requireMention": false
            }
          },
          "slashCommand": {
            "enabled": true,
            "name": "delivery-twin",
            "ephemeral": false
          }
        }
      }
    }
  },
  "channelBindings": [
    {
      "channel": "slack",
      "accountId": "delivery-twin",
      "agentId": "slack-router"
    }
  ]
}
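
The `${SECRET:NAME}` placeholders in the example need to be resolved at load time from a secret store. How OpenClaw itself resolves them is not covered here; the following is a hypothetical resolver for a wrapper that handles it, where `resolve_secrets` and the `secrets` mapping are illustrative names.

```python
import re

SECRET_PATTERN = re.compile(r"\$\{SECRET:([A-Z0-9_]+)\}")


def resolve_secrets(value, secrets: dict):
    """Recursively replace ${SECRET:NAME} in string values; fail closed if one is missing."""
    if isinstance(value, dict):
        return {k: resolve_secrets(v, secrets) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_secrets(v, secrets) for v in value]
    if isinstance(value, str):
        def lookup(match):
            name = match.group(1)
            if name not in secrets:
                raise KeyError(f"missing secret: {name}")
            return secrets[name]
        return SECRET_PATTERN.sub(lookup, value)
    return value  # booleans, numbers, and None pass through unchanged
```

The `secrets` mapping could be backed by environment variables or an operating system credential store; the resolved configuration should stay in memory and never be written back to disk.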

Router mission

The router should be deliberately narrow. Its instructions should prevent it from becoming a second full delivery agent.

You are Slack Router, a lightweight triage layer for Slack.

Default behavior:
- Answer greetings, status checks, and simple clarifications yourself.
- Ask one concise clarification if the request is too vague to route.
- Escalate delivery, project, software, repository, debugging, planning, or tradeoff questions to `delivery-twin`.
- Keep Slack messages compact.
- Do not load large files, run broad searches, or inspect repositories unless needed for routing.

Escalation contract:
- Preserve the user's original wording and Slack context.
- Tell `delivery-twin` what decision or output is needed.
- Ask for a concise answer suitable for Slack.
- Relay the answer without adding a long wrapper.
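
The escalation contract above could be carried as a small structured message, so the hand-off is not free-form text. This is a sketch under assumptions: the `Escalation` class and its field names are hypothetical and do not describe how OpenClaw passes work between agents.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Escalation:
    """Hypothetical hand-off message from slack-router to delivery-twin."""
    original_text: str        # the user's wording, unmodified
    slack_context: list[str]  # thread messages relevant to the request
    needed_output: str        # the decision or artifact the worker must produce
    constraints: list[str] = field(default_factory=list)
    target_agent: str = "delivery-twin"

    def to_task(self) -> str:
        """Render the hand-off as a compact task description for the worker."""
        lines = [
            f"Request (verbatim): {self.original_text}",
            f"Needed: {self.needed_output}",
            "Answer concisely, suitable for Slack.",
        ]
        if self.slack_context:
            lines.append("Context:")
            lines.extend(f"- {msg}" for msg in self.slack_context)
        if self.constraints:
            lines.append("Constraints:")
            lines.extend(f"- {c}" for c in self.constraints)
        return "\n".join(lines)
```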

Worker mission

The worker should optimize for shippable delivery outcomes, not broad commentary.

You are Delivery Twin, a software delivery specialist.

When helping with a project:
- Clarify the delivery goal and current constraint.
- Identify the smallest useful next increment.
- Make acceptance criteria explicit.
- Track risks, blockers, owners, and dependencies.
- Verify with tests, builds, screenshots, logs, or deployed checks when possible.
- Summarize status in plain language: done, next, blocked, risk.

Operational guardrails

No scheduled jobs by default: client-facing agents should have zero cron jobs unless a specific engagement requires scheduled reports.
Router before worker: Slack events go to `slack-router`; the worker is invoked only when deeper analysis is needed.
Confirm before posting: PR findings, backlog changes, or summaries with sensitive context should be reviewed before posting to Slack or Azure DevOps.
Separate credentials: enterprise rollout should avoid inherited personal auth profiles. Each agent should use only the company-approved provider and client-approved connectors.
POC caveat: if the current OpenClaw version inherits or merges main auth profiles into non-main agents, do not treat the two-agent split as a strong confidentiality boundary. Use it as a functional POC, then harden before client-sensitive rollout.
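
The "confirm before posting" guardrail can be expressed as a gate that every write path goes through. This is a minimal sketch: `publish_with_confirmation`, `confirm`, and `post` are hypothetical names, and the two callables stand in for whatever review prompt and connector a real implementation would use.

```python
from typing import Callable


def publish_with_confirmation(
    draft: str,
    destination: str,
    confirm: Callable[[str, str], bool],
    post: Callable[[str, str], None],
) -> bool:
    """Post only when the reviewer approves; otherwise leave the draft local."""
    if not confirm(destination, draft):
        return False
    post(destination, draft)
    return True
```

Routing Slack and Azure DevOps writes through a single function like this makes it harder for a new workflow to skip the review step.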

Security Model

Delivery Twin should assume that client code, backlog, comments, logs, and derived summaries are sensitive. The approved enterprise LLM can receive context when the company policy allows it, but every other external route should be blocked or explicitly justified.

Allowed routes

Approved Azure DevOps organizations and projects.
Approved Slack workspace and allowlisted channels.
Approved enterprise LLM endpoint.
Package registries only when required and approved.

Blocked or avoided routes

Personal OpenAI, Anthropic, Gemini, or other API keys.
Paste, upload, telemetry, and analytics services not approved for client data.
Fallback models outside the enterprise account.
Broad synchronization of client workspaces to personal cloud storage.
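
The allowed and blocked routes above amount to an egress allowlist. The sketch below shows the check in its simplest form. The host names are placeholders (`llm.company.example` is invented), and a real deployment would also need to cover non-HTTPS transports such as WebSocket connections.

```python
from urllib.parse import urlsplit

# Hypothetical hosts; replace with the organization's approved endpoints.
ALLOWED_HOSTS = {
    "dev.azure.com",
    "slack.com",
    "llm.company.example",
}


def is_allowed(url: str) -> bool:
    """Allow only HTTPS requests to an exactly matching allowlisted host."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the full host name, not a prefix or substring, avoids lookalike hosts such as `dev.azure.com.evil.example`.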

Isolation options on Windows 11

Dedicated WSL2 distribution: fast, scriptable, and convenient for Node, OpenClaw, Docker, and developer tooling. Disable Windows drive automount in the distribution's /etc/wsl.conf if the workspace should not see C:\Users.
Hyper-V virtual machine: heavier, but a clearer isolation boundary for engagements with stricter requirements.
Docker-only packaging: useful for services, but not enough as the main security boundary if host folders or the Docker socket are exposed.

Important: Docker Desktop should not be treated as a confidentiality boundary by itself. It is a packaging tool. The real controls are workspace separation, endpoint allowlists, provider restrictions, and credential isolation.
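
For the WSL2 option, an installer could generate the distribution's `/etc/wsl.conf` itself. The sketch below renders one with drive automount disabled. `[automount] enabled` and `[interop] appendWindowsPath` are documented wsl.conf settings; disabling the Windows PATH as well is an assumption about how strict the profile should be.

```python
import configparser
import io


def render_wsl_conf() -> str:
    """Render a wsl.conf that hides Windows drives and PATH from the distribution."""
    conf = configparser.ConfigParser()
    conf.optionxform = str  # preserve key case, e.g. appendWindowsPath
    conf["automount"] = {"enabled": "false"}
    conf["interop"] = {"appendWindowsPath": "false"}
    buffer = io.StringIO()
    conf.write(buffer)
    return buffer.getvalue()
```

Changes to wsl.conf take effect only after the distribution restarts, so an installer would typically write the file and then shut WSL down once.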

Enterprise Installer

The installer should behave as a bootstrapper. It prepares the environment, installs or updates the required components, checks policy compliance, and leaves the machine ready to run Delivery Twin workflows.

Installer responsibilities

Check Windows 11, virtualization, WSL2, Docker Desktop if needed, Node runtime, Git, and network reachability.
Create a dedicated Delivery Twin home directory and client workspace root.
Install pinned upstream OpenClaw and the Delivery Twin skill/plugin pack.
Write corporate configuration without embedding client secrets.
Guide login for Azure DevOps, Slack, and the enterprise LLM.
Run `delivery-twin doctor` and fail closed if personal providers or unsafe paths are detected.
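
The preflight and fail-closed behaviour could look like the following sketch. `REQUIRED_TOOLS`, `missing_tools`, and `doctor` are hypothetical names, and the tool list is deliberately minimal; a real check would also cover WSL2, network reachability, and provider configuration.

```python
import shutil
import sys
from typing import Callable, Optional

REQUIRED_TOOLS = ("git", "node")  # assumption: minimal set for this sketch


def missing_tools(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the required executables that are not found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


def doctor(checks: dict[str, list[str]]) -> int:
    """Report every failed check and return a non-zero exit code if any failed."""
    failed = {name: problems for name, problems in checks.items() if problems}
    for name, problems in failed.items():
        for problem in problems:
            print(f"FAIL {name}: {problem}", file=sys.stderr)
    return 1 if failed else 0
```

Passing `which` as a parameter keeps the check testable without the tools being installed on the build machine.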

powershell -ExecutionPolicy AllSigned -File install-delivery-twin.ps1

delivery-twin doctor
delivery-twin login devops
delivery-twin login slack
delivery-twin login llm
delivery-twin attach --client canfordlaw --repo https://dev.azure.com/org/project/_git/repo --board https://dev.azure.com/org/project/_boards/board
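
The `attach` command has to turn the repository URL into organization, project, and repository names before it can clone or call any API. The sketch below assumes the `https://dev.azure.com/{org}/{project}/_git/{repo}` shape used in the example; `parse_repo_url` is a hypothetical helper, and other URL forms are rejected instead of guessed.

```python
from urllib.parse import urlsplit, unquote


def parse_repo_url(url: str) -> dict:
    """Split https://dev.azure.com/{org}/{project}/_git/{repo} into its parts."""
    parts = urlsplit(url)
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if parts.hostname != "dev.azure.com" or len(segments) != 4 or segments[2] != "_git":
        raise ValueError(f"not a recognized Azure DevOps repository URL: {url}")
    organization, project, _, repository = segments
    return {"organization": organization, "project": project, "repository": repository}
```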

Integrations

Azure DevOps

Azure DevOps is the primary delivery system. The integration should start read-only, then progressively enable write actions behind confirmation gates.

Capability	Minimum permission	Use	Initial stance
Clone repositories	Code Read	Build local indexes and analyze code context.	Enabled
Read pull requests	Code Read	Review diffs, summarize risk, and detect sensitive changes.	Enabled
Comment on pull requests	Code Read & Write	Publish review findings.	Draft mode first
Read Boards	Work Items Read	Ingest backlog, states, ownership, and dependencies.	Enabled
Update backlog	Work Items Read & Write	Create or refine work items.	Later, with confirmation

Slack

Slack should be used for coordination and visibility, not as a raw dump of client code or backlog. Messages should be concise and link back to approved systems.

Allowlist channels per client or engagement.
Require confirmation before posting AI-generated content.
Prefer summaries and links over copied code or raw backlog exports.
Separate a lightweight Slack-facing agent from heavier worker agents when possible.

Enterprise LLM

Use the mini model for routing, classification, first-pass summaries, and low-risk PR checks.
Escalate to the stronger model only for security, authorization, data migrations, architecture, concurrency, and broad multi-module changes.
Use stable prompts and policy blocks to benefit from prompt caching if the provider supports it.
Never fall back to a non-corporate provider automatically.
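
The escalation rule above can be written as a small pure function. This is a sketch: the topic labels are assumed to come from the mini model's classification step, and `broad_threshold` is an invented parameter for what counts as a "broad multi-module" change.

```python
ESCALATION_TOPICS = {
    "security", "authorization", "data-migration",
    "architecture", "concurrency",
}
MINI_MODEL = "company/enterprise-mini"
STRONG_MODEL = "company/enterprise-frontier"


def select_model(topics: set[str], modules_touched: int, broad_threshold: int = 3) -> str:
    """Use the mini model unless the change hits an escalation topic or is broad."""
    if topics & ESCALATION_TOPICS or modules_touched >= broad_threshold:
        return STRONG_MODEL
    return MINI_MODEL
```

Both return values are enterprise models, so there is no code path that selects a non-corporate provider.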

Backlog Ingestion

The first valuable workflow should connect a repository and board, then generate a usable view of the client delivery system: what is being built, how the backlog is structured, where the risks are, and how the code maps to the work.

Captured data

Epics, features, user stories, bugs, tasks, states, tags, and ownership.
Parent-child relationships, dependencies, blockers, and stale items.
Open and recent PRs, linked commits, active branches, and unlinked work.
Functional areas inferred from folders, naming, tags, and work item language.
Risks such as orphaned work, missing acceptance criteria, repeated bugs, and changes in critical modules.
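
Several of the risks listed above can be detected with simple rules before any LLM call. The sketch below assumes work items have already been fetched and normalized; the `WorkItem` fields, the closed-state names, and the 30-day staleness window are all assumptions that vary by process template.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class WorkItem:
    id: int
    type: str                  # e.g. "User Story", "Bug", "Task"
    state: str
    parent_id: Optional[int]
    acceptance_criteria: str
    changed: date              # date of the last change


def backlog_risks(items: list[WorkItem], today: date, stale_days: int = 30) -> dict[str, list[int]]:
    """Group open work item ids by a few simple risk heuristics."""
    open_items = [i for i in items if i.state not in {"Closed", "Done", "Removed"}]
    cutoff = today - timedelta(days=stale_days)
    return {
        "stale": [i.id for i in open_items if i.changed < cutoff],
        "orphaned": [i.id for i in open_items if i.type == "Task" and i.parent_id is None],
        "missing_acceptance_criteria": [
            i.id for i in open_items
            if i.type == "User Story" and not i.acceptance_criteria.strip()
        ],
    }
```

Running rules like these locally keeps raw backlog data out of the prompt; only the flagged ids and a short summary need to reach the model.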

Generated artifacts

Client brief

Product purpose, backlog structure, key modules, active risks, and the first areas worth investigating.

Delivery map

How work flows from idea to release, including states, bottlenecks, implicit rules, and process gaps.

Technical map

Repositories, entry points, major services, tests, dependencies, and hot spots.

Risk radar

Prioritized delivery and technical risks with evidence and suggested next actions.

PR Review Automation

PR review should use the mini model as the default first pass. The goal is not to replace human review, but to reduce repetitive work, highlight risks, and provide useful context.

Receive a PR event or run review on demand.
Read metadata, linked work item, changed files, and diff.
Classify the change type with the mini model.
Retrieve the minimum relevant local context through graph, repo map, and search.
Run focused reviewers: security, data, authorization, error handling, tests, and architecture.
Generate findings with severity, evidence, file/line reference, and recommendation.
Publish only after confirmation, or keep the output as a local/draft report.

## AI PR Review

Summary:
- Change type: backend validation
- Estimated risk: medium
- Linked work item: DT-1423

Findings:
1. [High] Missing tenant validation in `src/api/orders.ts`
   Evidence: the handler uses `orderId` but does not validate ownership.
   Recommendation: validate tenant access before retrieving or mutating the order.

2. [Medium] No negative permission test
   Recommendation: add a test for a user without access to the tenant.

Local checks:
- tests: not run
- lint: not run
- secrets scan: no obvious secret detected
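
The findings section of a report like this could be rendered from structured data, so that severity ordering and layout stay consistent across reviews. This is a sketch with hypothetical names (`Finding`, `render_findings`) that follows the layout of the sample above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class Finding:
    severity: str        # one of the keys in SEVERITY_ORDER
    title: str
    recommendation: str
    evidence: str = ""   # optional, omitted from the output when empty


def render_findings(findings: list[Finding]) -> str:
    """Render findings, most severe first, in the report layout shown above."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    lines = ["Findings:"]
    for number, f in enumerate(ordered, start=1):
        lines.append(f"{number}. [{f.severity}] {f.title}")
        if f.evidence:
            lines.append(f"   Evidence: {f.evidence}")
        lines.append(f"   Recommendation: {f.recommendation}")
        lines.append("")  # blank line between findings
    return "\n".join(lines).rstrip()
```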

Operating Model

Discovery mode

First engagement days: ingest the backlog, map the codebase, identify owners, generate a glossary, and create an initial risk radar.

Delivery mode

Recurring work: PR review, daily brief, blocker detection, change summaries, and refinement preparation.

Consulting mode

Diagnostics, recommendations, improvement plans, and internal material for knowledge-sharing sessions.

Governance mode

Usage audit, model cost, connected clients, permissions, endpoints, and privacy checks.

Implementation Roadmap

Phase	Goal	Deliverables	Success criteria
0. Spike	Validate installation and connectivity on Windows 11.	Dedicated WSL2 environment, OpenClaw installed, enterprise LLM configured.	A simple task runs without personal providers.
1. Bootstrap	Install the generic Delivery Twin capability.	Repo cloned, local config, doctor command, DevOps/Slack/LLM login.	`delivery-twin doctor` passes critical checks.
2. Attach client	Connect a repository and board.	`attach` command, clone, client registry, initial cache.	Backlog and repository are visible locally.
3. Ingest	Generate the first useful client view.	Backlog ingestion, repo index, technical map, risk radar.	A consultant understands the engagement in under 30 minutes.
4. PR review	Automate first-pass PR review.	Review CLI, local/draft output, escalation policy.	Useful findings without excessive cost or noise.
5. Rollout	Turn the capability into a maintained internal product.	Signed installer, update process, onboarding guide, audit trail.	Another consultant can install it without direct help.

Risks and Mitigations

Risk	Impact	Mitigation
Wrong LLM provider used by accident	Client data leaves the approved route.	Ship without personal providers, validate config in `doctor`, and fail closed.
Generic repo mixed with client context	Leaks, divergence, and hard maintenance.	Enforce separate `core/` and `clients/CLIENT/` workspace roots.
Noisy PR review	Teams stop reading the output.	Use strict severity, concrete evidence, and draft mode while calibrating.
LLM cost grows unchecked	Internal rejection due to cost or latency.	Mini model by default, context budgets, caching, deduplication, and metrics.
Slack posts sensitive content	Client information appears in the wrong channel.	Allowlisted channels, human confirmation, summaries instead of raw code.

Implementation Checklist

Technical

Choose the internal package name and CLI name.
Create the generic repository structure: installer/, skills/, plugins/, prompts/, policies/, docs/.
Implement the installer and preflight checks.
Install pinned upstream OpenClaw and register the Delivery Twin pack.
Configure the enterprise LLM as the only provider.
Implement `delivery-twin doctor`.
Implement `delivery-twin attach` for repository and board connection.
Implement Azure Boards and repository ingestion.
Generate the first local client brief.
Implement PR review in draft mode.

Organizational

Confirm the policy for sending client context to the enterprise LLM.
Define ownership of the generic Delivery Twin repository.
Define Azure DevOps scopes for read-only and write-enabled modes.
Define Slack channel allowlists and posting rules.
Prepare an internal AI-Day where teams can share what they are building and learning.
Create a 30-minute consultant onboarding guide.

Target story: install the distribution, connect a client, understand backlog and code structure, review a PR, and share a confirmed summary in under one hour.