Architecture

How Kantai OSS is built — Kubernetes-native, zero-trust, GitOps-driven, and designed for self-hosted privacy.

Platform Overview

Kantai runs entirely within your Kubernetes cluster: no phone-home and no vendor lock-in. No external services are required beyond the LLM API endpoints and messaging providers you configure.

Core Components

  • Gangway — Web portal (Dashboard, Chat, Nava, Learn, Wiki, Governance, Tetraban)
  • OpenClaw Runtime — Agent execution engine (workspaces, memory, tool dispatch)
  • Agent Pods — One pod per agent (Sencho, Takumi, Bannin, custom)
  • PostgreSQL — Conversation history, memory, configuration
  • Redis — Session cache, pub/sub for inter-agent messaging

Deployment Model

  • Deploys to any Kubernetes cluster (AKS, EKS, GKE, k3s, bare metal)
  • All resources live in a single namespace (kantai)
  • Helm chart for install, upgrade, and configuration
  • Flux GitOps compatible — store values in Git, Flux reconciles
  • Rolling updates with zero-downtime agent restarts
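
As a minimal sketch of the GitOps flow above, the following Python builds a Flux HelmRelease manifest that carries the Helm values Flux would reconcile. The chart name `kantai`, the HelmRepository name `kantai-charts`, and the value keys are assumptions for illustration and are not taken from the Kantai chart.

```python
import json


def helm_release(namespace: str, chart: str, repo: str, values: dict) -> dict:
    """Build a Flux HelmRelease manifest as a plain dict."""
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": chart, "namespace": namespace},
        "spec": {
            # How often Flux re-checks the release against Git.
            "interval": "10m",
            "chart": {
                "spec": {
                    "chart": chart,
                    "sourceRef": {"kind": "HelmRepository", "name": repo},
                }
            },
            # Hypothetical values; the real keys depend on the chart.
            "values": values,
        },
    }


release = helm_release("kantai", "kantai", "kantai-charts", {"agents": {"replicas": 1}})
print(json.dumps(release, indent=2))
```

Because JSON is accepted wherever YAML is, the printed document can be committed to the Git repository that Flux watches.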

Security Model

Zero-trust by default. Every component authenticates, every request is authorized, every action is logged.

Namespace Isolation

All Kantai resources live in a dedicated namespace. Network policies restrict ingress/egress to only what’s needed. Agents cannot reach other namespaces or the host network.

Network Policies

Ingress is default-deny; only Gangway accepts external traffic, on port 443. Agents communicate only via internal services. Egress is limited to the configured LLM API endpoints and messaging providers; all other outbound traffic is blocked.
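
The policies described here could look roughly like the sketch below, which builds three NetworkPolicy manifests as Python dicts. It assumes egress is denied by default as well (which "nothing else" implies), that Gangway pods carry a hypothetical `app: gangway` label, and that LLM endpoints are reachable at known CIDR ranges; the range shown is a documentation placeholder.

```python
def default_deny(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": namespace},
        # An empty podSelector matches all pods; no rules means nothing is allowed.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_gangway_https(namespace: str) -> dict:
    """Allow inbound TCP 443 to pods labelled app=gangway (hypothetical label)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-gangway-https", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": "gangway"}},
            "policyTypes": ["Ingress"],
            "ingress": [{"ports": [{"protocol": "TCP", "port": 443}]}],
        },
    }


def allow_egress_https(namespace: str, cidrs: list[str]) -> dict:
    """Allow outbound TCP 443 to the given CIDR ranges only."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-llm-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"ipBlock": {"cidr": cidr}} for cidr in cidrs],
                "ports": [{"protocol": "TCP", "port": 443}],
            }],
        },
    }


policies = [
    default_deny("kantai"),
    allow_gangway_https("kantai"),
    allow_egress_https("kantai", ["203.0.113.0/24"]),  # placeholder range
]
```

NetworkPolicy matches on IP ranges rather than hostnames, so a real deployment would also need an egress rule for cluster DNS and a way to keep the CIDR list current.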

RBAC

Each agent pod runs with a dedicated service account. Permissions are scoped to the minimum required. No cluster-admin, no wildcard rules.
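
To illustrate what "no wildcard rules" means in practice, here is a small sketch that builds a namespaced Role for one agent and checks any Role for wildcard entries. The role name and the specific permissions (read-only access to ConfigMaps) are assumptions chosen for the example, not the permissions Kantai actually grants.

```python
def agent_role(namespace: str, agent: str) -> dict:
    """Build a namespaced Role with a narrow, read-only rule set."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": f"{agent}-role", "namespace": namespace},
        "rules": [
            # Hypothetical minimum: read ConfigMaps in the agent's namespace.
            {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list"]},
        ],
    }


def has_wildcard(role: dict) -> bool:
    """Return True if any rule uses '*' for API groups, resources, or verbs."""
    return any(
        "*" in rule.get(field, [])
        for rule in role.get("rules", [])
        for field in ("apiGroups", "resources", "verbs")
    )


role = agent_role("kantai", "sencho")
print(has_wildcard(role))
```

A check like `has_wildcard` could run in CI against rendered manifests before they reach the cluster.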

Pod Security

  • Non-root containers with read-only root filesystems
  • Security contexts enforce runAsNonRoot: true and allowPrivilegeEscalation: false
  • Resource limits prevent runaway processes
  • Pod disruption budgets for availability during upgrades
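
The container-level settings in this list map to standard Kubernetes `securityContext` fields. The sketch below checks a container spec (as a dict) against them and against the presence of resource limits; the function name and message wording are illustrative.

```python
# Security context fields and the values the list above calls for.
HARDENED_CONTEXT = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "readOnlyRootFilesystem": True,
}


def violations(container: dict) -> list[str]:
    """Return a list of hardening problems found in a container spec."""
    ctx = container.get("securityContext", {})
    problems = [
        f"{key} must be {want}"
        for key, want in HARDENED_CONTEXT.items()
        if ctx.get(key) is not want
    ]
    if not container.get("resources", {}).get("limits"):
        problems.append("resources.limits must be set")
    return problems


good = {
    "name": "agent",
    "securityContext": dict(HARDENED_CONTEXT),
    "resources": {"limits": {"cpu": "500m", "memory": "512Mi"}},
}
print(violations(good))
```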

Secret Management

Secrets never live in plain text. Kantai integrates with external secret stores for production-grade key management.

Azure Key Vault

  • LLM API keys, database credentials, and channel tokens stored in Azure Key Vault
  • Mounted into pods via the CSI Secrets Store Driver
  • Automatic rotation — pods pick up new secrets without restart
  • Audit logging via Azure Monitor
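
Picking up rotated secrets without a restart depends on the application reading the mounted file at use time rather than caching it at startup. A minimal sketch of that pattern follows; the mount path and file name in the usage comment are assumptions, not paths defined by Kantai.

```python
from pathlib import Path


class MountedSecret:
    """Reads a secret from a mounted file on every access, never caching it."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)

    def value(self) -> str:
        # Re-read each time so a rotated file is picked up immediately.
        return self._path.read_text(encoding="utf-8").strip()


# Hypothetical usage inside an agent pod:
#   api_key = MountedSecret("/mnt/secrets-store/llm-api-key")
#   headers = {"Authorization": f"Bearer {api_key.value()}"}
```

Secrets injected as environment variables do not change in a running process, which is why this sketch reads from the mounted volume instead.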

Kubernetes Secrets (Default)

  • Works out of the box, no external dependencies
  • Secrets can be encrypted at rest (requires enabling an EncryptionConfiguration on your cluster)
  • Suitable for development and small deployments
  • Upgrade path to Key Vault when ready
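
For clusters where you control the API server, the EncryptionConfiguration mentioned above could be generated along these lines. This is a sketch that uses the `aescbc` provider with a freshly generated 32-byte key; the key name is arbitrary, and managed clusters usually expose encryption at rest through their own settings instead.

```python
import base64
import json
import secrets


def encryption_configuration(key_name: str = "key1") -> dict:
    """Build an EncryptionConfiguration that encrypts Secrets with AES-CBC."""
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [{
            "resources": ["secrets"],
            "providers": [
                # The first provider is used for writes.
                {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                # identity lets the API server still read unencrypted data.
                {"identity": {}},
            ],
        }],
    }


config = encryption_configuration()
print(json.dumps(config, indent=2))
```

The resulting file is passed to kube-apiserver with `--encryption-provider-config`; it contains key material, so it should be protected accordingly.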

Backup Strategy

Your data is valuable. Kantai includes a backup strategy that covers all persistent state.

What’s Backed Up

  • PostgreSQL — pg_dump on schedule, stored in blob storage
  • Agent workspaces — PVC volume snapshots, or rsync to blob storage
  • Configuration — Helm values in Git (GitOps), Governance settings in DB
  • Nava knowledge — included in PostgreSQL backups
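
A scheduled PostgreSQL dump of the kind listed above might be assembled like this. The sketch only builds the `pg_dump` command line with a timestamped output file; the host, database name, user, and output directory are placeholders, and uploading the file to blob storage is left out.

```python
from datetime import datetime, timezone


def pg_dump_command(host, database, user, out_dir, now=None):
    """Return the pg_dump argument list and the dump file path."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    outfile = f"{out_dir}/{database}-{stamp}.dump"
    command = [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--dbname", database,
        "--format=custom",  # compressed archive, restorable with pg_restore
        "--file", outfile,
    ]
    return command, outfile


command, outfile = pg_dump_command("postgres.kantai.svc", "kantai", "backup", "/backups")
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` from a CronJob container, with the password supplied through `PGPASSWORD` or a `.pgpass` file.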

Recovery

  • RPO: 1 hour (configurable, down to 5 minutes)
  • RTO: 15 minutes for full restore from latest backup
  • Recovery runbook included in the Helm chart docs
  • Azure Recovery Services Vault (RSV) integration for managed AKS clusters
  • Cross-region backup support for disaster recovery
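
The RPO figure can be turned into a simple check: if the newest backup is older than the RPO, a failure right now would lose more data than the target allows. A minimal sketch, using the 1-hour default stated above:

```python
from datetime import datetime, timedelta, timezone


def rpo_breached(last_backup: datetime, now: datetime,
                 rpo: timedelta = timedelta(hours=1)) -> bool:
    """Return True if the latest backup is older than the recovery point objective."""
    return now - last_backup > rpo


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(rpo_breached(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), now))  # False
print(rpo_breached(datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc), now))  # True
```

A check like this is one way a backup-failure alert could be driven.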

Identity

Kantai uses modern identity patterns — no shared secrets, no static credentials where avoidable.

Workload Identity

Agent pods authenticate to cloud services using Kubernetes workload identity. No API keys for Azure/AWS/GCP services — the platform handles token exchange.

Managed Identity

For AKS deployments, managed identity provides passwordless access to Key Vault, blob storage, and other Azure resources. Zero credential management.

OIDC

Gangway supports OIDC for user authentication. Connect to your identity provider (Entra ID, Okta, Keycloak, etc.) for SSO and MFA.
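
Connecting to an identity provider usually starts from its issuer URL: OIDC Discovery defines a well-known path under the issuer where the provider publishes its endpoints. The sketch below derives that URL and sanity-checks a fetched metadata document; the issuer shown in the example is a placeholder, and how Gangway itself consumes these values is not covered here.

```python
REQUIRED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def discovery_url(issuer: str) -> str:
    """Return the OIDC Discovery document URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def metadata_problems(issuer: str, metadata: dict) -> list[str]:
    """List missing fields and an issuer mismatch in provider metadata."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS if field not in metadata]
    if "issuer" in metadata and metadata["issuer"] != issuer:
        problems.append("issuer mismatch")
    return problems


print(discovery_url("https://idp.example.com/realms/kantai"))
```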

Observability

You can’t manage what you can’t see. Kantai exposes metrics, logs, and traces for full-stack visibility.

Metrics

Prometheus-format metrics from every component. Agent token usage, request latency, task completion rates, error counts. Grafana dashboards included in the Helm chart.
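
For readers unfamiliar with the Prometheus text format, the sketch below renders a counter the way a `/metrics` endpoint would expose it. The metric name `kantai_agent_tokens_total` and its `agent` label are hypothetical, not names confirmed by Kantai, and label-value escaping is omitted for brevity.

```python
def render_counter(name: str, help_text: str, samples) -> str:
    """Render a counter and its labelled samples in Prometheus text format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


print(render_counter(
    "kantai_agent_tokens_total",
    "Tokens consumed per agent.",
    [({"agent": "sencho"}, 1200), ({"agent": "takumi"}, 450)],
))
```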

Logging

Structured JSON logs from all pods. Compatible with any log aggregator (Loki, ELK, Fluentd, Azure Monitor). Log levels configurable per component.
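
Structured JSON logging with a per-component level can be sketched with the Python standard library as follows. The field names (`ts`, `level`, `component`, `message`) are assumptions for illustration; Kantai's actual log schema may differ.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def configure(component: str, level: str = "INFO") -> logging.Logger:
    """Return a logger for one component with its own level and JSON output."""
    logger = logging.getLogger(component)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger


configure("gangway", "DEBUG").debug("portal started")
configure("agent.sencho", "WARNING").info("suppressed at this level")
```

One JSON object per line is what log shippers generally expect, which keeps the output easy to parse in an aggregator.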

Alerting

Pre-configured alert rules cover agent down, high error rate, token budget exceeded, backup failure, and certificate expiry. Alerts can be sent to PagerDuty, Slack, email, or webhooks.

Multi-Tenancy

Run multiple isolated fleets on the same cluster.

  • Namespace per tenant — each fleet gets its own namespace with full isolation
  • Network policies — tenants cannot communicate unless explicitly allowed
  • Resource quotas — enforce CPU, memory, and storage limits per tenant
  • Separate databases — each tenant gets its own PostgreSQL instance or schema
  • RBAC — tenant admins can only see and manage their own fleet

Multi-tenancy is ideal for organizations running Kantai for multiple teams, or for service providers offering managed Kantai fleets.
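
The namespace-per-tenant and resource-quota items above could be provisioned with manifests like the ones this sketch generates. The `kantai-<tenant>` naming scheme, the label key, and the quota figures are assumptions for illustration.

```python
def tenant_manifests(tenant: str, cpu: str, memory: str, storage: str) -> list[dict]:
    """Build a Namespace and a ResourceQuota for one tenant fleet."""
    namespace_name = f"kantai-{tenant}"  # hypothetical naming scheme
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": namespace_name, "labels": {"kantai.example/tenant": tenant}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace_name},
        "spec": {
            "hard": {
                "limits.cpu": cpu,
                "limits.memory": memory,
                "requests.storage": storage,
            }
        },
    }
    return [namespace, quota]


manifests = tenant_manifests("team-a", cpu="8", memory="16Gi", storage="100Gi")
```

Each tenant namespace would also receive its own network policies and role bindings, following the same pattern.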


Want this managed for you? pemos.ca runs Kantai on hardened AKS infrastructure with automated backups, monitoring, and support — so you don’t have to manage the platform yourself.

Master Control — Enterprise Pattern

Master Control is the enterprise deployment pattern built on Kantai OSS. It extends the core platform with dedicated principals — specialized command roles that oversee crews of agents.

How OSS Feeds Master Control

  • Gangway portal becomes the bridge UI for each principal
  • Agent pods are organized into crews (Security crew, Solutions crew)
  • Governance controls power compliance scoring and OCO policies
  • Tetraban serves as the enterprise kill switch across all crews
  • OIDC & identity enable principal-level access control

Two Principals

  • 🎖️ Security Principal (CISO’s Bridge) — Officer, Watchkeeper, Quartermaster
  • 🔧 Solutions Principal (CTO’s Bridge) — Captain, Bosun, Helmsman, Botwright
  • 📡 Signals — Cross-principal communication and alerting

Each principal commands a dedicated crew with scoped permissions, dashboards, and workflows — all built on the same Kantai components you self-host today.

Master Control managed service → pemos.ca/master-control — enterprise-grade, fully operated by EOSE.