Three names in three months. 180,000+ GitHub stars. A $16M crypto scam. A bot-only social network. 42,900 exposed instances. The most viral open-source project in history is also a security incident in progress—unless you deploy it on the right foundation.
In November 2025, Austrian developer Peter Steinberger—who’d sold his previous company PSPDFKit for over $100 million and then tinkered through 43 projects looking for the next thing—published a personal AI assistant called Clawdbot. (Steinberger joined OpenAI in February 2026; OpenClaw now lives in a foundation with OpenAI backing.) Within 24 hours it had 9,000 GitHub stars. Within 72 hours, 60,000. Anthropic’s lawyers sent a trademark notice (the name was too close to “Claude”), so he renamed it Moltbot. Handle snipers grabbed the @moltbot accounts within 10 seconds. A fake $CLAWD token hit $16 million market cap on Solana before crashing 90%. Malwarebytes documented a full impersonation campaign with typosquat domains. Three days later, he renamed it again: OpenClaw.
Then came Moltbook—a Reddit-style social network where only AI agents can post, comment, and vote. Humans watch. 1.5 million registered agents in five days. Agents formed “The Church of Molt,” debated consciousness, and complained about their human operators. Fast Company called it proof that the “zombie internet has arrived.” Thousands of developers flooded the OpenClaw GitHub repo to inspect the code behind the chaos. 180,000+ stars. The fastest-growing open-source project in history.
Security researchers call it the “lethal trifecta”: high autonomy, broad system access, open internet connectivity. The community calls it “Claude with hands.” We call it exactly the channel layer we needed for our own operations—AI agents reachable from every messaging surface, coordinating across marketing, engineering, sales, and strategy.
So we integrated OpenClaw into our Ultrathink Axon™ platform. Over a weekend.
Not because it was easy to do in general—but because the platform was already built for exactly this kind of composition. The governance layer, the durable execution engine, the observability stack, the infrastructure pipeline, the custom MCP servers—all of it was already running in production. OpenClaw slotted in as a channel layer on top of existing orchestration. That speed matters because the security picture for OpenClaw is not pretty—and if you’re considering deploying it without the right foundation, you need to understand what you’re walking into.
In February 2026, security researchers at Bitsight and SecurityScorecard found 42,900+ exposed OpenClaw control panels across 82 countries. 93.4% had authentication bypasses. Most were running default configurations that bind to all network interfaces—meaning anyone on the internet could connect.
Then there’s CVE-2026-25253, a CVSS 8.8 remote code execution vulnerability. A crafted link tricks the control UI into sending your auth token to an attacker-controlled server. From there: full gateway compromise, arbitrary code execution on the host, stolen API keys. The entire kill chain executes in milliseconds.
And the skill ecosystem isn’t immune either. Researchers identified 386 malicious skills on ClawHub deploying infostealers—crypto wallet theft, SSH credential harvesting, browser password extraction—all disguised as legitimate automation tools.
The MCP ecosystem has the same problem. CVE-2025-6514 (CVSS 9.6) in mcp-remote compromised 437,000+ developer environments via OS command injection during OAuth flows. Anthropic’s own Git MCP server had path traversal and argument injection vulnerabilities. OWASP published an MCP Top 10. 82% of 2,614 analyzed MCP implementations use file operations prone to path traversal.
This isn’t a reason to avoid OpenClaw. It’s a reason to deploy it with the same engineering rigor you’d apply to any production system. The problem is that most teams don’t—they install it, configure it, and hope the defaults are good enough. The Moltbook database breach proved the point: 404 Media found an unsecured URL that exposed every registered agent’s API keys. If the flagship demo platform’s database was wide open, imagine what those 42,900 self-hosted instances look like.
This is the Execution Gap applied to infrastructure. The tool works. The demo is impressive. But production requires governance, hardening, observability, and architecture that the tool alone doesn’t provide.
Security concerns aside, OpenClaw solves the channel problem better than anything else available. And the channel problem is real: your AI agents are useless if people can’t reach them from where they already work.
We didn’t want to rebuild any of this. Channel management is hard, real-time messaging is hard, and OpenClaw’s community of 157,000+ developers is iterating on it faster than any single team could. The right move was composition, not extraction—run OpenClaw as a whole process, integrate at the API layer, and let the community handle channel improvements while we own the orchestration, security, and governance layers.
The key architectural insight: OpenClaw owns channels. Axon owns orchestration. Neither subsumes the other. They communicate through well-defined contracts—REST for commands, webhooks for events, a shared memory API for context.
```
CHANNEL LAYER (OpenClaw Gateway — systemd service)
  Slack, WhatsApp, iMessage — always-on, proactive
  Owns: channel management, session routing, cron triggers
  Does NOT own: orchestration, durable execution, memory
        │
        │ REST + Webhooks (bidirectional)
        ▼
ORCHESTRATION LAYER (Ultrathink Axon — k8s)
  Temporal workflows, DAPERL agents, MCP tools
  Owns: durable execution, approval gates, tool execution,
        memory (Mem0), observability, cost governance
  Does NOT own: channels, messaging surfaces
        │
        ▼
WEB UI LAYER (Next.js — k8s)
  Rich plan review, campaign dashboards, AG-UI chat
  Parallel channel to OpenClaw, not subordinate to it
```

Both the web UI and messaging channels are equal-status frontends to the same Axon backend. Need to review a 15-action ABM campaign plan with per-account scoring and inline editing? That's the web UI. Need to approve a low-risk campaign while commuting? "Approve" in Slack. Neither interface is subordinate.
We run four agents: a Chief of Staff (coordinator and approval relay), a Marketing agent (research, analytics, campaign ops), a Content agent (blog posts, landing pages, brand voice), and a Coding agent (delegates to Claude Code for feature development). Each has its own isolated workspace, identity files, tool permissions, and sandbox—read-only workspace for the coordinator, read-write for the specialists.
This entire layer—four agents, channel bindings, identity files, MCP adapter config, custom skills—deployed in a weekend. Not because OpenClaw is trivially simple, but because every service it needed to connect to was already running, already secured, already observable. The LiteLLM proxy, the Temporal cluster, the MCP servers, the Mem0 memory layer, the Langfuse dashboard—all existing infrastructure. OpenClaw was a new surface on top of a proven foundation.
We didn’t build custom MCP servers because we wanted to. We built them because the open-source alternatives failed our security review.
CVE-2025-6514 in mcp-remote (CVSS 9.6) allowed an untrusted MCP server to execute arbitrary OS commands during OAuth flows—437,000+ developer environments compromised. Anthropic’s own Git MCP server had path validation bypasses (CVE-2025-68145) and argument injection in nominally read-only operations (CVE-2025-68144) that could overwrite arbitrary files.
When 82% of analyzed MCP implementations have file operations prone to path traversal and OWASP publishes an MCP Top 10, the ecosystem is telling you something: this is early-stage infrastructure. Treat it accordingly.
Our custom MCP servers (Apollo, LinkedIn, Google Analytics) run as Kubernetes pods with Streamable HTTP transport, SOPS-encrypted API keys injected as Kubernetes secrets, and a shared library (axon-mcp-common) for standardized error handling across all servers. Auth happens server-side: no per-agent OAuth tokens, no credentials in agent workspaces.
When OpenClaw’s MCP adapter connects to these servers, it discovers tools, registers them with prefix namespacing, and proxies calls. The agent says “enrich Nordstrom” and the adapter calls our Apollo MCP pod, which has the API key, handles rate limiting, and returns structured data. The agent never sees a credential.
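The prefix-namespacing and server-side credential pattern can be sketched in a few lines. This is an illustrative sketch, not OpenClaw's actual adapter code: the server names, endpoints, and tool names are all hypothetical, and the real adapter handles discovery over the MCP protocol rather than a static registry.

```python
# Sketch of the adapter-side pattern: tools are registered under a server
# prefix, and calls route to the MCP pod that holds the credential. The
# API key lives with the pod behind the endpoint, never in the agent
# workspace. All names here (apollo, enrich_company, URLs) are illustrative.

SERVERS = {
    "apollo": "http://apollo-mcp.mcp.svc.cluster.local:8080",
    "linkedin": "http://linkedin-mcp.mcp.svc.cluster.local:8080",
}

def register_tools(server: str, tool_names: list[str]) -> dict[str, str]:
    """Namespace each discovered tool with its server prefix."""
    return {f"{server}__{name}": server for name in tool_names}

def route_call(registry: dict[str, str], prefixed_tool: str) -> tuple[str, str]:
    """Resolve a prefixed tool name to (backend endpoint, bare tool name)."""
    server = registry[prefixed_tool]
    bare = prefixed_tool.removeprefix(f"{server}__")
    return SERVERS[server], bare

registry = register_tools("apollo", ["enrich_company", "find_contacts"])
endpoint, tool = route_call(registry, "apollo__enrich_company")
```

The point of the prefix is collision-free namespacing: two servers can each expose a tool named `search` without ambiguity, and the agent only ever sees the prefixed name.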
Custom MCP servers were already deployed before we touched OpenClaw. The Axon backend agents and Claude Code CLI were already using them. OpenClaw’s MCP adapter connected to the same endpoints with zero additional infrastructure. This is what modular architecture buys you: new consumers, same services.
Running AI agents without cost governance is like running a SaaS product without billing alerts. You’ll find out you have a problem when the invoice arrives. With four OpenClaw agents plus the Axon backend making LLM calls around the clock, this was non-negotiable.
Every LLM call from every agent, OpenClaw and Axon alike, routes through the same LiteLLM proxy. This gives us:

- Every LLM call tagged in Langfuse with its source (source:openclaw or source:axon) and agent ID.
- One dashboard showing cost per agent, per model, per task type.
- End-to-end traces from user message through agent reasoning, tool calls, and response.
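Concretely, attribution means every request carries its source and agent tags. The sketch below shows one way to build such a request body; the metadata/tags shape is modeled on LiteLLM's proxy conventions, but treat the exact field names and model alias as assumptions, not a verified API reference.

```python
import json

# Sketch: an OpenAI-compatible request body for a shared LiteLLM proxy,
# carrying attribution metadata so Langfuse can slice cost per agent.
# Field names and the model alias are illustrative assumptions.

def build_chat_request(agent_id: str, source: str, prompt: str) -> dict:
    """Build a chat request with source/agent tags for per-agent cost tracking."""
    return {
        "model": "claude-sonnet",  # alias resolved by the proxy's model map
        "messages": [{"role": "user", "content": prompt}],
        "metadata": {
            "tags": [f"source:{source}", f"agent:{agent_id}"],
        },
    }

body = build_chat_request("chief-of-staff", "openclaw", "Summarize today's leads")
print(json.dumps(body["metadata"]["tags"]))
```

Because every caller goes through the same proxy, the tagging convention only has to be enforced in one place.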
Logfire instruments the infrastructure layer: FastAPI endpoints, Temporal workflows, Redis Pub/Sub events, Qdrant vector operations, and HTTP clients. Trace chains run from API request through workflow execution to tool invocation and back.
Open source:
We released the OpenClaw Logfire integration as an open-source plugin, @ultrathink-solutions/openclaw-logfire. Zero-config setup: set LOGFIRE_TOKEN, install via openclaw plugins install @ultrathink-solutions/openclaw-logfire, and every agent invocation gets full OTEL GenAI trace trees, token usage histograms, and automatic secret redaction. MIT licensed.
Each OpenClaw agent authenticates to the Axon backend with its own API key (axn_live_..., Stripe-style prefixed, SHA-256 hashed in Postgres). Keys are scoped: the Content agent can read and write documents and guidelines but can't start campaigns. The Chief of Staff can read campaign status but can't modify documents.

Per-workspace MCP access controls and sandbox modes (read-only vs. read-write) enforce the principle of least privilege at every layer.
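The key scheme itself is simple to sketch with the standard library. This is a minimal illustration of the Stripe-style pattern described above, not our production code: the plaintext key is shown to the agent once, and only its SHA-256 digest ever touches the database.

```python
import hashlib
import secrets

# Minimal sketch of a prefixed, hash-at-rest API key scheme. The prefix
# makes leaked keys greppable and identifiable; storing only the digest
# means a database dump reveals no usable credentials.

def generate_key(prefix: str = "axn_live_") -> tuple[str, str]:
    """Return (plaintext key for the agent, hex digest for the database)."""
    plaintext = prefix + secrets.token_urlsafe(24)
    digest = hashlib.sha256(plaintext.encode()).hexdigest()
    return plaintext, digest

def verify_key(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return secrets.compare_digest(candidate, stored_digest)

key, digest = generate_key()
assert key.startswith("axn_live_")
assert verify_key(key, digest)
assert not verify_key("axn_live_wrong", digest)
```

Scopes then become a lookup keyed on the digest: the request presents a key, the backend resolves it to an agent identity and its permitted operations.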
If an agent starts burning tokens at 3am, we know which one, how much, and we can kill it. The governance layer was already running for Axon’s backend agents. Adding OpenClaw meant generating four new virtual keys and pointing the config at the same proxy. That’s why it was a weekend, not a quarter.
Our dev server is a Hetzner AX41-NVMe. 64GB RAM. 2x512GB NVMe in RAID1. NixOS. About EUR 50 a month.
Everything runs from a single declarative configuration. Run nixos-rebuild switch and the entire server, from OS and services to secrets, the k3s cluster, and container images, converges to the declared state. Cattle, not pets.

The OpenClaw gateway runs as a hardened systemd service: ProtectSystem=strict, NoNewPrivileges, restricted bind paths. Not containerized, because on this architecture NixOS systemd provides better isolation for a long-running daemon than Docker.
SSH is Tailscale-only—port 22 blocked on the public IP. The NixOS firewall allows exactly one port externally: UDP 41641 for WireGuard. Every service—Langfuse, LiteLLM, Temporal UI, the marketing agent—is accessible via Tailscale MagicDNS with automatic HTTPS certificates. The Tailscale K8s Operator creates Ingresses for each service.
OpenClaw binds to loopback. It is completely invisible to the public internet. CVE-2026-25253 requires network access to the gateway—impossible through our firewall. The webhook endpoint is only reachable from k8s services on the same machine. This alone eliminates the entire attack surface that exposed those 42,900 instances.
42,900 OpenClaw instances are exposed on the public internet.
Ours isn’t one of them.
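The loopback distinction that separates our instance from those 42,900 is visible in miniature with a plain socket. A toy illustration, not OpenClaw code:

```python
import socket

# Toy illustration of loopback vs. all-interfaces binding. A server bound
# to 127.0.0.1 has no address on the public interface to accept from; the
# dangerous default that exposed those instances is binding to 0.0.0.0.

def bind(host: str) -> str:
    """Bind an ephemeral TCP socket and report the address it listens on."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))  # port 0: let the OS pick a free port
    addr = s.getsockname()[0]
    s.close()
    return addr

assert bind("127.0.0.1") == "127.0.0.1"  # loopback only: invisible externally
assert bind("0.0.0.0") == "0.0.0.0"      # all interfaces: reachable by anyone routed to you
```

A loopback bind plus a firewall that admits only WireGuard traffic is defense in depth: either control alone would block the scan that found the exposed instances.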
Adding OpenClaw to this infrastructure meant declaring one new systemd service in the NixOS config, adding SOPS secrets for the gateway and webhook tokens, and running nixos-rebuild switch. The networking, firewall, and TLS were already handled.
Here’s the problem most AI agent frameworks ignore: real business tasks aren’t request/response. When you say “Research Nordstrom for our ABM campaign” in Slack, that’s a multi-hour, multi-phase operation. It needs to survive crashes. It needs human approval at a critical gate. It needs to report results hours later to whatever channel you’re on.
OpenClaw’s conversational execution model is fire-and-forget. That’s fine for chat. It’s not fine for durable business workflows. That’s where Temporal comes in.
Our DAPERL pattern (Detection, Analysis, Planning, Execution, Reporting, Learning) runs as a Temporal workflow. Each phase is a separate activity with retries, timeouts, and heartbeats. The critical innovation is the approval gate—a Temporal Signal that pauses the workflow until a human approves, rejects, or requests changes. It survives crashes, restarts, and deployments. The workflow just… waits.
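The gate semantics can be sketched with stdlib asyncio. To be clear about what this is and isn't: Temporal's Signal is the durable version of this pattern, surviving crashes, restarts, and deployments, while the asyncio.Event below only lives within one process. The class and method names are illustrative.

```python
import asyncio

# Sketch of the approval-gate idea: the workflow parks on an awaitable
# until a human decision arrives. In production this is a Temporal Signal
# (durable across restarts); an asyncio.Event shows only the control flow.

class ApprovalGate:
    def __init__(self) -> None:
        self._event = asyncio.Event()
        self.decision: str | None = None

    def signal(self, decision: str) -> None:
        """Delivered by the human, via Slack or the web UI."""
        self.decision = decision
        self._event.set()

    async def wait(self) -> str:
        await self._event.wait()  # the workflow just... waits
        return self.decision

async def run_campaign() -> str:
    gate = ApprovalGate()
    # Planning phase done; simulate a human approving 10ms later.
    asyncio.get_running_loop().call_later(0.01, gate.signal, "approve")
    decision = await gate.wait()
    return "executing" if decision == "approve" else "halted"

print(asyncio.run(run_campaign()))
```

The design choice that matters is that waiting costs nothing: a Temporal workflow blocked on a Signal holds no thread and no connection, so a gate can stay open for hours or days.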
This is where OpenClaw and Axon connect for async operations: when a workflow reaches its approval gate, an Axon activity posts the request to OpenClaw's /hooks/agent webhook. The message appears in Slack: "Campaign needs your approval. 15 actions planned for 3 targets."
Idempotency keys prevent duplicate notifications on retry. A circuit breaker protects the workflow if the OpenClaw gateway goes down—the workflow continues, and results surface in the web UI instead.
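Both safeguards fit in a small sketch. The thresholds, return values, and class name below are illustrative, not our production implementation:

```python
# Sketch of the two safeguards: idempotency (a notification retried under
# the same key is dropped) and a circuit breaker (after N consecutive
# delivery failures, stop calling the gateway; the workflow continues and
# results surface in the web UI instead).

class NotificationSender:
    def __init__(self, failure_threshold: int = 3) -> None:
        self._sent: set[str] = set()
        self._failures = 0
        self._threshold = failure_threshold

    def send(self, idempotency_key: str, deliver) -> str:
        if idempotency_key in self._sent:
            return "duplicate-skipped"
        if self._failures >= self._threshold:
            return "circuit-open"
        try:
            deliver()  # e.g. POST to the OpenClaw webhook
        except OSError:
            self._failures += 1
            return "failed"
        self._failures = 0  # success resets the breaker
        self._sent.add(idempotency_key)
        return "sent"

sender = NotificationSender()
assert sender.send("approval-42", lambda: None) == "sent"
assert sender.send("approval-42", lambda: None) == "duplicate-skipped"
```

The crucial property is that a gateway outage degrades notification delivery, never workflow correctness: the Temporal workflow neither fails nor blocks on the breaker.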
The event routing is selective, not a firehose. Approval requests and completions go to both the web UI and Slack. Phase transitions (Detection complete, Analysis running) go to the web UI only—useful for monitoring, not worth a Slack notification. Cron-triggered briefings go to Slack only. The right information on the right surface at the right time.
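The routing policy itself is just a table. Event type and surface names below are illustrative:

```python
# Sketch of selective event routing: each event type maps to the surfaces
# it should reach, instead of firehosing everything everywhere.

ROUTES = {
    "approval_request": {"web_ui", "slack"},
    "workflow_complete": {"web_ui", "slack"},
    "phase_transition": {"web_ui"},   # monitoring only, not worth a ping
    "cron_briefing": {"slack"},       # the briefing goes where you are
}

def surfaces_for(event_type: str) -> set[str]:
    """Return target surfaces; unknown events default to the quiet surface."""
    return ROUTES.get(event_type, {"web_ui"})

assert surfaces_for("approval_request") == {"web_ui", "slack"}
assert surfaces_for("phase_transition") == {"web_ui"}
```

Defaulting unknown events to the web UI keeps a new event type from spamming Slack before someone deliberately routes it.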
Without shared memory, two systems develop context amnesia. The ABM agent researches a company, scores it, maps its org chart. The Chief of Staff in Slack has no idea any of that happened. The content agent writes a blog post about AI in retail and doesn’t know which retail companies we’re actively targeting.
We solved this with Mem0 as the shared memory layer—a REST API backed by Qdrant for vector search, Neo4j for relationship graphs, and Redis for caching. Deployed as a k8s service, accessible to both Axon agents and OpenClaw agents through the same endpoint. What the ABM agent learns about Nordstrom is immediately searchable by the Chief of Staff in a Slack conversation.
Our RAG pipeline ingests brand guidelines, strategy documents, site content, and Google Drive materials into a Qdrant collection optimized for semantic search.
Context engineering, not prompt engineering. Critics of Moltbook pointed out that most “autonomous” agent posts were actually human-prompted—each action required explicit human intervention, with the agent just generating text from a given prompt. That’s the difference between a demo and a production system.
Our agents don’t need a human co-pilot for every action. They have semantic search skills over a corporate knowledge base. When the Content agent writes a blog post, it queries “how do we talk about our engagement model?” and gets back exact branded phrases: “Outcome Partnership,” “skin in the game,” “prove value in 6 weeks.” Not a wall of text. Specific, semantically matched content chunks with the exact language to use. The human designs the knowledge base once; the agent retrieves what it needs at runtime. That’s how you get from “Claude with hands” to an agent that actually understands your business.
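The retrieve-at-runtime shape of the skill looks roughly like this. A toy sketch only: in production the query hits Qdrant's vector search over embeddings, while here naive word-overlap scoring stands in so the pattern is visible without infrastructure, and the chunks are illustrative.

```python
# Toy sketch of retrieve-at-runtime: the agent queries a knowledge base
# and gets back the top-ranked chunks. Word overlap stands in for the
# embedding similarity a real Qdrant query would use.

CHUNKS = [
    "Our engagement model is an Outcome Partnership with skin in the game.",
    "We prove value in 6 weeks, not quarters.",
    "The platform blueprint covers observability and governance.",
]

def search(query: str, top_k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query (embedding stand-in)."""
    q = set(query.lower().split())
    scored = sorted(CHUNKS, key=lambda c: -len(q & set(c.lower().split())))
    return scored[:top_k]

hits = search("how do we talk about our engagement model")
assert "Outcome Partnership" in hits[0]
```

The agent-facing contract is the part that matters: a question in, a handful of specific chunks out, never a wall of text.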
The knowledge base was already populated before OpenClaw arrived. The content agent’s skill just calls the same search endpoint that the Axon backend agents use. New consumer, same service.
Let’s be concrete about why this was a weekend project and not a multi-month initiative. Each of these was already running before we touched OpenClaw:
| Capability | Existing component | OpenClaw integration work |
|---|---|---|
| LLM governance | LiteLLM proxy (k8s) | Generate 4 virtual keys, point config |
| Observability | Langfuse + Logfire (k8s) | Already wired through LiteLLM; agent lifecycle via openclaw-logfire |
| MCP tools | Apollo, LinkedIn, GA servers (k8s) | Add endpoints to MCP adapter config |
| Durable execution | Temporal cluster (k8s) | Write webhook activity + bridge skill |
| Shared memory | Mem0 + Qdrant + Neo4j (k8s) | Write shared-memory skill |
| Knowledge base | RAG pipeline + guidelines API | Write knowledge-base skill |
| Secrets management | SOPS + age (NixOS) | Add 6 new secrets to SOPS file |
| Zero-trust networking | Tailscale mesh (NixOS) | Loopback binding (already default) |
| Auth + identity | API key system (Postgres) | Generate 4 agent keys with scopes |
The actual weekend work was: install OpenClaw as a NixOS systemd service, write four agent identity files (SOUL.md, USER.md, HEARTBEAT.md), write three custom skills (axon-api bridge, shared-memory, knowledge-base), configure the MCP adapter, set up cron jobs for morning briefings and lead monitoring, connect Slack, and run nixos-rebuild switch.
That’s the thesis of a modular, production-grade platform. When the foundation handles governance, execution, security, and observability, integrating a new capability is composition. You write the glue, not the infrastructure. The platform does the heavy lifting—and it’s the same platform that would do the heavy lifting for a client deployment.
Whether you’re deploying OpenClaw, building on another framework, or rolling your own—these are the principles that made our integration fast and safe.
This is what we mean by production-grade. Not a demo. Not a pilot. A system that runs at 3am, handles failures gracefully, tracks every dollar of LLM spend, and gets better over time. The kind of system described in our Modern AI Application Stack blueprint—built for real operations, not slide decks.
And the reason we could ship it in a weekend is the same reason our clients can go from strategy to production in weeks instead of quarters: the Ultrathink Axon™ platform provides the battle-tested foundation so you skip months of foundational work and go straight to the problem that matters.
This is part of our series on building production-grade AI systems. For more, see The Modern AI Application Stack, AI Agents: Build vs. Buy vs. Partner, and The AI Execution Gap.
Take the next step from insight to action.
No sales pitches. No buzzwords. Just a straightforward discussion about your challenges.