The mental map of OpenClaw

OpenClaw starts from a stubborn observation. Most people are already talking to software all day, just not to AI. Their WhatsApp is running, their Telegram is buzzing, their Slack is open. There they discuss projects, give answers, send photos, lock in appointments. An AI assistant that lives in a separate browser tab and waits there until they show up, is structurally on the outside. It misses the context of what they were just doing, it misses the continuity of an ongoing conversation, it misses the place where the work happens.

01 · CORE INTUITIONA colleague who lives in your channels

The core intuition behind OpenClaw is that a personal AI has to live where the user already lives. Inside his messaging apps, inside his phone call, inside his voice memo. With access to the material he is collecting anyway. On his own machine, so the data does not first travel to a vendor before he can talk about it.

That intuition explains the entire architecture. It explains why there is a separate layer for channels, why there is a central gateway, why workspaces live as folders on disk, why a heartbeat exists so the agent has its own time, why skills are extensible from conversation. Everything that follows in the fieldguide is a working out of one thought: a personal AI should become a colleague who lives in your working environment, not a tool you open.

02 · THREE LAYERSWhy the separation between channel, brain and source

OpenClaw splits itself into three layers: Interfaces, Gateway and Services. At first glance that looks like a standard three-layer architecture. It gets interesting as soon as you ask what would go wrong if you merged one layer into another.

The Interfaces layer exists because people talk to software in different ways. Someone on a phone wants to send voice memos from WhatsApp. Someone at a desk wants to type into the macOS menu bar. Someone over SSH wants a TUI. Without a separate interfaces layer, OpenClaw would have to pick which modality it supports, and exclude all others. By isolating the layer, each channel can have its own ergonomics, while what happens below is identical everywhere.

The Gateway layer exists because otherwise every channel would have to do its own authentication, session management, routing and agent orchestration. Twenty channels would then contain the same logic twenty times, with twenty ways for it to work slightly differently. The Gateway is the frugal choice: one process that conducts everything, one place where the rules live, one door that leads to the outside world.

The Services layer exists because OpenClaw is not an LLM, not a messaging platform, not a browser engine. It uses those things. By treating them as external services, OpenClaw can switch models without reworking itself, switch messaging providers, switch browser implementations. The services layer is the place where OpenClaw says: this is not mine, I only work with it.

The freedom each layer gives to the one above is exactly what makes the architecture durable. The interfaces do not need to know anything about LLM providers, the gateway does not need to know anything about Discord’s gateway protocol, the services do not need to know anything about the user. Each layer thinks in its own language.

CONCEPTUAL DEPENDENCY, NOT MESSAGE FLOW

The arrows show dependency, not the direction a message travels. A message naturally travels both ways. But conceptually the gateway sits above the services and the interfaces above the gateway, in the sense that higher layers may know about lower ones but not the other way around.

03 · PRIMARY ABSTRACTIONSThe words OpenClaw uses to think about itself

OpenClaw has eleven primary abstractions. None of them was chosen for encyclopedic completeness; together they form the minimal vocabulary the system needs to describe itself. Anyone who has these eleven in mind can leaf through the fieldguide and always know where the text is.

Claw

An individual agent with its own personality, memory and channel connections. One gateway can run multiple claws at once.

Metaphor: a household help who knows your stuff, not your neighbour's.

Gateway

The central Node.js process that orchestrates all channels, sessions and agent runs. One gateway per machine; the claws live inside it.

Metaphor: the telephone exchange that ties all lines together.

Daemon

The gateway when it runs as a background service, so you do not have to keep a terminal open. On macOS via launchd, on Linux via systemd, on Mac also via the menu bar app.

Metaphor: the telephone exchange that works at night too, without anyone present.

Workspace

The folder of markdown files that gives a claw its personality and memory. By default under ~/.openclaw/workspace/. Each claw has one workspace.

Metaphor: his brain on disk, readable with a text editor.

Channel

A connection to a messaging platform or a direct interface. One claw can be present on multiple channels at once, and one channel can serve multiple claws.

Metaphor: a line over which conversations can run, regardless of who is on which end.

Session

An ongoing conversation between you and a claw, with memory between turns. Sessions are tied to a combination of channel plus peer; different chats are different sessions.

Metaphor: a running thread to which new turns attach themselves.

Skill

A reusable capability, stored as a folder with a SKILL.md inside. The claw knows when it can call a skill, based on what is written in that markdown.

Metaphor: a craftsman's toolkit where each tool itself tells you what it is for.

Tool

A primitive action the claw can perform: read a file, open a web page, run a bash command. Skills are built from tools.

Metaphor: a screwdriver, not worth much on its own, indispensable together with other tools.

Heartbeat

An internal timer that wakes the claw periodically without you sending anything. This lets it proactively maintain memory, send reminders, review running tasks.

Metaphor: the heartbeat, inaudible when all is well, indispensable for the fact of life.

Soul

The SOUL.md file in the workspace that captures the claw's personality and values. Loaded as system context at the start of every session.

Metaphor: character, reasonably stable, not entirely unchangeable but rarely revised.

Memory

Everything the claw remembers between sessions: short term (MEMORY.md), daily notes, and optionally a vector index for semantic search.

Metaphor: a craftsman's memory, built up through experience, accessible through context.

What sharpens the thinking model: Soul and Memory are deliberately separated. Personality is what the claw IS; memory is what it has lived through. One layer rarely changes, the other grows daily. By modelling them as separate files they can evolve at different paces, and you can reset one without touching the other. That is not a technical detail, it is an underlying view of what a person is.

04 · THE RHYTHM OF LIFEWhy OpenClaw is not a reactive service

A normal API service waits for a request, does its work, then goes idle. Between requests it is effectively non-existent. OpenClaw does this differently. The system has its own rhythm that continues whether or not you send anything.

Why does OpenClaw choose that? Because an agent that only lives when spoken to is not an agent. It is a function. A colleague who only does something when you direct him is an executor. A colleague who continues thinking on his own, maintains his own memories, prepares his own morning, is someone alongside you. That shift from function to someone is what the rhythm of life makes architecturally possible.

Four mechanisms together form that rhythm.

The heartbeat

The heartbeat gives the claw its own time. Every X minutes, within active hours you configure, it wakes up and reads its HEARTBEAT.md. There it finds what it should do in its own time: summarise memory, check reminders, review running projects. If there is nothing, it stays silent. The important thing is not what it does at each heartbeat, the important thing is that it can. That lets an agent carry responsibility for something that is not hanging on you.

Sessions as unit of coherence

The session, not the individual message, is the unit of coherence. That is a choice. Per message would mean every reply starts from zero context and supplies all previous turns as a box of loose sentences to the LLM. Per session means a conversation builds up and the claw can see the whole build at every turn. Sessions are therefore tied to a specific combination of channel and peer: your conversation in Telegram is a different session than your conversation in WhatsApp, even when the same claw talks to the same you.

Hooks as a gate for the outside world

The world goes on without the claw. An email arrives, a Gmail Pub/Sub pings, a webhook fires. Without hooks the claw would only know what you tell it. With hooks the outside world can enter the rhythm of the claw, and the claw can respond as if it were a conversation. Important: these external events are marked as untrusted. The claw knows they come from outside, it treats them with the same caution as a message from a stranger.

Cron as a built in scheduler

Why not outsource cron to an external scheduler? Because an external scheduler does not know which session belongs to it, which skills are relevant, how much context should be available, which model may be used. The claw knows all of that. By modelling cron internally, a scheduled task can have the same richness as a manual conversation. A job can run in a new isolated session or in an existing running thread, with heavier or lighter models, with fallback paths, with announcement policies. An external cron daemon cannot do that.

Heartbeat, sessions, hooks and cron together form not a series of features but a view of what time is for an agent. Time is not a waiting room between requests. Time is material the agent can work with.

05 · WHERE THE MAGIC SITSFour design choices that go beyond the functionally required

A lot of OpenClaw is just good engineering. A few things stand out: choices you would not encounter in a less considered runtime. Those places are where the author wanted something more than just working.

The Soul and Memory split
Personality and experience live in separate files, with different lifecycles. A runtime that had pinched these two together would, at every memory flush, get an implicit personality drift. By separating them, the claw stays itself while it grows. That is not optimization, that is a view of identity.
The heartbeat as architectural element, not as feature
Many systems have a cron job somewhere. OpenClaw has the concept of own time baked in at a low level, with activeHours, target, its own model choice and HEARTBEAT_OK silence. What makes this different: the heartbeat is not meant to prove something. It is meant to exist quietly until it has something to say.
Channel bound state as a safety boundary
What is discussed in WhatsApp stays in WhatsApp. What happens in Telegram does not leak to Slack. That sounds obvious but it is a choice: a runtime that had put everything into one memory would have been easier to build. By choosing per channel peer as session scope, OpenClaw protects the user against unintended context mixing that would occur in other systems.
The pragmatic Docker non-main mode
Sandboxing costs cold start seconds. A dogmatic runtime would have put everything in the sandbox, or nothing. OpenClaw picks a middle path: the main session where you talk to the claw yourself stays fast, spawned subagents (which work with unknown data) automatically go into a container. That is not a compromise, that is a trade off at the right level, based on where the risk actually sits.

What these four points have in common: they answer a question the architecture could have avoided. A less careful system would have left those questions open. OpenClaw answered them, and put the answers into code.

06 · WHERE THE SEAMS SHOWTensions the fieldguide does not entirely hide

An honest mental map also points out where the seams are visible. OpenClaw is a living system; some parts feel finished, others feel as if work is still in progress.

The WhatsApp connection. The ban risk warning in Part 3 is more than a disclaimer; it is an acknowledgement that OpenClaw’s primary channel for many users runs over a reverse engineered protocol that goes against WhatsApp’s terms of service. For the user that means a perpetual tension between I want my AI in WhatsApp and I could lose my number. For OpenClaw itself it means an important use case rests on infrastructural shakiness.

The memory extensions that sit alongside core. Memory wiki, active memory, dreaming, REM backfill, grounded scenes. You read it and you can feel a lot of iteration has happened here. Some parts live in core, others as plugins, others again are introduced as “extension”. The functional value seems present; the mental cleanliness of one memory system is not yet there.

The DM policies that are now centralised. Part 8 mentions that all channels now share one mention policy, “previously each channel had its own rules, which led to confusing differences”. That is a built in acknowledgement of growing pains. It is solved, but the footprint of the old inconsistency will still show up in old configurations.

The fieldguide itself is a snapshot. The introduction says it explicitly: OpenClaw evolves at a high pace, this document describes v2026.5.22, for current documentation see docs.openclaw.ai. That is honest, but it also means a large part of reading the fieldguide becomes a kind of archaeology: what was true on that date, what is different by now?

None of these points is an argument against OpenClaw. They are the places where you feel the system is alive, in a way that may be uncomfortable for the authors to name but that for the reader keeps attention sharp on where it should go.

07 · WHAT STAYS OUTSIDE OPENCLAWThe boundaries the system deliberately draws

OpenClaw does not try to be everything. A few things are deliberately kept out of scope, and those choices explain what it is.

OpenClaw is not an LLM provider. It supports more than a dozen, from Anthropic and OpenAI to local Ollama. But it does not build its own model. By being a provider of providers instead of a model, OpenClaw can evolve along with a market that changes faster than it could ever follow.

OpenClaw is not a messaging platform. It does not bring its own network, its own identity management, its own chat app. It hooks into WhatsApp, Telegram, Discord, Slack and so on. The native iOS and macOS apps are the exception, and they exist because an own interface for direct use turned out to be indispensable. But the collective messaging stays outsider work.

OpenClaw is not a browser engine. Playwright does the heavy lifting, OpenClaw gives the claw the tools to drive Playwright. The ARIA snapshots, the hot reloading of profiles, the SSRF protection, that is on top of a browser that already exists.

OpenClaw is not a marketplace for skills. There is ClawHub as a public registry, but OpenClaw itself does not interfere with who builds or distributes skills. The skill policy validation sits in the gateway, the commercial ecosystem lies around it.

OpenClaw is not orchestration for cloud deployments. It runs on your machine, optionally on a server instead of your laptop. But Kubernetes, multi tenant deployments, enterprise SSO; that is a different game. OpenClaw says: I am personal. What is scalable here is what one person can do with it, not how many people together.

Drawing the lines around what OpenClaw is not gives clarity about what it is: a personal, local, model independent runtime that lives inside your existing digital world, not around it.

08 · THE MENTAL ANCHORFive sentences to carry with you

What you would keep if all the ideas unfolded above shrank into five sentences. Not a summary; five anchors that will keep returning in the fieldguide, and that keep it small along the way.

A claw is not a question and answer tool, it is a colleague that lives in your existing conversations.
The gateway is one Node.js process that conducts everything; the three layers above and below are opt in and swappable.
Personality and memory live as markdown files in a workspace you own, and are deliberately separated.
The heartbeat gives the agent its own time, without it the agent would be nothing more than a function call with language ability.
Running locally is not just a privacy claim, it is an architectural choice that explains why everything around it works the way it does.

Anyone who carries these five sentences can reduce every page of the fieldguide to something he already knows. That is what a mental map should do.