I thought I was debugging a flaky UI.
Turns out I was debugging physics.
A gateway is a universe. And I had accidentally spun up two universes on the same Mac.
The symptom: everything looked like “first time”
At some point, the OpenClaw dashboard started acting like it had never seen me before:
- Sessions missing
- Channels “not linked”
- Tokens mismatching
- A WhatsApp client that had credentials… but only in one place
It felt like the gateway was forgetting.
It wasn’t forgetting.
I was connecting to a different state directory.
The root cause: OPENCLAW_STATE_DIR creates a second brain
OpenClaw stores its identity, configuration, sessions, credentials, and workspaces under a state directory.
On macOS you can easily end up with:
- ~/.openclaw (canonical)
- ~/.openclaw-agentstack (a second state dir created by a bootstrap script)
Both “work.” But they are not the same gateway.
If you authenticate WhatsApp in one and then start the service using the other, you didn’t lose credentials — you started a different gateway.
The fix was not a magical reset. The fix was an architecture decision:
One gateway identity. One canonical state dir. Everything else is a workspace overlay.
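To make the "two universes" problem visible, here is a minimal Python sketch that reports which state dir a given environment points at and which other candidates exist under the home directory. It rests on assumptions: that an unset OPENCLAW_STATE_DIR falls back to ~/.openclaw, and that stray state dirs share the .openclaw name prefix. The function names are mine, not part of OpenClaw.

```python
import os
from pathlib import Path

CANONICAL_NAME = ".openclaw"  # canonical state dir name used in this post


def effective_state_dir(env, home):
    """State dir a process with this environment would use.

    Assumption: an unset or empty OPENCLAW_STATE_DIR falls back to ~/.openclaw.
    """
    override = env.get("OPENCLAW_STATE_DIR")
    if override:
        return Path(override).expanduser()
    return Path(home) / CANONICAL_NAME


def candidate_state_dirs(home):
    """Every directory directly under home whose name starts with .openclaw."""
    return sorted(p for p in Path(home).glob(".openclaw*") if p.is_dir())


def report(env, home):
    """Summarize the active state dir and any other candidates next to it."""
    active = effective_state_dir(env, home)
    others = [p for p in candidate_state_dirs(home) if p != active]
    return {"active": active, "others": others, "forked": bool(others)}


if __name__ == "__main__":
    info = report(os.environ, Path.home())
    print("active state dir:", info["active"])
    for path in info["others"]:
        print("other state dir: ", path)
```

Run it once from your interactive shell and once from the environment the service uses. If the "active" line differs between the two, you are looking at two gateways.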
macOS gotcha: “env: node: No such file or directory” over SSH
Another sharp edge: the OpenClaw CLI uses a Node shebang.
In a normal interactive shell, Homebrew’s node is on your PATH.
In a non-interactive SSH command, it often isn’t, because the shell skips the profile files that normally add Homebrew to PATH.
So the CLI dies with:
env: node: No such file or directory
That’s not an OpenClaw bug. It’s a “your PATH isn’t loaded” bug.
The practical rule:
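- When running OpenClaw over SSH on macOS, always prefix the command with a PATH that includes Homebrew’s bin directory (/opt/homebrew/bin on Apple Silicon, /usr/local/bin on Intel).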
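As a rough illustration of that rule, the Python sketch below builds an ssh argument list whose remote command exports a Homebrew-aware PATH before running anything else. The host alias "my-mac" and the helper names are hypothetical. I use node --version as the remote command because it only checks that node resolves; swap in whatever CLI invocation you actually need.

```python
import shlex

# Homebrew's default bin directories: Apple Silicon first, then Intel.
HOMEBREW_BIN_DIRS = ["/opt/homebrew/bin", "/usr/local/bin"]


def remote_command(args):
    """Build a remote shell line that exports a Homebrew-aware PATH first."""
    prefix = ":".join(HOMEBREW_BIN_DIRS)
    # $PATH sits inside double quotes so the remote shell expands it,
    # while the command's own arguments are quoted by shlex.join.
    return f'export PATH="{prefix}:$PATH"; {shlex.join(args)}'


def ssh_argv(host, args):
    """Argument list suitable for subprocess.run; host is any SSH destination."""
    return ["ssh", host, remote_command(args)]


if __name__ == "__main__":
    # "my-mac" is a placeholder SSH alias, not a real host.
    print(ssh_argv("my-mac", ["node", "--version"]))
```

Exporting PATH, rather than setting it for a single command, means the shebang lookup inside the CLI script sees the same PATH.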
The architecture that actually stays stable
We standardized on what I’ll call Architecture B:
- Canonical gateway state directory: ~/.openclaw
- The macOS LaunchAgent runs that gateway
- “AgentStack” becomes a workspace / role layout under the canonical state dir
That means:
- You can have many roles
- You can have many workspaces
- You can run many onboarding flows
…but you never fork your gateway identity by accident.
What I’d do again (and what I’d never do again)
I would do again
- Keep the Control UI bound to localhost
- Use an SSH tunnel for remote administration when needed
- Treat the state dir as a first-class deployment element
I would never do again
- Let bootstrap scripts silently pick a different state dir
- Mix interactive shells and daemon environments without verifying env vars
- “Just try relinking WhatsApp again” in a loop (that’s how you get rate-limited)
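One way to verify env vars between a shell and a daemon is to read the LaunchAgent's plist and compare it against the current environment. The sketch below does that with Python's plistlib. EnvironmentVariables is a standard launchd key, but whether your OpenClaw LaunchAgent sets OPENCLAW_STATE_DIR there is an assumption, and the plist path is something you have to supply yourself.

```python
import os
import plistlib
import sys


def launch_agent_env(plist_path):
    """Return the EnvironmentVariables dict from a launchd plist (empty if absent)."""
    with open(plist_path, "rb") as fh:
        job = plistlib.load(fh)
    return dict(job.get("EnvironmentVariables", {}))


def env_mismatches(shell_env, daemon_env, keys=("OPENCLAW_STATE_DIR", "PATH")):
    """Map each differing key to a (shell, daemon) pair; None means unset."""
    return {
        key: (shell_env.get(key), daemon_env.get(key))
        for key in keys
        if shell_env.get(key) != daemon_env.get(key)
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_env.py <path-to-your-launch-agent.plist>
    diffs = env_mismatches(os.environ, launch_agent_env(sys.argv[1]))
    for key, (shell_value, daemon_value) in diffs.items():
        print(f"{key}: shell={shell_value!r} daemon={daemon_value!r}")
```

A PATH difference is expected and mostly harmless. A difference in OPENCLAW_STATE_DIR is the one that tells you the daemon and your shell are talking to different gateways.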
A final note (the boring part that saves you)
If you learn one thing from this: write down what your gateway is.
- Which state dir?
- Which port?
- Which process owns it (LaunchAgent vs manual run)?
Most “flaky agents” are really “two brains.”
And you don’t fix that with more retries.
You fix it with one universe.
If you’re setting up OpenClaw on macOS today, I’ll publish the exact Day‑0 checklist and a repair script that migrates legacy installs without touching messaging credentials.
(And yes: there’s an owl involved.)