Agentic Operating System Recommendation: OpenClaw + Second Brain/gbrain + Hermes Pilot
P3 - Low
Recommendation to keep OpenClaw as always-on gateway, upgrade memory with gbrain-style synthesis, pilot Hermes as self-improving execution backend, and use Claude/Codex as worker runtimes.
Agentic Operating System Recommendation
Generated: 2026-05-24
Executive Recommendation
Do not do a full migration off OpenClaw right now. Keep OpenClaw as the always-on messaging gateway, project/session router, and human interface. Add two upgrades around it:
- A gbrain-style permanent memory/synthesis layer for company knowledge, project context, decisions, and research.
- A Hermes pilot as a sandboxed self-improving execution engine for repeatable workflows, not as the primary operating system yet.
Use Claude Code and Codex native environments as specialist worker runtimes for software engineering, code review, QA, and security, but do not make either one the whole company operating system. They are excellent workbenches; they are not the best always-on multi-channel chief-of-staff layer.
The target architecture should be:
Henry -> Discord/Telegram -> OpenClaw Gateway -> Bob as orchestrator -> project-scoped PM agents -> specialist execution workers -> Second Brain/gbrain-style durable memory -> Mission Control task board
Hermes should sit beside this as a learning execution backend:
OpenClaw routes selected repeatable tasks -> Hermes executes/learns/updates skills -> result + lessons saved back to Second Brain.
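A minimal sketch of that routing decision, assuming a hypothetical `Task` record and a hand-maintained allow-list of repeatable workflows. None of these names come from the OpenClaw or Hermes APIs; they only illustrate the rule that Hermes receives selected repeatable work and everything else follows the default path through Bob.

```python
from dataclasses import dataclass

# Hypothetical allow-list of repeatable workflows eligible for the Hermes pilot.
HERMES_WORKFLOWS = {"research_synthesis", "qa_regression", "seo_content_checklist"}


@dataclass(frozen=True)
class Task:
    project_id: str   # illustrative identifier, e.g. "medschools"
    workflow: str     # name of the workflow the task belongs to
    repeatable: bool  # True when the task follows an existing playbook


def choose_backend(task: Task) -> str:
    """Return which execution path a task takes after the gateway.

    Only repeatable tasks on the allow-list go to Hermes; everything else
    stays on the default path through Bob and the project PM agents.
    """
    if task.repeatable and task.workflow in HERMES_WORKFLOWS:
        return "hermes"
    return "bob"
```

Keeping the allow-list explicit means Hermes never picks up a task by accident while it is still in pilot.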
Why
Henry's actual goal is not "better chat with AI." The goal is to run a one-person startup studio capable of launching several products and building toward $100M enterprise value in five years. That needs:
- Always-on access from the channels Henry actually uses.
- Project isolation so MedSchools.ai, Hedge, WiderWings, CollegeDojo, and real estate never contaminate context or infrastructure.
- Durable company memory, not fragile chat transcripts.
- Autonomous delegation with review gates.
- A repeatable operating cadence: think, plan, build, review, test, ship, reflect.
- Security boundaries around tools, databases, browser access, and secrets.
- Measurable agent performance.
OpenClaw is already closest to the always-on company OS. Hermes is promising for self-improvement. Claude Code/Codex are strongest as coding/runtime workers. gstack/gbrain provide the missing discipline and memory pattern.
Findings
gstack
Garry Tan's gstack is best understood as process discipline, not a platform. Its value is the pipeline: product thinking -> planning -> engineering review -> build -> review -> QA -> ship -> retro. The important pattern is that every step produces an artifact the next step reads.
This should be copied into our operating process; we have already started doing that in AGENTS.md.
Source: https://github.com/garrytan/gstack
gbrain
GBrain is the more important idea for our company setup. It turns memory from "search found pages" into "synthesized answer with citations, freshness checks, and gap analysis." It can run locally with PGLite, at scale with Postgres/Supabase, and expose tools over MCP.
This maps directly to our Second Brain. We do not necessarily need to replace Second Brain with gbrain. We need to make Second Brain behave more like gbrain:
- Every important answer has citations back to source memories/docs.
- Every answer includes "what I do not know yet."
- Every project has isolated memory namespaces.
- Agents save research, decisions, specs, and lessons automatically.
- Repeated workflows become skills/playbooks.
Source: https://github.com/garrytan/gbrain
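The behaviors listed above can be sketched as a small answer-assembly step. This is an illustration only, not gbrain's or Second Brain's actual API: the `Memory` record, the `build_answer` function, and the 90-day freshness window are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Memory:
    memory_id: str   # hypothetical ID used as the citation target
    project_id: str  # namespace the memory belongs to
    summary: str
    updated: date


def build_answer(memories, project_id, today, max_age_days=90):
    """Collect cited, in-namespace memories and surface gaps instead of hiding them."""
    # Namespace isolation: ignore anything from another project.
    in_scope = [m for m in memories if m.project_id == project_id]
    cutoff = today - timedelta(days=max_age_days)
    fresh = [m for m in in_scope if m.updated >= cutoff]
    stale = [m for m in in_scope if m.updated < cutoff]

    # Freshness check: stale memories become explicit gaps, not silent claims.
    gaps = [f"{m.memory_id} is older than {max_age_days} days; re-verify" for m in stale]
    if not in_scope:
        gaps.append("no memories found for this project")

    return {
        "claims": [(m.summary, m.memory_id) for m in fresh],  # claim paired with its citation
        "gaps": gaps,  # "what I do not know yet"
    }
```

The point of the shape is that a caller cannot receive a claim without its citation, and always receives a gaps list, even if it is empty.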
OpenClaw
OpenClaw is strongest as the self-hosted multi-channel gateway and agent control plane. Official docs describe it as a gateway across Discord, Telegram, Slack, Signal, WhatsApp, iMessage, etc., with sessions, memory, tools, and multi-agent routing. Its GitHub README highlights local-first gateway, multi-channel inbox, multi-agent routing, first-class tools, cron, sessions, and sandboxing options.
For Henry, this matters because Discord/Telegram access and always-on heartbeats are part of the actual workflow. Claude Code alone does not replace that.
Weaknesses: configuration complexity, security risk if exposed or over-permissioned, plaintext secret hygiene risk, and less native self-improvement than Hermes.
Sources:
- https://docs.openclaw.ai/
- https://github.com/openclaw/openclaw
Hermes
Hermes is attractive because its core pitch is the built-in learning loop: it creates skills from experience, improves them during use, searches past conversations, persists knowledge, and can migrate OpenClaw settings/memory/skills. Current GitHub docs also describe messaging gateway support, cron, memory, MCP, tools, and OpenClaw migration.
The best use is a pilot for repetitive operations:
- recurring research workflows
- SEO/content production flows
- lead research
- QA regression playbooks
- deployment runbooks
- project health checks
I would not put Hermes in charge of the whole company yet. It is younger and the "self-improving" loop needs evals and review gates before we trust it on production business operations.
Sources:
- https://github.com/NousResearch/hermes-agent
- https://hermes-agent.app/en/docs
Claude Code native
Claude Code is very strong for coding work. According to the official docs, it now supports custom subagents, background execution, scoped MCP servers, skills preloading, hooks, and persistent memory directories. Agent teams exist for independent Claude Code sessions that can communicate, but they are explicitly a different and more experimental layer than simple subagents.
This makes Claude Code excellent for engineering execution and review, but it is not the best single operating layer for Bob because it is terminal/project-session centered rather than multi-channel, always-on, company-memory centered.
Sources:
- https://code.claude.com/docs/en/sub-agents
- https://code.claude.com/docs/en/features-overview
- https://support.claude.com/en/articles/14553413-claude-code-cheatsheet
Codex/OpenAI native
Codex should be part of the worker bench, especially for OpenAI model fallback, code security review, and long-running coding tasks. It is not currently the best primary company OS either. The right posture is provider-neutral: keep Bob and the company memory above any one model vendor, then use Claude, Codex, and other models where they are strongest.
Source: https://openai.com/index/unrolling-the-codex-agent-loop/
Recommendation By Decision
Stay on OpenClaw?
Yes, for the next 60-90 days. OpenClaw remains the best fit for:
- Discord/Telegram-first workflow
- always-on heartbeats
- multi-agent routing
- project workspaces
- local/self-hosted control
- current setup continuity
Do not make a risky platform migration while MedSchools.ai is still the primary business launch.
Move to Hermes?
Pilot it, do not fully migrate yet.
Success criteria for a Hermes pilot:
- Can import or mirror current OpenClaw memory/workspace safely.
- Can execute 3 repeatable workflows with less Henry involvement than current Bob flow.
- Saves every artifact back to Second Brain.
- Does not weaken project isolation.
- Passes security/secret handling review.
- Produces measurable time/cost/quality improvement after two weeks.
If it passes, promote Hermes from "lab worker" to "execution engine" behind OpenClaw. If it fails, keep the learnings and stay on OpenClaw.
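One way to keep the promote-or-stay decision mechanical is to record the six criteria as a scorecard. A hedged sketch follows; the `PilotResult` fields and the `pilot_verdict` function are hypothetical names that mirror the bullets above, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class PilotResult:
    memory_mirrored_safely: bool
    workflows_with_less_involvement: int  # workflows needing less of Henry than the Bob flow
    all_artifacts_saved: bool             # every artifact saved back to Second Brain
    isolation_preserved: bool
    security_review_passed: bool
    measurable_improvement: bool          # time/cost/quality after two weeks


def pilot_verdict(result: PilotResult, required_workflows: int = 3) -> str:
    """Return 'promote' only when every success criterion is met."""
    checks = [
        result.memory_mirrored_safely,
        result.workflows_with_less_involvement >= required_workflows,
        result.all_artifacts_saved,
        result.isolation_preserved,
        result.security_review_passed,
        result.measurable_improvement,
    ]
    return "promote" if all(checks) else "stay on OpenClaw"
```

Treating the criteria as all-or-nothing matches the text: a single failed criterion keeps Hermes in the lab.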
Move to Claude native?
No as the primary OS. Yes as a core engineering runtime.
Use Claude Code native subagents and agent teams for:
- codebase exploration
- implementation
- review
- QA
- security audits
- isolated research
But keep Bob's operating layer outside Claude Code so the company is not trapped inside a terminal session model.
Adopt gstack/gbrain?
Yes, selectively.
Adopt gstack's pipeline as operating discipline.
Adopt gbrain's memory/synthesis behavior.
Do not blindly replace our existing Second Brain until we compare schemas and migration risk.
Target Operating Model
Layer 1: Channels
Discord remains the primary control channel. Telegram can remain as backup/mobile. Later add Gmail, Calendar, GitHub, Slack/Teams only when the workflows are mature.
Layer 2: Gateway
OpenClaw stays the gateway. It handles routing, channel policy, group chat behavior, message formatting, sessions, heartbeat, cron, and tool exposure.
Layer 3: Orchestrator
Bob remains the chief-of-staff agent. Bob should not personally execute deep work. Bob should:
- clarify goals
- break work into tasks
- assign agents
- enforce project boundaries
- review outputs
- escalate decisions to Henry
- save decisions/research to Second Brain
Layer 4: Project PM Agents
Each product gets a project PM with its own workspace, memory, and infrastructure registry:
- Kevin: MedSchools.ai
- Liz: Hedge
- Future: CollegeDojo PM
- Future: Real Estate PM
- Future: WiderWings platform PM
PMs coordinate specialists but do not share project env vars or Supabase projects.
Layer 5: Specialist Workers
The specialists are Kai, Atlas, Mia, Sage, Lux, QA, and Growth. Each should have project-scoped worktrees and explicit access boundaries.
Layer 6: Memory
Second Brain becomes the source of truth:
- memories by type: decision, research, spec, lesson, conversation, task
- project_id required
- citations/source links required for research
- "unknowns/gaps" required for strategic answers
- automatic save on completion
Add a gbrain-like MCP/query layer if it outperforms the current API search.
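The required fields above can be enforced at save time. The sketch below is an assumption about how that might look, not the existing Second Brain schema: the `MemoryRecord` class, its field names, and the `strategic` flag are hypothetical.

```python
from dataclasses import dataclass, field

# Memory types taken from the list above.
MEMORY_TYPES = {"decision", "research", "spec", "lesson", "conversation", "task"}


@dataclass
class MemoryRecord:
    project_id: str
    type: str
    body: str
    sources: list = field(default_factory=list)  # citation or source links
    gaps: list = field(default_factory=list)     # known unknowns
    strategic: bool = False                      # hypothetical flag for strategic answers


def validate(record: MemoryRecord) -> list:
    """Return validation errors; an empty list means the record may be saved."""
    errors = []
    if not record.project_id:
        errors.append("project_id is required")
    if record.type not in MEMORY_TYPES:
        errors.append(f"unknown memory type: {record.type}")
    if record.type == "research" and not record.sources:
        errors.append("research requires source links")
    if record.strategic and not record.gaps:
        errors.append("strategic answers require unknowns/gaps")
    return errors
```

Running `validate` inside the automatic save-on-completion step would reject incomplete records before they reach the store.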
Layer 7: Governance
Mission Control becomes the dashboard for:
- task status
- agent ownership
- blockers
- costs
- quality gates
- weekly review
- business KPIs
Context Separation Rules
These rules are non-negotiable:
- One project, one workspace.
- One project, one Supabase project.
- One project, one .env boundary.
- Project PMs can read their own project context only by default.
- Bob can coordinate across projects but should not paste one project's private details into another project channel.
- Cross-project shared infrastructure lives under WiderWings only after explicit architecture approval.
Implementation:
- Add a PROJECTS.md check as a required preflight for any db/API/infrastructure task.
- Add project-scoped AGENTS.md/CLAUDE.md files in every repo.
- Add .env.example only, never real env copying.
- Configure agent tool permissions by project.
- Add an automated "wrong Supabase ID" guard to deployment/check scripts.
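A minimal sketch of the "wrong Supabase ID" guard, assuming a registry that maps each project to its single allowed Supabase project ref. The registry contents, the `PreflightError` class, and the `check_supabase_ref` function are placeholders for illustration; in practice the mapping might be parsed from PROJECTS.md.

```python
# Hypothetical registry: one project, one Supabase project ref (placeholder values).
PROJECT_REGISTRY = {
    "medschools": "ref-medschools-placeholder",
    "hedge": "ref-hedge-placeholder",
}


class PreflightError(Exception):
    """Raised when a task targets infrastructure outside its project boundary."""


def check_supabase_ref(project: str, configured_ref: str) -> None:
    """Fail loudly if the configured Supabase ref does not belong to the project."""
    expected = PROJECT_REGISTRY.get(project)
    if expected is None:
        raise PreflightError(f"project {project!r} is not in the registry")
    if configured_ref != expected:
        raise PreflightError(
            f"wrong Supabase ref for {project!r}: "
            f"got {configured_ref!r}, expected {expected!r}"
        )
```

A deployment or check script could call this with the ref read from its own environment and exit non-zero on `PreflightError`, so a mismatched project never reaches the database step.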
30-Day Implementation Plan
Week 1: Stabilize Current Setup
- Audit OpenClaw config, agents, channels, and model routing.
- Rotate any channel/API tokens exposed in plaintext and move secrets into a safer secret store or env-backed config.
- Lock down group/DM policies and sandbox non-main sessions.
- Create project-isolation preflight checklist.
- Create a "Bob delegation contract" so deep work routes to agents by default.
Week 2: Memory Upgrade
- Define Second Brain memory schema v2 with required project_id, source links, artifact paths, confidence, and gaps.
- Add save/retrieve commands for Bob and every agent.
- Add "recall before answer" behavior for prior work and decisions.
- Evaluate gbrain locally against our existing Second Brain on 10 real queries.
Week 3: Agent Operating Cadence
- Create project PM workspaces for MedSchools.ai and Hedge with clear permissions.
- Add gstack-style workflow docs as reusable skills/checklists.
- Add review/QA/security gates to Mission Control.
- Define weekly CEO briefing output.
Week 4: Hermes Pilot
- Install Hermes in a sandbox, not production.
- Connect it to a copy or limited view of memory.
- Test 3 workflows:
  - research synthesis -> saved to Second Brain
  - QA/Playwright regression -> report only
  - SEO/content ops checklist -> draft artifact only
- Compare quality, speed, cost, and failure modes against OpenClaw/Codex/Claude workers.
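The comparison in the last step is easier to trust if each run is logged in the same shape for every backend. A rough sketch, assuming each run is a dict with hypothetical keys `backend`, `quality` (a reviewer score), `seconds`, `cost_usd`, and `failed`:

```python
from collections import defaultdict
from statistics import mean


def summarize(runs):
    """Aggregate logged pilot runs into per-backend quality, speed, cost, and failure rate."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["backend"]].append(run)

    summary = {}
    for backend, items in grouped.items():
        summary[backend] = {
            "runs": len(items),
            "mean_quality": mean(r["quality"] for r in items),
            "mean_seconds": mean(r["seconds"] for r in items),
            "total_cost_usd": sum(r["cost_usd"] for r in items),
            "failure_rate": sum(1 for r in items if r["failed"]) / len(items),
        }
    return summary
```

Scoring the same three workflows on every backend keeps the numbers comparable; the quality score itself still needs a human or eval rubric behind it.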
My Call
The winning setup is hybrid:
- OpenClaw = company gateway and always-on interface
- Second Brain/gbrain-style layer = durable memory and synthesis
- Bob = orchestrator and decision filter
- Project PM agents = context separation and ownership
- Claude Code/Codex = engineering workbenches
- Hermes = experimental self-improving execution backend
- Mission Control = visibility, governance, and task state
Do not chase the newest platform. The bottleneck is not only model capability. The bottleneck is operating discipline, memory, delegation, permissions, and review gates. We can fix those inside the current setup faster than migrating everything.
Sources
- Garry Tan gstack: https://github.com/garrytan/gstack
- Garry Tan gbrain: https://github.com/garrytan/gbrain
- OpenClaw docs: https://docs.openclaw.ai/
- OpenClaw GitHub: https://github.com/openclaw/openclaw
- Hermes Agent GitHub: https://github.com/NousResearch/hermes-agent
- Hermes docs hub: https://hermes-agent.app/en/docs
- Claude Code subagents: https://code.claude.com/docs/en/sub-agents
- Claude Code feature overview: https://code.claude.com/docs/en/features-overview
- Claude Code cheatsheet: https://support.claude.com/en/articles/14553413-claude-code-cheatsheet
- OpenAI Codex agent loop: https://openai.com/index/unrolling-the-codex-agent-loop/
- WildClawBench paper: https://arxiv.org/abs/2605.10912
Created: Sun, May 24, 2026, 7:19 PM by bob
Updated: Sun, May 24, 2026, 7:19 PM
Last accessed: Wed, Jun 3, 2026, 12:28 PM
ID: c89183e2-f839-41e6-bf5f-93cf89587283