Agentic Operating System Recommendation

Generated: 2026-05-24

Executive Recommendation

Do not do a full migration off OpenClaw right now. Keep OpenClaw as the always-on messaging gateway, project/session router, and human interface. Add two upgrades around it:

A gbrain-style permanent memory/synthesis layer for company knowledge, project context, decisions, and research.
A Hermes pilot as a sandboxed self-improving execution engine for repeatable workflows, not as the primary operating system yet.

Use Claude Code and Codex native environments as specialist worker runtimes for software engineering, code review, QA, and security, but do not make either one the whole company operating system. They are excellent workbenches; they are not the best always-on multi-channel chief-of-staff layer.

The target architecture should be:

Henry -> Discord/Telegram -> OpenClaw Gateway -> Bob as orchestrator -> project-scoped PM agents -> specialist execution workers -> Second Brain/gbrain-style durable memory -> Mission Control task board

Hermes should sit beside this as a learning execution backend:

OpenClaw routes selected repeatable tasks -> Hermes executes/learns/updates skills -> result + lessons saved back to Second Brain.

Why

Henry's actual goal is not "better chat with AI." The goal is to run a one-person startup studio capable of launching several products and building toward $100M enterprise value in five years. That needs:

Always-on access from the channels Henry actually uses.
Project isolation so MedSchools.ai, Hedge, WiderWings, CollegeDojo, and real estate never contaminate context or infrastructure.
Durable company memory, not fragile chat transcripts.
Autonomous delegation with review gates.
A repeatable operating cadence: think, plan, build, review, test, ship, reflect.
Security boundaries around tools, databases, browser access, and secrets.
Measurable agent performance.

OpenClaw is already closest to the always-on company OS. Hermes is promising for self-improvement. Claude Code/Codex are strongest as coding/runtime workers. gstack/gbrain provide the missing discipline and memory pattern.

Findings

gstack

Garry Tan's gstack is best understood as process discipline, not a platform. Its value is the pipeline: product thinking -> planning -> engineering review -> build -> review -> QA -> ship -> retro. The important pattern is that every step produces an artifact the next step reads.

This should be copied into our operating process, and we already started doing that in AGENTS.md.
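
As a rough illustration of the "every step produces an artifact the next step reads" pattern, here is a minimal sketch. The stage names come from the pipeline above; the Artifact type, the handler signature, and run_pipeline are hypothetical names of my own, not part of gstack.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Stage names mirror the pipeline named above; all other names are hypothetical.
STAGES = [
    "product_thinking", "planning", "engineering_review",
    "build", "review", "qa", "ship", "retro",
]


@dataclass
class Artifact:
    stage: str
    content: str


# A handler receives the previous stage's artifact (None for the first stage)
# and returns the text of its own artifact.
Handler = Callable[[Optional[Artifact]], str]


def run_pipeline(handlers: Dict[str, Handler]) -> List[Artifact]:
    """Run the stages in order, passing each one the artifact before it."""
    artifacts: List[Artifact] = []
    previous: Optional[Artifact] = None
    for stage in STAGES:
        content = handlers[stage](previous)
        if not content.strip():
            # A stage that writes nothing breaks the chain, so fail loudly.
            raise ValueError(f"stage {stage!r} produced no artifact")
        previous = Artifact(stage=stage, content=content)
        artifacts.append(previous)
    return artifacts
```

The design choice worth copying is the hard failure on an empty artifact: a skipped step becomes visible immediately instead of silently weakening the next one.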

Source: https://github.com/garrytan/gstack

gbrain

GBrain is the more important idea for our company setup. It turns memory from "search found pages" into "synthesized answer with citations, freshness checks, and gap analysis." It can run locally with PGLite, at scale with Postgres/Supabase, and expose tools over MCP.

This maps directly to our Second Brain. We do not necessarily need to replace Second Brain with gbrain. We need to make Second Brain behave more like gbrain:

Every important answer has citations back to source memories/docs.
Every answer includes "what I do not know yet."
Every project has isolated memory namespaces.
Agents save research, decisions, specs, and lessons automatically.
Repeated workflows become skills/playbooks.
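
To make the first three behaviors concrete, here is a toy sketch of a namespaced store whose answers carry citations and an explicit list of unknowns. Everything in it is an assumption for illustration: the class and field names are hypothetical, and the keyword match is a stand-in for whatever retrieval Second Brain or gbrain actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Memory:
    source: str  # link or path the answer can cite (assumed unique)
    text: str


@dataclass
class Answer:
    project_id: str
    findings: List[str]
    citations: List[str]
    unknowns: List[str]  # the "what I do not know yet" section


class NamespacedMemory:
    """Toy in-memory store with one isolated namespace per project."""

    def __init__(self) -> None:
        self._by_project: Dict[str, List[Memory]] = {}

    def save(self, project_id: str, memory: Memory) -> None:
        self._by_project.setdefault(project_id, []).append(memory)

    def ask(self, project_id: str, terms: List[str]) -> Answer:
        # Only this project's namespace is searched; other projects are invisible.
        memories = self._by_project.get(project_id, [])
        findings: List[str] = []
        citations: List[str] = []
        unknowns: List[str] = []
        for term in terms:
            hits = [m for m in memories if term.lower() in m.text.lower()]
            if not hits:
                unknowns.append(f"no saved memory mentions {term!r}")
            for m in hits:
                if m.source not in citations:  # cite each source once
                    findings.append(m.text)
                    citations.append(m.source)
        return Answer(project_id, findings, citations, unknowns)
```

Asking the MedSchools.ai namespace about a topic that only Hedge has saved returns no findings and one unknown, which is the isolation behavior we want.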

Source: https://github.com/garrytan/gbrain

OpenClaw

OpenClaw is strongest as the self-hosted multi-channel gateway and agent control plane. Official docs describe it as a gateway across Discord, Telegram, Slack, Signal, WhatsApp, iMessage, etc., with sessions, memory, tools, and multi-agent routing. Its GitHub README highlights a local-first gateway, a multi-channel inbox, multi-agent routing, first-class tools, cron, sessions, and sandboxing options.

For Henry, this matters because Discord/Telegram access and always-on heartbeats are part of the actual workflow. Claude Code alone does not replace that.

Weaknesses: configuration complexity, security risk if the gateway is exposed or over-permissioned, risk of secrets sitting in plaintext config, and less native self-improvement than Hermes.

Sources:
https://docs.openclaw.ai/
https://github.com/openclaw/openclaw

Hermes

Hermes is attractive because its core pitch is the built-in learning loop: it creates skills from experience, improves them during use, searches past conversations, persists knowledge, and can migrate OpenClaw settings/memory/skills. Current GitHub docs also describe messaging gateway support, cron, memory, MCP, tools, and OpenClaw migration.

The best use is a pilot for repetitive operations:

recurring research workflows
SEO/content production flows
lead research
QA regression playbooks
deployment runbooks
project health checks

I would not put Hermes in charge of the whole company yet. It is a younger project, and the "self-improving" loop needs evals and review gates before we trust it with production business operations.

Sources:
https://github.com/NousResearch/hermes-agent
https://hermes-agent.app/en/docs

Claude Code native

Claude Code is very strong for coding work. Official docs now describe custom subagents, background execution, scoped MCP servers, skills preloading, hooks, and persistent memory directories. Agent teams, which let independent Claude Code sessions communicate, also exist, but the docs treat them as a separate and more experimental layer than simple subagents.

This makes Claude Code excellent for engineering execution and review, but it is not the best single operating layer for Bob because it is terminal/project-session centered rather than multi-channel, always-on, company-memory centered.

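Sources:
https://code.claude.com/docs/en/sub-agents
https://code.claude.com/docs/en/features-overview
https://support.claude.com/en/articles/14553413-claude-code-cheatsheet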

Codex/OpenAI native

Codex should be part of the worker bench, especially for OpenAI model fallback, code security review, and long-running coding tasks. It is not currently the best primary company OS either. The right posture is provider-neutral: keep Bob and the company memory above any one model vendor, then use Claude, Codex, and other models where they are strongest.

Source: https://openai.com/index/unrolling-the-codex-agent-loop/

Recommendation By Decision

Stay on OpenClaw?

Yes, for the next 60-90 days. OpenClaw remains the best fit for:

Discord/Telegram-first workflow
always-on heartbeats
multi-agent routing
project workspaces
local/self-hosted control
current setup continuity

Do not make a risky platform migration while MedSchools.ai is still the primary business launch.

Move to Hermes?

Pilot it; do not fully migrate yet.

Success criteria for a Hermes pilot:

Can import or mirror current OpenClaw memory/workspace safely.
Can execute 3 repeatable workflows with less Henry involvement than current Bob flow.
Saves every artifact back to Second Brain.
Does not weaken project isolation.
Passes security/secret handling review.
Produces measurable time/cost/quality improvement after two weeks.

If it passes, promote Hermes from "lab worker" to "execution engine" behind OpenClaw. If it fails, keep the learnings and stay on OpenClaw.
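
The pass/fail decision above can be written down as a scorecard so the call is made against the criteria rather than by feel. This is a sketch only: the field names are hypothetical, and treating every criterion as mandatory is my reading of "if it passes", not a rule stated elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PilotResult:
    # Field names are hypothetical; each maps to one success criterion above.
    memory_mirrored_safely: bool
    workflows_with_less_involvement: int  # count out of the piloted workflows
    all_artifacts_saved: bool
    isolation_preserved: bool
    security_review_passed: bool
    measurable_improvement: bool  # time/cost/quality after two weeks


def failed_criteria(result: PilotResult) -> List[str]:
    """Return the names of the criteria the pilot did not meet."""
    checks = {
        "safe memory import/mirror": result.memory_mirrored_safely,
        "3 workflows with less Henry involvement": result.workflows_with_less_involvement >= 3,
        "every artifact saved to Second Brain": result.all_artifacts_saved,
        "project isolation intact": result.isolation_preserved,
        "security/secret review passed": result.security_review_passed,
        "measurable improvement": result.measurable_improvement,
    }
    return [name for name, passed in checks.items() if not passed]


def decide(result: PilotResult) -> str:
    """Assumption: all criteria must pass to promote; any miss keeps OpenClaw."""
    if failed_criteria(result):
        return "stay on OpenClaw and keep the learnings"
    return "promote Hermes to execution engine behind OpenClaw"
```

Listing the failed criteria by name, rather than returning a bare yes/no, gives the retro something specific to save back to Second Brain.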

Move to Claude native?

Not as the primary OS, but yes as a core engineering runtime.

Use Claude Code native subagents and agent teams for:

codebase exploration
implementation
review
QA
security audits
isolated research

But keep Bob's operating layer outside Claude Code so the company is not trapped inside a terminal session model.

Adopt gstack/gbrain?

Yes, selectively.

Adopt gstack's pipeline as operating discipline.
Adopt gbrain's memory/synthesis behavior.
Do not blindly replace our existing Second Brain until we compare schemas and migration risk.

Target Operating Model

Layer 1: Channels

Discord remains the primary control channel. Telegram can remain as backup/mobile. Add Gmail, Calendar, GitHub, and Slack/Teams later, and only once the workflows are mature.

Layer 2: Gateway

OpenClaw stays the gateway. It handles routing, channel policy, group chat behavior, message formatting, sessions, heartbeat, cron, and tool exposure.

Layer 3: Orchestrator

Bob remains the chief-of-staff agent. Bob should not personally execute deep work. Bob should:

clarify goals
break work into tasks
assign agents
enforce project boundaries
review outputs
escalate decisions to Henry
save decisions/research to Second Brain

Layer 4: Project PM Agents

Each product gets a project PM with its own workspace, memory, and infrastructure registry:

Kevin: MedSchools.ai
Liz: Hedge
Future: CollegeDojo PM
Future: Real Estate PM
Future: WiderWings platform PM

PMs coordinate specialists but do not share project env vars or Supabase projects.

Layer 5: Specialist Workers

The specialist workers are Kai, Atlas, Mia, Sage, Lux, QA, and Growth. They should have project-scoped worktrees and explicit access boundaries.

Layer 6: Memory

Second Brain becomes the source of truth:

memories by type: decision, research, spec, lesson, conversation, task
project_id required
citations/source links required for research
"unknowns/gaps" required for strategic answers
automatic save on completion
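
A minimal sketch of how these requirements could be enforced at save time. The memory types are the ones listed above; the MemoryRecord fields, the strategic flag, and the function name are hypothetical, and the real Second Brain schema may differ.

```python
from dataclasses import dataclass, field
from typing import List

# Memory types as listed above.
MEMORY_TYPES = {"decision", "research", "spec", "lesson", "conversation", "task"}


@dataclass
class MemoryRecord:
    type: str
    project_id: str
    body: str
    sources: List[str] = field(default_factory=list)  # citations/source links
    gaps: List[str] = field(default_factory=list)     # unknowns/gaps
    strategic: bool = False  # hypothetical flag marking a strategic answer


def validation_errors(record: MemoryRecord) -> List[str]:
    """Return every rule the record breaks; an empty list means it can be saved."""
    errors: List[str] = []
    if record.type not in MEMORY_TYPES:
        errors.append(f"unknown memory type {record.type!r}")
    if not record.project_id.strip():
        errors.append("project_id is required")
    if record.type == "research" and not record.sources:
        errors.append("research needs at least one source link")
    if record.strategic and not record.gaps:
        errors.append("strategic answers must list unknowns/gaps")
    return errors
```

Rejecting a record at save time is cheaper than discovering months later that a research note has no source or no project attached.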

Add a gbrain-like MCP/query layer if it outperforms the current API search.

Layer 7: Governance

Mission Control becomes the dashboard for:

task status
agent ownership
blockers
costs
quality gates
weekly review
business KPIs

Context Separation Rules

These rules are non-negotiable:

One project, one workspace.
One project, one Supabase project.
One project, one .env boundary.
By default, project PMs can read only their own project context.
Bob can coordinate across projects but should not paste one project's private details into another project channel.
Cross-project shared infrastructure lives under WiderWings only after explicit architecture approval.

Implementation:

Add a PROJECTS.md check as a required preflight for any DB/API/infrastructure task.
Add project-scoped AGENTS.md/CLAUDE.md files in every repo.
Add .env.example only; never copy real env files.
Configure agent tool permissions by project.
Add an automated "wrong Supabase ID" guard to deployment/check scripts.
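
One possible shape for the "wrong Supabase ID" guard, as a sketch rather than a finished script. It relies on hosted Supabase URLs having the form https://<project-ref>.supabase.co. The registry contents, the placeholder refs, and the SUPABASE_URL variable name are assumptions for illustration; in practice the registry could be read from PROJECTS.md.

```python
import re
from typing import Dict, Optional

# Hypothetical registry with placeholder refs, not real project IDs.
EXPECTED_REFS: Dict[str, str] = {
    "medschools": "abcdefghijklmnopqrst",
    "hedge": "tsrqponmlkjihgfedcba",
}

_REF_PATTERN = re.compile(r"^https://([a-z0-9]+)\.supabase\.co/?$")


def supabase_ref(url: str) -> Optional[str]:
    """Extract the project ref from a hosted Supabase URL, or None if it does not match."""
    match = _REF_PATTERN.match(url.strip())
    return match.group(1) if match else None


def check_supabase_target(project: str, env: Dict[str, str]) -> None:
    """Raise before a deploy/check script touches another project's database."""
    expected = EXPECTED_REFS.get(project)
    if expected is None:
        raise ValueError(f"project {project!r} is not in the registry")
    # SUPABASE_URL is an assumed variable name; use whatever the repo defines.
    actual = supabase_ref(env.get("SUPABASE_URL", ""))
    if actual != expected:
        raise RuntimeError(
            f"{project}: SUPABASE_URL points at {actual!r}, expected {expected!r}"
        )
```

A deployment script would call check_supabase_target with the project name and os.environ as its first step, so a mismatched or missing URL stops the run before any query is issued.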

30-Day Implementation Plan

Week 1: Stabilize Current Setup

Audit OpenClaw config, agents, channels, and model routing.
Rotate any channel/API tokens that were exposed in plaintext, and move secrets into a safer secret store or env-backed config.
Lock down group/DM policies and sandbox non-main sessions.
Create project-isolation preflight checklist.
Create a "Bob delegation contract" so deep work routes to agents by default.

Week 2: Memory Upgrade

Define Second Brain memory schema v2 with required project_id, source links, artifact paths, confidence, and gaps.
Add save/retrieve commands for Bob and every agent.
Add "recall before answer" behavior for prior work and decisions.
Evaluate gbrain locally against our existing Second Brain on 10 real queries.
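
The gbrain comparison can be scored with a small harness like the sketch below. It assumes each backend can be wrapped as a function from a query to the source IDs it cites, and it uses a deliberately crude metric (did the answer cite at least one expected source). The names and the metric are hypothetical; a real evaluation would also judge answer quality, freshness, and gap reporting.

```python
from typing import Callable, Dict, List, Set

# A backend takes a query and returns the source IDs it would cite.
Backend = Callable[[str], List[str]]


def hit_rate(backend: Backend, expected: Dict[str, Set[str]]) -> float:
    """Fraction of queries where the backend cites at least one expected source."""
    if not expected:
        raise ValueError("need at least one query")
    hits = sum(1 for query, wanted in expected.items() if wanted & set(backend(query)))
    return hits / len(expected)


def compare(backends: Dict[str, Backend], expected: Dict[str, Set[str]]) -> Dict[str, float]:
    """Score every backend on the same query set so the results are comparable."""
    return {name: hit_rate(backend, expected) for name, backend in backends.items()}
```

Fixing the 10 queries and their expected sources before running either system keeps the comparison honest.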

Week 3: Agent Operating Cadence

Create project PM workspaces for MedSchools.ai and Hedge with clear permissions.
Add gstack-style workflow docs as reusable skills/checklists.
Add review/QA/security gates to Mission Control.
Define weekly CEO briefing output.

Week 4: Hermes Pilot

Install Hermes in a sandbox, not production.
Connect it to a copy or limited view of memory.
Test 3 workflows:
- research synthesis -> saved to Second Brain
- QA/Playwright regression -> report only
- SEO/content ops checklist -> draft artifact only
Compare quality, speed, cost, and failure modes against OpenClaw/Codex/Claude workers.

My Call

The winning setup is hybrid:

OpenClaw = company gateway and always-on interface
Second Brain/gbrain-style layer = durable memory and synthesis
Bob = orchestrator and decision filter
Project PM agents = context separation and ownership
Claude Code/Codex = engineering workbenches
Hermes = experimental self-improving execution backend
Mission Control = visibility, governance, and task state

Do not chase the newest platform. The bottleneck is not only model capability. The bottleneck is operating discipline, memory, delegation, permissions, and review gates. We can fix those inside the current setup faster than migrating everything.

Sources

Garry Tan gstack: https://github.com/garrytan/gstack
Garry Tan gbrain: https://github.com/garrytan/gbrain
OpenClaw docs: https://docs.openclaw.ai/
OpenClaw GitHub: https://github.com/openclaw/openclaw
Hermes Agent GitHub: https://github.com/NousResearch/hermes-agent
Hermes docs hub: https://hermes-agent.app/en/docs
Claude Code subagents: https://code.claude.com/docs/en/sub-agents
Claude Code feature overview: https://code.claude.com/docs/en/features-overview
Claude Code cheatsheet: https://support.claude.com/en/articles/14553413-claude-code-cheatsheet
OpenAI Codex agent loop: https://openai.com/index/unrolling-the-codex-agent-loop/
WildClawBench paper: https://arxiv.org/abs/2605.10912
