A local-first AI lab where CodexRouter makes every model call
economically intelligent, and the second brain makes every workflow
memorable, secure, executable, and reusable.
A cinematic map of the real system: Hermes and the memory graph form
the brain, CodexRouter and model gateways form the economic engine,
and OpenClaw, Codex, nodes, data, and MCP tools become the hands.
Continuity Engine · Public Case Study
Many Agents. One Operational Memory.
Two execution paths reveal the same design principle: every task can
begin with relevant context and end as verified continuity—whether
Hermes orchestrates the work or a specialist tool is opened directly.
Principle-level view
Public architecture story. Internal thresholds, schemas, routing policy, and reconciliation logic remain private.
Load the smallest useful memory, not the history of everything.
02
Evidence-anchored work
Meaning lives in memory; exact execution state remains provable in Git.
03
Curated promotion
Verified learning survives. Noise, secrets, and raw transcripts do not.
04
Bypass reconciliation
Opening a specialist directly never creates a second operational brain.
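The first principle, loading the smallest useful memory, can be illustrated with a short Python sketch. It is an assumption-laden example, not the lab's private recall logic: the `MemoryItem` fields, the relevance floor, and the token budget are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemoryItem:
    text: str
    relevance: float  # hypothetical 0..1 score from a retrieval step
    tokens: int       # estimated size of the item in tokens


def select_context(items: List[MemoryItem], budget_tokens: int,
                   min_relevance: float = 0.5) -> List[str]:
    """Greedily keep the most relevant items that fit within the token budget."""
    chosen: List[str] = []
    used = 0
    for item in sorted(items, key=lambda i: i.relevance, reverse=True):
        if item.relevance < min_relevance:
            break  # everything after this point is even less relevant
        if used + item.tokens <= budget_tokens:
            chosen.append(item.text)
            used += item.tokens
    return chosen
```

A greedy pass like this favours relevance over completeness: an item that does not fit is skipped, and smaller items further down the ranking may still be loaded.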
CodexRouter Engine
The Broker That Makes Model Calls Economically Intelligent
CodexRouter is the LLM/API economics layer. It reads the request,
scores the task, chooses the cheapest capable model, and keeps the
fallback chain honest when context, tools, rate limits, or provider
failures change the route.
01
Classify Intent
Weighted local scoring reads prompt complexity, code markers,
reasoning signals, context size, structured-output needs, and
agentic execution patterns.
02
Price the Work
The router compares input and output economics against a premium
baseline so expensive frontier models are reserved for work that
actually needs them.
03
Select the Route
Auto, eco, premium, and agentic profiles map simple, medium,
complex, and reasoning tasks to the best price/performance lane.
04
Recover Gracefully
Context filtering, tool and vision checks, fallback chains, cache,
deduplication, compression, and session pinning keep agent loops
efficient under real operating pressure.
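The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than CodexRouter's actual code: the signal weights, tier thresholds, model names, and prices are all hypothetical, and the keyword checks stand in for a much richer classifier.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical integer weights per signal; the real scoring policy is not public.
WEIGHTS = {"code": 3, "reasoning": 4, "long_context": 2, "tools": 1}


@dataclass(frozen=True)
class Model:
    name: str
    tier: int            # 0 = simple, 1 = medium, 2 = complex or reasoning
    usd_per_mtok: float  # blended price per million tokens (made up)
    max_context: int
    supports_tools: bool


# Illustrative catalogue, not real models or prices.
CATALOGUE = [
    Model("local-small", 0, 0.0, 8_000, False),
    Model("hosted-mid", 1, 1.0, 128_000, True),
    Model("frontier", 2, 15.0, 200_000, True),
]


def classify(prompt: str, context_tokens: int, needs_tools: bool) -> int:
    """Step 1: score the request with weighted signals and map it to a tier."""
    text = prompt.lower()
    signals = {
        "code": "def " in text or "```" in text,
        "reasoning": any(w in text for w in ("prove", "step by step", "why")),
        "long_context": context_tokens > 8_000,
        "tools": needs_tools,
    }
    score = sum(WEIGHTS[name] for name, hit in signals.items() if hit)
    if score >= 6:
        return 2
    if score >= 3:
        return 1
    return 0


def route(prompt: str, context_tokens: int = 0,
          needs_tools: bool = False) -> List[str]:
    """Steps 2 to 4: filter by capability, then order the chain by price."""
    tier = classify(prompt, context_tokens, needs_tools)
    capable = [
        m for m in CATALOGUE
        if m.tier >= tier
        and m.max_context >= context_tokens
        and (m.supports_tools or not needs_tools)
    ]
    # Cheapest capable model first; the rest of the list is the fallback chain.
    return [m.name for m in sorted(capable, key=lambda m: m.usd_per_mtok)]
```

The returned list doubles as the fallback chain: a caller tries the first entry and moves to the next if the provider fails or is rate limited.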
Operating Thesis
Agents Become Useful When They Remember
Bigger models are not enough. A real agentic system needs a broker
that reads intent, estimates complexity, understands context limits,
and routes each task to the cheapest model that can still do the job.
Recall Before Acting
Hermes can search structured memory before planning so every task
does not restart from zero context.
Promote Only Durable Knowledge
Important decisions, preferences, and service relationships move
into Graphiti or Obsidian. Raw logs and secrets do not.
Route for Cost and Quality
CodexRouter exposes model economics so agent loops can reserve
expensive models for the work that actually needs them.
Architecture Atlas
The AI Lab Control Plane
The public site is backed by a real operating model: Hermes
orchestrates the work, Graphiti and FalkorDB provide durable recall,
CodexRouter chooses the best model route for the task economics,
secure fabric keeps secrets and access controlled, and execution
surfaces turn plans into working systems.
Hosted + Local Models: frontier, specialist, and oMLX routes
Trust Fabric
Tailscale Network: private paths between lab nodes
Vault + Access Policy: secrets, permissions, and service boundaries
Execution Layer
OpenClaw + OpenHands + Codex: remote and agentic execution surfaces
Clawnode / Mac mini: trusted local node and tools
Agentic Workflow Loop
Task arrives from user, API, or remote agent.
Hermes recalls Graphiti context and relevant Obsidian notes.
CodexRouter classifies intent and chooses the best price/performance route.
Tools, OpenClaw, OpenHands, Codex, or Clawnode execute bounded work.
Hermes verifies, promotes durable memory, and returns a traceable result.
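The loop above can be read as a small pipeline. The Python sketch below is a hedged illustration only: `run_task`, `Trace`, and the five callables are hypothetical names, and each stage is passed in as a plain function so that no real Hermes, Graphiti, or CodexRouter API is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Trace:
    """Record of one task so the returned result stays traceable."""
    task: str
    steps: List[str] = field(default_factory=list)
    result: Optional[str] = None
    promoted: bool = False


def run_task(
    task: str,
    recall: Callable[[str], List[str]],
    choose_route: Callable[[str, List[str]], str],
    execute: Callable[[str, str, List[str]], str],
    verify: Callable[[str], bool],
    promote: Callable[[str, str], None],
) -> Trace:
    trace = Trace(task)

    context = recall(task)                  # recall before planning
    trace.steps.append(f"recall: {len(context)} items")

    model = choose_route(task, context)     # pick a price/performance route
    trace.steps.append(f"route: {model}")

    output = execute(task, model, context)  # bounded work by some executor
    trace.steps.append("execute: done")

    if verify(output):                      # only verified work reaches memory
        promote(task, output)
        trace.promoted = True
        trace.steps.append("verify: passed, promoted")
    else:
        trace.steps.append("verify: failed, nothing promoted")

    trace.result = output
    return trace
```

Passing the stages in as functions keeps the sketch independent of any one executor, which mirrors the idea that tools, OpenClaw, OpenHands, Codex, or Clawnode can all fill the execution step.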
Portability Model
Git tracks configs, docs, scripts, and diagrams.
Graphiti and Obsidian hold recoverable memory layers.
Secrets stay in Vault or secure service config and are never printed.
The Mac mini and VPS can be rebuilt from explicit paths and runbooks.
Memory Lifecycle
From Task to Durable Second Brain
The lab separates chat noise from durable knowledge. Graphiti stores
compact relationships. Obsidian stores readable procedures. Secrets
and raw logs stay out of memory.
01
Recall
Search prior facts, preferences, topology, and decisions before planning.
02
Decide
Hermes chooses the right executor, model, tool, or remote gateway.
03
Act
Use local tools, Codex, Claude Code, OpenClaw, browser automation, or shell.
04
Promote
Store only durable, safe, future-useful knowledge in graph memory or notes.
05
Audit
Keep memory human-readable, correctable, portable, and recoverable.
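The promotion step can be pictured as a small filter. The Python sketch below is a simplified assumption, not the lab's private promotion rules: the `kind` labels, the `Destination` enum, and the secret-detection pattern are hypothetical, and a real secret scanner would need to be far stricter.

```python
import re
from enum import Enum


class Destination(Enum):
    GRAPH = "graph"      # compact relationships (Graphiti in the lab)
    NOTES = "notes"      # readable procedures (Obsidian in the lab)
    DISCARD = "discard"  # never stored


# Hypothetical pattern for obvious credential assignments.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|password|token|secret)\s*[:=]", re.IGNORECASE
)


def promote(kind: str, text: str, verified: bool) -> Destination:
    """Decide where a candidate memory goes, defaulting to discard."""
    if SECRET_PATTERN.search(text):
        return Destination.DISCARD  # secrets stay out of memory
    if not verified or kind in {"raw_log", "transcript", "transient"}:
        return Destination.DISCARD  # noise and unverified output stay out
    if kind in {"decision", "preference", "relationship"}:
        return Destination.GRAPH
    if kind in {"procedure", "runbook"}:
        return Destination.NOTES
    return Destination.DISCARD      # unknown kinds are not promoted
```

Defaulting to discard means a new or mislabelled kind of content is dropped rather than stored by accident.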
Applied Projects
Where the Lab Becomes Real Work
The website can grow into a portfolio of workflows built with the
same foundation: memory, routing, governance, and token-aware
orchestration.
Foundation
CodexRouter AI Lab
The base system: intent classification, model economics,
second-brain memory, local-first orchestration, and remote
execution paths. This is the engine that supports future workflow
demos.
Hermes · Graphiti · LiteLLM · Direct gateway
In progress
Adobe Firefly Workflow Lab
A solution-architecture demo for creative content supply chains:
campaign brief intake, Firefly-ready prompt variants, governance
checks, approval metadata, and campaign pack output.
Firefly-style APIs · Brand safety · Governance · Demo lab
Roadmap
Agent Workflow Case Studies
Future pages can show how the same architecture handles research,
code generation, infrastructure repair, creative production, and
knowledge management.
Case studies · Recruiter proof · Enterprise workflows
Builder Mission
Technology That Compounds Into Business Outcomes
My work is to turn advanced AI capability into practical systems:
modular, secure, measurable, and built to grow with the project.
The lab is intentionally layered so every component can improve over
time without breaking the whole operating model.
Modular by design
Memory, routing, models, security, and execution are independent
layers. Any layer can be upgraded as better tools emerge.
Outcome focused
The point is not tool collection. It is faster decisions, lower
token waste, better governance, and workflows that can be repeated.
Built like a brain
Hermes reasons, Graphiti recalls, CodexRouter routes, models
generate, tools act, and verified knowledge returns to memory.
Cost-Aware Routing Layer
Model Price Field
Every agent loop has a cost shape. The broker has to know when a
cheap local delegate is enough, when a hosted model is worth it, and
when cache leverage changes the economics.
Routing Surface
Provider Intelligence Wall
Provider notes, source links, free endpoints, and cache assumptions
stay visible so the agent system can route with evidence instead of
opaque preference.
Routing Manifest
Model Economics Matrix
This table is the public-readable version of the economics layer:
aliases, providers, pricing modes, cache rates, and free endpoints.
Model | Provider | Input | Output | Cache Read | Cache Write | Mode
Token Efficiency
Agent Loop Cost Calculator
Estimate the cost of real agent work: context recall, tool planning,
output generation, and cache-aware repeated workflows.
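The arithmetic behind such an estimate can be sketched in Python. This is an illustrative model under stated assumptions, not the site's calculator: the `Pricing` fields mirror the matrix columns, prices are in USD per million tokens, every number used with it is a placeholder, and the loop is assumed to re-send the same context on each iteration (written to cache once, read from cache afterwards).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens; any values used here are placeholders."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def loop_cost(p: Pricing, context_tokens: int, new_input_tokens: int,
              output_tokens: int, iterations: int, cached: bool = True) -> float:
    """Estimate the USD cost of an agent loop that reuses one shared context."""
    if iterations <= 0:
        return 0.0
    per_turn = new_input_tokens * p.input + output_tokens * p.output
    if cached:
        # First turn writes the context to cache; later turns read it back.
        context = (context_tokens * p.cache_write
                   + (iterations - 1) * context_tokens * p.cache_read)
    else:
        # Without caching, the full context is billed as input every turn.
        context = iterations * context_tokens * p.input
    return (context + iterations * per_turn) / 1_000_000
```

With placeholder prices of 2.0 (input), 8.0 (output), 0.5 (cache read), and 2.5 (cache write), a ten-turn loop over a 100,000-token context with 1,000 new input tokens and 500 output tokens per turn comes to about 0.76 USD cached versus about 2.06 USD uncached, which is the kind of gap the cache columns are meant to expose.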
Governance
How the Lab Avoids Memory and Cost Drift
A second brain is only useful if it stays clean. A model broker is
only useful if its economics are visible and tested.
Canonical Manifest
Pricing and model assumptions live in one reviewed manifest instead
of being scattered across prompts, notes, and hidden config.
Memory Promotion Rules
Durable relationships go to Graphiti. Human-readable procedures go
to Obsidian. Raw logs, transient output, and secrets stay out.
Portable Recovery
The lab is designed around explicit directories, tracked docs,
backup paths, and recoverable memory instead of opaque app state.
Founder & Principal Architect
Hakim Ghelab — Founder of CodexRouter AI Lab
Building a hybrid agentic AI architecture where local intelligence,
cloud models, graph memory, and operational agents work as one
coordinated system.
Hakim Ghelab is a London-based principal AI architect and founder of CodexRouter AI
Lab, built through Vegalaboratories Ltd. He brings more than a decade of experience
as a solutions architect and sales engineer across enterprise networking,
cybersecurity, and hybrid-cloud infrastructure, with roles at
Check Point, Palo Alto Networks, Cisco, and Riverbed.
That background pairs hands-on architecture and security delivery with enterprise
go-to-market and founder-level product vision — now the foundation of CodexRouter AI
Lab: a founder-led agentic AI lab building the routing, memory, orchestration, and
control-plane layer for hybrid local/cloud AI systems.
“CodexRouter is not just a website or a wrapper. It is the control layer for a lab
where agents, models, memory, and infrastructure become one operating system.”
Founder Story
Hakim builds the connective tissue between agents, models, memory, tools, and
infrastructure. His architecture combines a broker-style model router that selects
the most appropriate model based on task intent, capability, latency, cost, and
privacy; a local-first AI brain running on Apple Silicon; cloud-hosted operational
agents; graph memory that persists what matters; and a zero-trust operating layer
that keeps credentials, private context, and sensitive memory outside public model
recall.
He designs, builds, tests, and documents the architecture himself — operating as an
author-architect: part systems designer, part technical strategist, part
builder, and part storyteller.
What I am building
Routing · CodexRouter
An API broker and routing layer that scores prompt intent, complexity, and context, then selects the most appropriate model based on capability, latency, cost, and privacy across many LLMs and providers.
Brain · Hermes
The local planner, verifier, and orchestrator running on a Mac mini node — deciding what to recall, route, execute, and promote.
Hands · OpenClaw & OpenHands
Cloud and local operational agents that turn a chosen route into bounded, observable action across tools and APIs.
Engine · LiteLLM + local MLX
A gateway to frontier cloud models alongside local Apple-Silicon inference for privacy, latency, and cost control.
Memory · Graph memory
Graphiti and FalkorDB-style long-term memory, with Hermes promoting only durable, non-secret facts from context into the graph.
Trust · Zero-trust fabric
Tailscale mesh access, HashiCorp Vault for secret storage and rotation, and Authentik SSO/MFA across containerised services.
The architecture, end to end
Human / Founder
Mission Control UI · Website · CLI
CodexRouter
Hermes — main brain
LiteLLM gateway + local model server
OpenClaw cloud agent + Mac mini node
Graph memory layer
Tools, APIs, automation & operations
Founder principles
Build real systems, not demos. Everything here runs in production, not in slideware.
Memory is infrastructure. Durable recall is designed in, not bolted on.
Local and cloud intelligence cooperate. Private edge compute and frontier models in one fabric.
Agents need governance and observability. Routing, audit, and recovery paths, not just prompts.
Simple surfaces, disciplined architecture. The best AI products feel simple because the system underneath is rigorous.
Press bio (copy-ready)
Short — 50 words
Hakim Ghelab is a London-based principal AI architect and founder of CodexRouter AI Lab. After a decade as a solutions architect and sales engineer at Check Point, Palo Alto Networks, Cisco, and Riverbed, he now builds an enterprise-grade, model-agnostic agentic AI lab spanning routing, graph memory, orchestration, and a zero-trust control plane.
Medium — 150 words
Hakim Ghelab is a London-based AI architect and the founder and principal architect of CodexRouter AI Lab, built through Vegalaboratories Ltd. He spent over a decade as a solutions architect and sales engineer across enterprise networking, cybersecurity, and hybrid-cloud infrastructure at Check Point, Palo Alto Networks, Cisco, and Riverbed, pairing hands-on architecture with enterprise go-to-market and founder-level product vision. He now applies that foundation to a founder-led, enterprise-grade agentic AI lab: CodexRouter brokers and routes model, tool, and agent traffic; Hermes acts as the local brain; OpenClaw and MCP-style tools act as hands; graph memory persists what matters; and a zero-trust fabric (Tailscale, Vault, Authentik) keeps credentials and sensitive context outside public model recall. The lab is model- and provider-agnostic, designed to get the best from both local Apple-Silicon and cloud frontier models, and is secure, observable, and operated as one coherent system.
Clear answers for human readers, search engines, and LLM-based search
systems.
What is the CodexRouter AI Lab?
It is a local-first AI systems lab for building agentic workflows
that remember context, route models intelligently, and keep memory
auditable.
Why do agents need a second brain?
Stateless agents repeat context gathering, forget decisions, waste
tokens, and force operators to re-explain the same architecture.
How does the memory layer work?
Hermes promotes durable facts and relationships into Graphiti while
Obsidian keeps human-readable runbooks, notes, and decisions.
Where does CodexRouter fit?
CodexRouter is the broker and economics layer. It helps the agent
system choose models with better cost, cache, and routing visibility.
Who is Hakim Ghelab?
Hakim Ghelab is the Founder & Principal Architect of CodexRouter AI Lab. A
London-based principal AI architect, he builds the routing, memory, orchestration,
and control-plane layer for hybrid local/cloud AI systems through Vegalaboratories
Ltd, after more than a decade across enterprise networking, cybersecurity, and
hybrid-cloud infrastructure.
What is an author-architect?
Someone who designs, builds, tests, and documents the system themselves — combining
technical authorship, system design, product storytelling, and hands-on
implementation rather than only directing others.
What makes this different from a normal chatbot?
A chatbot is one isolated assistant. CodexRouter AI Lab is the connective tissue
between many agents, models, memory, tools, and infrastructure — with routing,
durable memory, governance, observability, and recovery paths under human control.
Why combine local and cloud AI?
Local Apple-Silicon inference gives privacy, low latency, and cost control for
routine work; frontier cloud models handle the hardest tasks. CodexRouter routes
each request to the most appropriate model based on capability, latency, cost, and
privacy across both.
Contact
Work with Hakim
For senior AI architecture roles, strategic AI-adoption leadership, partnerships, or
applied AI-lab projects, the best way to reach Hakim Ghelab is on LinkedIn.