Architecture
Nexus is a controller-runtime operator with an embedded REST API. The control plane lives in the nexus-system namespace; tenants and their nodes live in their own namespaces.
Control-Plane Components
| Component | Detail |
|---|---|
| REST API | Gin HTTP server on :8080; turns client requests into custom resources |
| SQLite store | modernc.org/sqlite — users, tenants, tenant↔user associations, API keys, refresh tokens |
| Auth | JWT via golang-jwt (HMAC-SHA256), bcrypt password hashing (cost 12) |
| NexusTenantReconciler | Provisions tenant namespaces, nodes, and services |
| NexusAgentReconciler | Registers and deploys agents to tenant nodes |
| Metrics | Prometheus endpoint on :9090 |
| Health probes | Manager liveness/readiness on :8081 |
Built with Go 1.25, Kubebuilder v4, and controller-runtime. The CRD API group is nexus.astromesh.io/v1alpha1.

    Kubernetes Cluster
    ┌───────────────────────────────────────────────────────────┐
    │  nexus-system namespace                                   │
    │  ┌─────────────────────────────────────────────────┐      │
    │  │ Nexus Control Plane                             │      │
    │  │ REST API (Gin :8080)    NexusTenantReconciler   │      │
    │  │ SQLite Store            NexusAgentReconciler    │      │
    │  └─────────────────────────────────────────────────┘      │
    │                                                           │
    │  tenant-a namespace          tenant-b namespace           │
    │  astromesh-node (:8000)      astromesh-node (:8000)       │
    │  agent-1, agent-2            agent-a                      │
    └───────────────────────────────────────────────────────────┘
            ▲
            │ POST /api/v1/agents (X-API-Key / Bearer JWT)
    User / Cortex / Leia

Reconcilers
NexusTenantReconciler
Watches NexusTenant resources in nexus-system. Each reconcile loop:
- Adds a cleanup finalizer (nexus.astromesh.io/tenant-cleanup).
- Sets phase to Provisioning.
- Ensures the tenant namespace exists (labeled nexus.astromesh.io/managed=true).
- Ensures the astromesh-node Deployment (1 replica, container port 8000, liveness/readiness probes on /v1/health). Resource limits are applied from spec.resourceQuota.
- Ensures the astromesh-node Service on port 8000.
- Sets phase to Ready, records status.namespace, and sets status.nodeEndpoint to http://astromesh-node.<namespace>.svc:8000.
- Requeues after 60s.
On deletion, the finalizer deletes the tenant namespace before the NexusTenant resource itself is removed. If any reconcile step fails, the phase is set to Error and the request is requeued after 30s.
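
The tenant loop above can be sketched as a plain function over a simplified record. This is a minimal illustration, not the operator's actual code: the `Tenant` struct, `reconcileTenant`, the `ensure` callback, and `errDemo` are hypothetical stand-ins for the real controller-runtime types and Kubernetes API calls. Only the finalizer name, the phases, the endpoint format, and the requeue intervals come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Values taken from the description above.
const (
	tenantFinalizer = "nexus.astromesh.io/tenant-cleanup"
	requeueOK       = 60 * time.Second
	requeueOnError  = 30 * time.Second
)

// errDemo is a hypothetical error used only to exercise the failure path.
var errDemo = errors.New("ensure failed")

// Tenant is a hypothetical, simplified stand-in for the NexusTenant resource.
type Tenant struct {
	Name         string
	Finalizers   []string
	Phase        string
	Namespace    string
	NodeEndpoint string
}

// reconcileTenant mirrors the documented steps. ensure is a hypothetical
// callback standing in for "ensure namespace, Deployment, and Service".
func reconcileTenant(t *Tenant, ensure func(namespace string) error) time.Duration {
	if !hasFinalizer(t.Finalizers, tenantFinalizer) {
		t.Finalizers = append(t.Finalizers, tenantFinalizer)
	}
	t.Phase = "Provisioning"
	ns := t.Name // the tenant's CR name is also its namespace name
	if err := ensure(ns); err != nil {
		t.Phase = "Error"
		return requeueOnError
	}
	t.Phase = "Ready"
	t.Namespace = ns
	t.NodeEndpoint = fmt.Sprintf("http://astromesh-node.%s.svc:8000", ns)
	return requeueOK
}

// hasFinalizer reports whether name is already present in list.
func hasFinalizer(list []string, name string) bool {
	for _, f := range list {
		if f == name {
			return true
		}
	}
	return false
}

func main() {
	ok := &Tenant{Name: "tenant-dev"}
	after := reconcileTenant(ok, func(string) error { return nil })
	fmt.Println(ok.Phase, ok.NodeEndpoint, after)

	bad := &Tenant{Name: "tenant-dev"}
	after = reconcileTenant(bad, func(string) error { return errDemo })
	fmt.Println(bad.Phase, after)
}
```

Because every step is an "ensure" rather than a "create", running the function repeatedly on the 60s requeue leaves an already provisioned tenant unchanged.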
NexusAgentReconciler
Watches NexusAgent resources in tenant namespaces. Each reconcile loop:
- Adds a cleanup finalizer (nexus.astromesh.io/agent-cleanup).
- Sets phase to Deploying.
- Resolves the tenant’s node endpoint by listing NexusTenant CRs in nexus-system and matching the one whose status.namespace equals the agent’s namespace and whose phase is Ready.
- Registers the agent: POST {node}/v1/agents.
- Deploys the agent: POST {node}/v1/agents/{name}/deploy.
- Sets phase to Running, status.nodeAck = true, and status.lastSyncedAt = now.
- Requeues after 60s.
On deletion, the finalizer calls DELETE {node}/v1/agents/{name}; errors from that call are ignored because the node may already be gone. If a reconcile step fails, the phase is set to Failed and the request is requeued after 30s.
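
The endpoint lookup and the two node calls can be sketched as follows. This is an illustration under stated assumptions, not the operator's code: `TenantStatus`, `Agent`, `resolveNodeEndpoint`, `reconcileAgent`, and the `post` callback (a stand-in for an HTTP POST to the node) are hypothetical names, and finalizer handling is left out for brevity. The matching rule, URL paths, phases, and requeue intervals follow the description.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Requeue intervals taken from the description above.
const (
	requeueOK      = 60 * time.Second
	requeueOnError = 30 * time.Second
)

// TenantStatus is a hypothetical, simplified view of a NexusTenant's status.
type TenantStatus struct {
	Phase        string
	Namespace    string
	NodeEndpoint string
}

// Agent is a hypothetical, simplified stand-in for the NexusAgent resource.
type Agent struct {
	Name         string
	Namespace    string
	Phase        string
	NodeAck      bool
	LastSyncedAt time.Time
}

var errNoReadyTenant = errors.New("no Ready tenant owns this namespace")

// resolveNodeEndpoint picks the tenant whose status.namespace equals the
// agent's namespace and whose phase is Ready.
func resolveNodeEndpoint(tenants []TenantStatus, agentNamespace string) (string, error) {
	for _, t := range tenants {
		if t.Namespace == agentNamespace && t.Phase == "Ready" {
			return t.NodeEndpoint, nil
		}
	}
	return "", errNoReadyTenant
}

// reconcileAgent mirrors the documented steps: resolve, register, deploy.
func reconcileAgent(a *Agent, tenants []TenantStatus, post func(url string) error, now time.Time) time.Duration {
	a.Phase = "Deploying"
	node, err := resolveNodeEndpoint(tenants, a.Namespace)
	if err == nil {
		err = post(node + "/v1/agents") // register
	}
	if err == nil {
		err = post(node + "/v1/agents/" + a.Name + "/deploy") // deploy
	}
	if err != nil {
		a.Phase = "Failed"
		return requeueOnError
	}
	a.Phase, a.NodeAck, a.LastSyncedAt = "Running", true, now
	return requeueOK
}

func main() {
	tenants := []TenantStatus{
		{Phase: "Provisioning", Namespace: "tenant-a", NodeEndpoint: "http://astromesh-node.tenant-a.svc:8000"},
		{Phase: "Ready", Namespace: "tenant-dev", NodeEndpoint: "http://astromesh-node.tenant-dev.svc:8000"},
	}
	a := &Agent{Name: "support-bot", Namespace: "tenant-dev"}
	logPost := func(url string) error { fmt.Println("POST", url); return nil }
	after := reconcileAgent(a, tenants, logPost, time.Now())
	fmt.Println(a.Phase, a.NodeAck, after)
}
```

An agent whose tenant is still Provisioning fails the lookup, is marked Failed, and is retried after 30s, by which time the tenant may have become Ready.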
Multi-Tenant Isolation Model
- One namespace per tenant. The tenant’s CR name is also its namespace name.
- One node per tenant. Each namespace runs its own astromesh-node; agents never cross tenant boundaries.
- Scoped credentials. API keys are tenant-scoped; JWT callers must own a tenant (verified via UserOwnsTenant) and pass X-Tenant-ID to act within it.
- Quota enforcement. spec.resourceQuota drives the node’s CPU/memory limits and a maxAgents ceiling.
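
The credential and quota rules above could look roughly like this in code. Everything here is an assumption made for illustration: the `Credential` struct, the `OwnershipStore` interface, the `resolveTenant` and `checkAgentQuota` functions, and the error values are hypothetical, and the exact signature of `UserOwnsTenant` is not given in the text. The sketch also assumes an API key's own scope decides the tenant regardless of any header.

```go
package main

import (
	"errors"
	"fmt"
)

// Credential is a hypothetical model of the two caller types described above.
type Credential struct {
	APIKeyTenant string // set for API-key callers: the tenant the key is scoped to
	UserID       string // set for JWT callers
}

// OwnershipStore is a hypothetical interface; the text only names UserOwnsTenant.
type OwnershipStore interface {
	UserOwnsTenant(userID, tenantID string) bool
}

var (
	errMissingTenant = errors.New("X-Tenant-ID header required for JWT callers")
	errForbidden     = errors.New("caller may not act within this tenant")
	errQuota         = errors.New("maxAgents reached")
)

// resolveTenant returns the tenant a request may act within.
func resolveTenant(c Credential, tenantHeader string, store OwnershipStore) (string, error) {
	if c.APIKeyTenant != "" {
		return c.APIKeyTenant, nil // API keys are already tenant-scoped
	}
	if tenantHeader == "" {
		return "", errMissingTenant
	}
	if !store.UserOwnsTenant(c.UserID, tenantHeader) {
		return "", errForbidden
	}
	return tenantHeader, nil
}

// checkAgentQuota enforces the maxAgents ceiling from spec.resourceQuota.
func checkAgentQuota(current, maxAgents int) error {
	if current >= maxAgents {
		return errQuota
	}
	return nil
}

// memStore is an in-memory OwnershipStore used only for this demo.
type memStore map[string]string // tenantID -> owning userID

func (m memStore) UserOwnsTenant(userID, tenantID string) bool {
	owner, ok := m[tenantID]
	return ok && owner == userID
}

func main() {
	store := memStore{"tenant-dev": "alice"}
	fmt.Println(resolveTenant(Credential{UserID: "alice"}, "tenant-dev", store))
	fmt.Println(resolveTenant(Credential{UserID: "bob"}, "tenant-dev", store))
	fmt.Println(resolveTenant(Credential{APIKeyTenant: "tenant-dev"}, "", store))
	fmt.Println(checkAgentQuota(10, 10))
}
```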
Custom Resource Definitions
API group: nexus.astromesh.io/v1alpha1.
NexusTenant
Namespaced, created in nexus-system.
Spec
| Field | Type | Description |
|---|---|---|
| displayName | string | Human-readable tenant name |
| nodeProfile | string | Node profile, e.g. "full" (passed to the node as ASTROMESH_ROLE) |
| resourceQuota.maxAgents | int | Maximum number of agents |
| resourceQuota.cpu | string | CPU limit for the node (e.g. "4") |
| resourceQuota.memory | string | Memory limit for the node (e.g. "8Gi") |
Status
| Field | Type | Description |
|---|---|---|
| phase | enum | One of Pending, Provisioning, Ready, Error |
| namespace | string | Provisioned tenant namespace |
| nodeEndpoint | string | In-cluster node URL (http://astromesh-node.<ns>.svc:8000) |
| agentCount | int | Number of agents in the tenant |
| conditions | []Condition | Standard Kubernetes conditions |
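
As a rough guide, the Spec and Status tables above map onto Go types along these lines. This is a sketch, not the project's generated API types: the struct names, the `parseTenantSpec` and `validTenantPhase` helpers are hypothetical, and `conditions` is omitted because it would normally come from the Kubernetes API machinery rather than the standard library. The JSON field names and phase values are taken from the tables.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResourceQuota mirrors spec.resourceQuota.
type ResourceQuota struct {
	MaxAgents int    `json:"maxAgents"`
	CPU       string `json:"cpu"`    // e.g. "4"
	Memory    string `json:"memory"` // e.g. "8Gi"
}

// NexusTenantSpec mirrors the Spec table.
type NexusTenantSpec struct {
	DisplayName   string        `json:"displayName"`
	NodeProfile   string        `json:"nodeProfile"`
	ResourceQuota ResourceQuota `json:"resourceQuota"`
}

// NexusTenantStatus mirrors the Status table (conditions omitted).
type NexusTenantStatus struct {
	Phase        string `json:"phase,omitempty"`
	Namespace    string `json:"namespace,omitempty"`
	NodeEndpoint string `json:"nodeEndpoint,omitempty"`
	AgentCount   int    `json:"agentCount,omitempty"`
}

// validTenantPhase reports whether p is one of the documented phases.
func validTenantPhase(p string) bool {
	switch p {
	case "Pending", "Provisioning", "Ready", "Error":
		return true
	}
	return false
}

// parseTenantSpec decodes a JSON-encoded spec into the typed struct.
func parseTenantSpec(raw string) (NexusTenantSpec, error) {
	var spec NexusTenantSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

func main() {
	raw := `{"displayName":"Development Tenant","nodeProfile":"full",` +
		`"resourceQuota":{"maxAgents":10,"cpu":"4","memory":"8Gi"}}`
	spec, err := parseTenantSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", spec)
	fmt.Println(validTenantPhase("Ready"), validTenantPhase("Running"))
}
```

CPU and memory are kept as strings here because the tables list them as Kubernetes-style quantities such as "4" and "8Gi".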

    apiVersion: nexus.astromesh.io/v1alpha1
    kind: NexusTenant
    metadata:
      name: tenant-dev
      namespace: nexus-system
    spec:
      displayName: "Development Tenant"
      nodeProfile: full
      resourceQuota:
        maxAgents: 10
        cpu: "4"
        memory: "8Gi"

NexusAgent
Namespaced, created in the tenant’s namespace.
Spec
| Field | Type | Description |
|---|---|---|
| agentSpec | RawExtension | A pass-through carrying a full astromesh/v1 Agent spec (unknown fields preserved) |
Status
| Field | Type | Description |
|---|---|---|
| phase | enum | One of Pending, Deploying, Running, Failed, Stopped |
| lastSyncedAt | timestamp | Last successful sync with the node |
| nodeAck | bool | Whether the node acknowledged the deploy |
| conditions | []Condition | Standard Kubernetes conditions |
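
The pass-through behavior of agentSpec can be illustrated with the standard library alone. In this sketch `json.RawMessage` is used as a stand-in for RawExtension, which is an assumption made so the example runs without Kubernetes dependencies. The `NexusAgentSpec` struct, `parseAgentSpec`, `agentName`, and `roundTrip` are hypothetical names, and `futureField` is an invented field used to show that content the operator does not know about survives unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NexusAgentSpec keeps the embedded Agent document as raw bytes, so fields
// the operator does not model are carried through untouched.
type NexusAgentSpec struct {
	AgentSpec json.RawMessage `json:"agentSpec"`
}

// parseAgentSpec decodes the outer spec without interpreting agentSpec.
func parseAgentSpec(in string) (NexusAgentSpec, error) {
	var spec NexusAgentSpec
	err := json.Unmarshal([]byte(in), &spec)
	return spec, err
}

// agentName reads only metadata.name from the embedded document.
func agentName(spec NexusAgentSpec) (string, error) {
	var head struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
	}
	if err := json.Unmarshal(spec.AgentSpec, &head); err != nil {
		return "", err
	}
	return head.Metadata.Name, nil
}

// roundTrip decodes and re-encodes a spec to show the embedded document survives.
func roundTrip(in string) (string, error) {
	spec, err := parseAgentSpec(in)
	if err != nil {
		return "", err
	}
	out, err := json.Marshal(spec)
	return string(out), err
}

func main() {
	in := `{"agentSpec":{"apiVersion":"astromesh/v1","kind":"Agent",` +
		`"metadata":{"name":"support-bot"},"futureField":{"x":1}}}`
	spec, err := parseAgentSpec(in)
	if err != nil {
		panic(err)
	}
	name, err := agentName(spec)
	if err != nil {
		panic(err)
	}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
	fmt.Println(out == in)
}
```

Decoding only the fields that are needed, and forwarding the raw document as-is, is what lets the Agent schema evolve without changes to the CRD.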

    apiVersion: nexus.astromesh.io/v1alpha1
    kind: NexusAgent
    metadata:
      name: support-bot
      namespace: tenant-dev
    spec:
      agentSpec:
        apiVersion: astromesh/v1
        kind: Agent
        metadata: { name: support-bot, version: "1.0.0" }
        spec:
          identity: { display_name: "Support Bot" }
          model:
            primary: { provider: ollama, model: "llama3.1:8b" }
          orchestration: { pattern: react }

State Machines
Tenant phases

    Pending ──> Provisioning ──> Ready
                     │
                     └──> Error ──(requeue 30s)──> Provisioning

Agent phases

    Pending ──> Deploying ──> Running
                    │
                    └──> Failed ──(requeue 30s)──> Deploying

    (Stopped is a terminal/manual state)

What’s Next
- Quickstart: bootstrap on Kind and deploy an agent.
- API Reference: endpoints and auth.