# Helm / Kubernetes
This guide covers deploying Astromesh on Kubernetes using the official Helm chart. The chart packages the Astromesh runtime with optional PostgreSQL, Redis, Ollama, vLLM, HuggingFace TEI, and a full observability stack.
## What and Why

The Helm chart provides a production-grade Kubernetes deployment with:
- Infrastructure as subcharts — PostgreSQL, Redis, and Ollama can be deployed alongside Astromesh or replaced with external managed services
- Model serving — optional vLLM and HuggingFace TEI deployments with GPU scheduling
- Observability — Prometheus, Grafana (via kube-prometheus-stack), and OpenTelemetry Collector
- Configuration as code — Astromesh YAML config (runtime, providers, agents, channels) defined inline in `values.yaml`
- Environment profiles — pre-built `values-dev.yaml`, `values-staging.yaml`, and `values-prod.yaml`
- External Secrets — optional ESO integration for AWS, GCP, and Vault secret management
- CRDs — Kubernetes-native Agent, Provider, Channel, and RAGPipeline resources
## Prerequisites

| Requirement | Version | Check command |
|---|---|---|
| Kubernetes | 1.26+ | `kubectl version` |
| Helm | 3.12+ | `helm version` |
| kubectl | configured | `kubectl cluster-info` |
| Container registry | accessible | varies |
## Chart Overview

### Structure

```
deploy/helm/astromesh/
├── Chart.yaml              # Chart metadata and dependencies
├── values.yaml             # Default configuration
├── values-dev.yaml         # Development overrides
├── values-staging.yaml     # Staging overrides
├── values-prod.yaml        # Production overrides
├── crds/                   # Custom Resource Definitions
│   ├── agent-crd.yaml
│   ├── provider-crd.yaml
│   ├── channel-crd.yaml
│   └── ragpipeline-crd.yaml
└── templates/
    ├── _helpers.tpl            # Template helpers
    ├── deployment.yaml         # Astromesh API deployment
    ├── service.yaml            # ClusterIP service
    ├── ingress.yaml            # Optional ingress
    ├── hpa.yaml                # Horizontal Pod Autoscaler
    ├── configmap-runtime.yaml
    ├── configmap-providers.yaml
    ├── configmap-channels.yaml
    ├── configmap-agents.yaml
    ├── secret.yaml
    ├── serviceaccount.yaml
    ├── deployment-vllm.yaml
    ├── service-vllm.yaml
    ├── deployment-tei.yaml
    ├── service-tei.yaml
    ├── external-secret.yaml
    ├── secret-store.yaml
    └── NOTES.txt
```

### Dependencies

| Subchart | Repository | Default | Purpose |
|---|---|---|---|
| `postgresql` | Bitnami | enabled | Episodic memory, pgvector semantic search |
| `redis` | Bitnami | enabled | Conversational memory cache |
| `ollama` | ollama-helm | disabled | Local model serving |
| `kube-prometheus-stack` | prometheus-community | disabled | Prometheus + Grafana + AlertManager |
| `opentelemetry-collector` | open-telemetry | disabled | Trace and metric collection |
## Quick Start

### 1. Add dependency repositories and update

```shell
cd deploy/helm/astromesh
helm dependency update
```

Expected output:

```
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 5 charts
Downloading postgresql from repo https://charts.bitnami.com/bitnami
Downloading redis from repo https://charts.bitnami.com/bitnami
Downloading ollama from repo https://otwld.github.io/ollama-helm
Deleting outdated charts
```

### 2. Install with dev values
```shell
helm install astromesh ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-dev.yaml \
  --namespace astromesh \
  --create-namespace
```

Expected output:

```
NAME: astromesh
LAST DEPLOYED: Mon Mar 9 10:00:00 2026
NAMESPACE: astromesh
STATUS: deployed
REVISION: 1
NOTES:
Astromesh has been deployed!

  API endpoint: http://astromesh.astromesh.svc:8000
  Health check: kubectl port-forward svc/astromesh 8000:8000

  To verify: curl http://localhost:8000/health
```

### 3. Verify pods
```shell
kubectl get pods -n astromesh
```

Expected output:

```
NAME                        READY   STATUS    RESTARTS   AGE
astromesh-5d8f9c7b6-x2k4m   1/1     Running   0          60s
astromesh-postgresql-0      1/1     Running   0          60s
astromesh-redis-master-0    1/1     Running   0          60s
```

### 4. Test the API
```shell
kubectl port-forward svc/astromesh 8000:8000 -n astromesh
```

In another terminal:

```shell
curl http://localhost:8000/health
```

Expected output:

```json
{
  "status": "healthy",
  "version": "0.10.0"
}
```

## Configuration

### Inline config in values.yaml

Astromesh configuration files are defined inline in `values.yaml` and mounted as ConfigMaps:
```yaml
config:
  runtime: |
    apiVersion: astromesh/v1
    kind: RuntimeConfig
    metadata:
      name: default
    spec:
      api:
        host: "0.0.0.0"
        port: 8000
      defaults:
        orchestration:
          pattern: react
          max_iterations: 10
```
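The `react` pattern with a `max_iterations` cap bounds the agent's reason-act loop. A minimal Python sketch of that control flow, purely illustrative and not the Astromesh runtime (`react_loop` and `fake_step` are hypothetical names):

```python
from typing import Callable

def react_loop(step: Callable[[str], tuple[str, bool]], query: str,
               max_iterations: int = 10) -> str:
    """Run a reason-act loop: each step returns (output, done).
    Stops when the step signals completion or the iteration cap is hit."""
    context = query
    for _ in range(max_iterations):
        output, done = step(context)
        if done:
            return output
        context = output  # feed the step's observation back in
    return context  # cap reached: return best effort

# Stubbed "model" that finishes on the third step.
calls = {"n": 0}
def fake_step(ctx: str) -> tuple[str, bool]:
    calls["n"] += 1
    return (f"step-{calls['n']}", calls["n"] == 3)

print(react_loop(fake_step, "hello", max_iterations=10))  # step-3
```

The cap matters: if the stub never signals `done` within `max_iterations`, the loop returns whatever it has rather than spinning forever.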
```yaml
providers: |
  apiVersion: astromesh/v1
  kind: ProviderConfig
  metadata:
    name: default-providers
  spec:
    providers:
      ollama:
        type: ollama
        endpoint: "http://ollama:11434"
        models:
          - "llama3.1:8b"
        health_check_interval: 30
      openai:
        type: openai_compat
        endpoint: "https://api.openai.com/v1"
        api_key_env: OPENAI_API_KEY
        models:
          - "gpt-4o"
          - "gpt-4o-mini"
    routing:
      default_strategy: cost_optimized
      fallback_enabled: true
      circuit_breaker:
        failure_threshold: 3
        recovery_timeout: 60
```
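The `circuit_breaker` settings follow the standard breaker pattern: after `failure_threshold` consecutive failures a provider is skipped (failing fast to the fallback) until `recovery_timeout` seconds pass. A minimal sketch of that logic, illustrative only and not Astromesh's implementation:

```python
import time

class CircuitBreaker:
    """Trip open after `failure_threshold` consecutive failures;
    allow a probe request once `recovery_timeout` seconds have passed."""

    def __init__(self, failure_threshold: int = 3, recovery_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at: float | None = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        if time.monotonic() - self.opened_at >= self.recovery_timeout:
            return True  # half-open: let one probe through
        return False     # open: fail fast, route to fallback provider

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

cb = CircuitBreaker(failure_threshold=3, recovery_timeout=60.0)
for _ in range(3):
    cb.record_failure()
print(cb.allow_request())  # False: breaker is open
```

With `fallback_enabled: true`, requests rejected while the breaker is open would be routed to the next provider in the routing order.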
```yaml
channels: |
  channels:
    whatsapp:
      verify_token: "${WHATSAPP_VERIFY_TOKEN}"
      access_token: "${WHATSAPP_ACCESS_TOKEN}"
      phone_number_id: "${WHATSAPP_PHONE_NUMBER_ID}"
      app_secret: "${WHATSAPP_APP_SECRET}"
      default_agent: "whatsapp-assistant"
```
```yaml
agents:
  support-agent.agent.yaml: |
    apiVersion: astromesh/v1
    kind: Agent
    metadata:
      name: support-agent
    spec:
      identity:
        display_name: "Support Agent"
      model:
        primary:
          provider: ollama
          model: llama3.1:8b
      orchestration:
        pattern: react
        max_iterations: 5
```

Each section becomes a ConfigMap mounted at `/app/config/` inside the pod.
## Secrets

### Development: inline values

For development, set secret values directly in your values file:

```yaml
secrets:
  create: true
  values:
    OPENAI_API_KEY: "sk-dev-key-here"
    WHATSAPP_VERIFY_TOKEN: "dev-verify-token"
    WHATSAPP_ACCESS_TOKEN: "dev-access-token"
```

### Production: existing Secret
For production, create a Kubernetes Secret separately and reference it:

```shell
kubectl create secret generic astromesh-secrets \
  --from-literal=OPENAI_API_KEY="sk-prod-..." \
  --from-literal=WHATSAPP_ACCESS_TOKEN="EAAx..." \
  -n astromesh
```

Then in values:

```yaml
secrets:
  create: false
  existingSecret: "astromesh-secrets"
```

## External Database
To use an external PostgreSQL instance (e.g., AWS RDS) instead of the subchart:

```yaml
postgresql:
  enabled: false

externalDatabase:
  host: "your-rds-instance.region.rds.amazonaws.com"
  port: "5432"
  database: "astromesh"
  username: "astromesh"
  existingSecret: "astromesh-db-credentials"  # Secret with key DATABASE_PASSWORD
```

Create the credentials secret:

```shell
kubectl create secret generic astromesh-db-credentials \
  --from-literal=DATABASE_PASSWORD="your-rds-password" \
  -n astromesh
```

## External Redis
To use an external Redis instance (e.g., AWS ElastiCache):

```yaml
redis:
  enabled: false

externalRedis:
  host: "your-elasticache.region.cache.amazonaws.com"
  port: "6379"
  existingSecret: "astromesh-redis-credentials"  # Secret with key REDIS_PASSWORD
```

## Model Serving
### vLLM (GPU inference server)

Deploy vLLM for high-throughput, GPU-accelerated inference:

```yaml
vllm:
  enabled: true
  model: "mistralai/Mistral-7B-Instruct-v0.3"
  extraArgs:
    - "--max-model-len"
    - "4096"
  resources:
    requests:
      nvidia.com/gpu: "1"
    limits:
      nvidia.com/gpu: "1"
  nodeSelector:
    gpu: "true"
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
```

If the model is gated (requires HuggingFace authentication):

```yaml
vllm:
  huggingfaceToken: "hf_..."
  # Or reference an existing secret:
  # existingSecret: "hf-token"
```

### HuggingFace TEI (embeddings and reranking)
Deploy Text Embeddings Inference for semantic search:

```yaml
tei:
  enabled: true
  instances:
    - name: embeddings
      modelId: "BAAI/bge-small-en-v1.5"
      port: 8002
      resources:
        limits:
          nvidia.com/gpu: "1"
      nodeSelector:
        gpu: "true"
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
    - name: reranker
      modelId: "BAAI/bge-reranker-base"
      port: 8003
      resources:
        limits:
          nvidia.com/gpu: "1"
      nodeSelector:
        gpu: "true"
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```

Each `instances[]` entry creates a separate Deployment and Service.
## GPU Scheduling

GPU workloads (vLLM, TEI) need to be scheduled on nodes with GPUs. Configure per-service:

```yaml
vllm:
  nodeSelector:
    gpu: "true"              # Only schedule on GPU-labeled nodes
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule     # Tolerate GPU node taints
  resources:
    requests:
      nvidia.com/gpu: "1"    # Request 1 GPU
    limits:
      nvidia.com/gpu: "1"    # Limit to 1 GPU
```

Label your GPU nodes:

```shell
kubectl label node gpu-node-1 gpu=true
```

Taint GPU nodes to prevent non-GPU workloads:

```shell
kubectl taint nodes gpu-node-1 nvidia.com/gpu=:NoSchedule
```

## Observability
### Prometheus annotations

Enabled by default. The Astromesh Service gets Prometheus scrape annotations:

```yaml
observability:
  prometheus:
    enabled: true  # Adds prometheus.io/scrape: "true" to the Service
```

### OpenTelemetry (manual endpoint)

Point Astromesh to an existing OTel collector:

```yaml
observability:
  otel:
    enabled: true
    endpoint: "http://otel-collector.monitoring:4317"
```

### OpenTelemetry (subchart auto-wired)

Deploy an OTel Collector alongside Astromesh. The endpoint is auto-resolved:

```yaml
opentelemetry-collector:
  enabled: true
  mode: deployment
  config:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    exporters:
      debug: {}
      prometheus:
        endpoint: 0.0.0.0:8889
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]
        metrics:
          receivers: [otlp]
          exporters: [prometheus]
```

### Full observability stack
Enable kube-prometheus-stack for Prometheus, Grafana, and AlertManager:

```yaml
kube-prometheus-stack:
  enabled: true
  prometheus:
    prometheusSpec:
      serviceMonitorSelectorNilUsesHelmValues: false
  grafana:
    adminPassword: admin

opentelemetry-collector:
  enabled: true

observability:
  prometheus:
    enabled: true
  otel:
    enabled: true
```

Access Grafana:

```shell
kubectl port-forward svc/astromesh-grafana 3000:80 -n astromesh
```

Open http://localhost:3000 (admin/admin).
## Ingress

### nginx + cert-manager example

```yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
  hosts:
    - host: astromesh.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: astromesh-tls
      hosts:
        - astromesh.example.com
```

Verify:

```shell
kubectl get ingress -n astromesh
```

Expected output:

```
NAME        CLASS   HOSTS                   ADDRESS        PORTS     AGE
astromesh   nginx   astromesh.example.com   203.0.113.10   80, 443   5m
```

## Autoscaling
Enable the Horizontal Pod Autoscaler:

```yaml
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```

Verify:

```shell
kubectl get hpa -n astromesh
```

Expected output:

```
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
astromesh   Deployment/astromesh   35%/70%   2         10        2          5m
```
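The HPA's scaling decision follows the standard Kubernetes formula, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured min/max. A quick Python illustration using the values above (target 70%, min 2, max 10):

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Standard HPA scaling formula, clamped to the configured bounds."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(2, 35, 70, 2, 10))   # 2: raw result is 1, clamped up to minReplicas
print(desired_replicas(2, 140, 70, 2, 10))  # 4: load doubled, replicas double
print(desired_replicas(4, 700, 70, 2, 10))  # 10: raw result is 40, clamped to maxReplicas
```

This is why the `35%/70%` target in the output above produces no scaling: the computed count (1) is below `minReplicas`.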
## Environment Profiles

The chart ships with three environment profiles:
| Setting | Dev | Staging | Prod |
|---|---|---|---|
| Replicas | 1 | 2 | 3 (HPA 3-10) |
| PostgreSQL | Subchart | Subchart | External (RDS) |
| Redis | Subchart | Subchart | External (ElastiCache) |
| Ollama | Subchart | — | — |
| vLLM | Enabled (no GPU limits) | Enabled (GPU) | Enabled (GPU) |
| TEI | Embeddings only | Embeddings | Embeddings + Reranker |
| Observability | Full stack (subcharts) | OTel only | OTel to external |
| Ingress | Disabled | Enabled | Enabled + TLS |
| Secrets | Inline | Inline | existingSecret |
| Resources | 250m/512Mi | 500m/1Gi | 1/2Gi - 4/4Gi |
Install with a specific profile:

```shell
# Development
helm install astromesh ./deploy/helm/astromesh -f deploy/helm/astromesh/values-dev.yaml -n astromesh-dev --create-namespace

# Staging
helm install astromesh ./deploy/helm/astromesh -f deploy/helm/astromesh/values-staging.yaml -n astromesh-staging --create-namespace

# Production
helm install astromesh ./deploy/helm/astromesh -f deploy/helm/astromesh/values-prod.yaml -n astromesh-prod --create-namespace
```

## External Secrets (ESO)
For production secret management, use the External Secrets Operator (ESO) to sync secrets from AWS Secrets Manager, GCP Secret Manager, or HashiCorp Vault.

Prerequisite: ESO must be installed in your cluster. It is a cluster-level operator, not deployed per-application.

### SecretStore setup

```yaml
externalSecrets:
  enabled: true
  refreshInterval: 1h
  secretStore:
    enabled: true
    kind: SecretStore
    provider:
      aws:
        service: SecretsManager
        region: us-east-1
        auth:
          secretRef:
            accessKeyIDSecretRef:
              name: aws-credentials
              key: access-key-id
            secretAccessKeySecretRef:
              name: aws-credentials
              key: secret-access-key
```

### ExternalSecret keys

```yaml
externalSecrets:
  keys:
    - secretKey: OPENAI_API_KEY
      remoteRef:
        key: astromesh/openai
        property: api_key
    - secretKey: WHATSAPP_ACCESS_TOKEN
      remoteRef:
        key: astromesh/whatsapp
        property: access_token
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: astromesh/database
        property: password
```

When `externalSecrets.enabled=true`, the ExternalSecret resource creates a Kubernetes Secret with the same name as the Astromesh release. Set `secrets.create: false` to avoid conflicts with the inline secret.
### Provider examples

AWS Secrets Manager:

```yaml
provider:
  aws:
    service: SecretsManager
    region: us-east-1
    auth:
      secretRef:
        accessKeyIDSecretRef:
          name: aws-credentials
          key: access-key-id
        secretAccessKeySecretRef:
          name: aws-credentials
          key: secret-access-key
```

GCP Secret Manager:

```yaml
provider:
  gcpsm:
    projectID: my-gcp-project
    auth:
      secretRef:
        secretAccessKeySecretRef:
          name: gcp-credentials
          key: secret-access-credentials
```

HashiCorp Vault:

```yaml
provider:
  vault:
    server: https://vault.example.com
    path: secret
    version: v2
    auth:
      kubernetes:
        mountPath: kubernetes
        role: astromesh
```

## CRDs

The chart installs four Custom Resource Definitions for Kubernetes-native agent management:
| CRD | Group | Kind | Scope |
|---|---|---|---|
| `agents.astromesh.io` | astromesh.io | Agent | Namespaced |
| `providers.astromesh.io` | astromesh.io | Provider | Namespaced |
| `channels.astromesh.io` | astromesh.io | Channel | Namespaced |
| `ragpipelines.astromesh.io` | astromesh.io | RAGPipeline | Namespaced |
All CRDs use API version `v1alpha1` with a status subresource.
### Create an Agent via kubectl

```shell
kubectl apply -f - <<EOF
apiVersion: astromesh.io/v1alpha1
kind: Agent
metadata:
  name: support-agent
  namespace: astromesh
spec:
  identity:
    display_name: "Support Agent"
    description: "Handles customer support queries"
  model:
    primary:
      provider: ollama
      model: llama3.1:8b
  orchestration:
    pattern: react
    max_iterations: 5
EOF
```

Expected output:

```
agent.astromesh.io/support-agent created
```

### List agents
```shell
kubectl get agents -n astromesh
```

Expected output:

```
NAME            AGE
support-agent   10s
```

### Create a Provider
```shell
kubectl apply -f - <<EOF
apiVersion: astromesh.io/v1alpha1
kind: Provider
metadata:
  name: openai
  namespace: astromesh
spec:
  type: openai_compat
  endpoint: "https://api.openai.com/v1"
  api_key_env: OPENAI_API_KEY
  models:
    - gpt-4o
    - gpt-4o-mini
EOF
```

### Create a Channel
```shell
kubectl apply -f - <<EOF
apiVersion: astromesh.io/v1alpha1
kind: Channel
metadata:
  name: whatsapp
  namespace: astromesh
spec:
  type: whatsapp
  default_agent: support-agent
  rate_limit:
    window_seconds: 60
    max_messages: 30
EOF
```
### Create a RAGPipeline

```shell
kubectl apply -f - <<EOF
apiVersion: astromesh.io/v1alpha1
kind: RAGPipeline
metadata:
  name: docs-search
  namespace: astromesh
spec:
  embeddings:
    provider: tei
    endpoint: "http://astromesh-tei-embeddings:8002"
  vector_store:
    type: pgvector
  chunking:
    strategy: recursive
    chunk_size: 512
    overlap: 50
EOF
```

Note: CRD definitions are installed with the chart, but the reconciliation controller is a separate project. Currently, CRDs serve as documentation of intent and can be used by external automation.
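The `chunking` settings above slide a fixed-size window with overlap across each document. A simplified character-based illustration in Python (the recursive strategy also splits on separators such as paragraphs; this sketch shows only the size/overlap mechanics):

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into windows of `chunk_size` characters, each window
    starting `chunk_size - overlap` characters after the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last window already covers the end of the text
    return chunks

doc = "x" * 1000
chunks = chunk_text(doc, chunk_size=512, overlap=50)
print([len(c) for c in chunks])  # [512, 512, 76]
```

The overlap ensures a sentence falling on a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated embedding work.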
## Useful Commands

### Install

```shell
helm install astromesh ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-dev.yaml \
  -n astromesh --create-namespace
```

### Upgrade
```shell
helm upgrade astromesh ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-prod.yaml \
  -n astromesh
```

### Dry run (preview changes)
```shell
helm upgrade astromesh ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-prod.yaml \
  -n astromesh --dry-run --debug
```

### Template rendering (no cluster needed)
```shell
helm template astromesh ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-dev.yaml
```

Lint the chart:

```shell
helm lint ./deploy/helm/astromesh \
  -f deploy/helm/astromesh/values-prod.yaml
```

Expected output:

```
==> Linting ./deploy/helm/astromesh
[INFO] Chart.yaml: icon is recommended

1 chart(s) linted, 0 chart(s) failed
```

### Uninstall
```shell
helm uninstall astromesh -n astromesh
```

Note: CRDs are not removed on uninstall (Helm convention). Remove manually if needed:

```shell
kubectl delete crd agents.astromesh.io providers.astromesh.io channels.astromesh.io ragpipelines.astromesh.io
```

### View installed values
```shell
helm get values astromesh -n astromesh
```

### View all resources

```shell
kubectl get all -n astromesh
```

Expected output:
```
NAME                            READY   STATUS    RESTARTS   AGE
pod/astromesh-5d8f9c7b6-x2k4m   1/1     Running   0          5m
pod/astromesh-postgresql-0      1/1     Running   0          5m
pod/astromesh-redis-master-0    1/1     Running   0          5m

NAME                             TYPE        CLUSTER-IP    PORT(S)
service/astromesh                ClusterIP   10.96.100.1   8000/TCP
service/astromesh-postgresql     ClusterIP   10.96.100.2   5432/TCP
service/astromesh-redis-master   ClusterIP   10.96.100.3   6379/TCP

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/astromesh   1/1     1            1           5m
```

## Production Checklist
Before going to production, verify the following:

- Secrets: Using `existingSecret` or External Secrets, not inline values
- Database: External managed PostgreSQL (RDS, Cloud SQL) with backups
- Redis: External managed Redis (ElastiCache, Memorystore) or Redis with persistence
- Ingress: TLS enabled with valid certificate (cert-manager or manual)
- Resources: CPU and memory requests/limits set for all workloads
- Autoscaling: HPA enabled with appropriate min/max replicas
- Observability: OTel tracing and Prometheus metrics enabled
- GPU: nodeSelector and tolerations set for GPU workloads
- Network policies: Restrict traffic between namespaces if required
- RBAC: ServiceAccount with minimal permissions
- Image: Using a specific image tag, not `latest`
- Backups: Database backup schedule configured