Deploying Gaby on Kubernetes

Run Gaby inside your cluster so it reaches your in-cluster systems (Postgres, Keycloak, internal APIs) over cluster DNS: no VPN, no egress, and data never leaves your perimeter. This is the intended production topology, not a workaround.

Raw manifests live in ops/k8s/. This guide uses in-cluster Postgres as the main store and a single-replica backend (API + ticket poller + investigation consumer in one process). Helm packaging is a v0.4 item; raw YAML works today.

Architecture in-cluster

        Ingress (gaby.internal.example.com)
              │
        gaby-web  (Deployment, nginx + SPA, scalable)
              │  proxies /api, /health, /ready, /metrics
      gaby-backend  (Deployment, replicas=1)
        ├─ API
        ├─ ticket poller        ──▶ your help desk
        ├─ investigation consumer
        │     └─ RealToolDispatcher ─▶ MCP connectors ─▶ your Postgres / Keycloak (cluster DNS)
        ├─ PVC  /var/lib/gaby   (SQLite memory graph + bootstrap.url)
        └─ GABY_DB_URL ─▶ in-cluster Postgres (main store)

Single backend replica is deliberate

The ticket poller is not deduplicated across pods, so two backend replicas would double-poll your ticket sources. The investigation consumer's claim store uses SELECT … FOR UPDATE SKIP LOCKED, so the consumer side is safe to run concurrently, but the poller is not. Keep gaby-backend at replicas: 1 and scale the stateless gaby-web freely. (A split API + 1-replica worker topology is a v0.4 enhancement.)
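A deploy script can enforce the single-replica rule with a small guard. This is a minimal sketch: the function name assert_single_backend is hypothetical, and it assumes the Deployment is called gaby-backend in the gaby namespace, as in the commands used elsewhere in this guide.

```shell
# Hypothetical helper: fail if gaby-backend is configured with more than one replica.
# Assumes kubectl is on PATH and the current context points at the right cluster.
assert_single_backend() {
  local replicas
  replicas="$(kubectl -n gaby get deploy gaby-backend -o jsonpath='{.spec.replicas}')" || return 2
  if [ "${replicas:-0}" -gt 1 ]; then
    echo "gaby-backend has replicas=$replicas; the ticket poller would double-poll" >&2
    return 1
  fi
  echo "gaby-backend replicas=${replicas:-0} (ok)"
}

# Usage (for example at the end of a deploy script):
#   assert_single_backend || exit 1
```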

Prerequisites

  • A Kubernetes cluster + kubectl context
  • An ingress controller (examples assume ingress-nginx) + a way to issue TLS (cert-manager or a pre-made gaby-tls secret)
  • In-cluster Postgres with a gaby database + role (below)
  • A container registry you can push to (GHCR images aren't published until v0.3.0 is tagged)
  • An ANTHROPIC_API_KEY

1. Build + push the images

Until v0.3.0 is tagged (which triggers release.yml to publish to ghcr.io/sky-cloak/gaby/{backend,web}), build them yourself:

REG=registry.internal.example.com/gaby   # your registry

docker build -f ops/docker/Dockerfile.backend \
  --build-arg GABY_VERSION=0.3.0 \
  -t "$REG/gaby-backend:0.3.0" .

docker build -f ops/docker/Dockerfile.web \
  -t "$REG/gaby-web:0.3.0" .

docker push "$REG/gaby-backend:0.3.0"
docker push "$REG/gaby-web:0.3.0"

Then set both image: fields in ops/k8s/backend.yaml and ops/k8s/web.yaml to your registry path.
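If you would rather script that edit than do it by hand, something like the following may work. It is a sketch under one assumption: each manifest contains a single image: line. The helper name set_image is hypothetical.

```shell
# Hypothetical helper: point the image: field of a manifest at a new reference.
# It rewrites every line of the form "image: ..." in the file, so review the diff
# if a manifest ever gains a second container.
set_image() {
  local file="$1" ref="$2" tmp rc
  tmp="$(mktemp)" || return 1
  # Keep the indentation and the "image:" key, replace only the value.
  sed -E "s#^([[:space:]]*(- )?image:[[:space:]]*).*#\1${ref}#" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Usage, with REG set as in the build commands above:
#   set_image ops/k8s/backend.yaml "$REG/gaby-backend:0.3.0"
#   set_image ops/k8s/web.yaml     "$REG/gaby-web:0.3.0"
```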

2. Provision the Postgres database

Gaby runs its own schema migrations at boot (schema_sync), but it won't CREATE DATABASE. Create the database + role on your cluster Postgres:

CREATE ROLE gaby LOGIN PASSWORD 'a-strong-password';
CREATE DATABASE gaby OWNER gaby;

Note the in-cluster host and port for the DSN, e.g. postgres.db.svc.cluster.local:5432. The DSN's driver prefix must be postgresql+asyncpg://.
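Passwords containing characters such as @, / or : can break a hand-assembled DSN. The sketch below builds the value with the required prefix and percent-encodes the credentials. It assumes the DSN is parsed as a standard URL and that the credentials are ASCII; the helper names urlencode and gaby_db_url are hypothetical.

```shell
# Percent-encode a string for the user:password part of a URL (ASCII input assumed).
urlencode() {
  local s="$1" out="" c rest
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    s="$rest"
    case "$c" in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s' "$out"
}

# Hypothetical helper: assemble a DSN with the postgresql+asyncpg:// prefix.
# Arguments: user, password, host:port, database.
gaby_db_url() {
  printf 'postgresql+asyncpg://%s:%s@%s/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4"
}

# Usage:
#   gaby_db_url gaby 'a-strong-password' postgres.db.svc.cluster.local:5432 gaby
```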

Memory graph backend

The manifests default GABY_MEMORY_BACKEND=sqlite (memory graph on the data PVC) so you don't need the Apache AGE extension. To run graph-native memory in Postgres, set GABY_MEMORY_BACKEND=postgres-age + GABY_MEMORY_POSTGRES_DSN against an AGE-enabled database. The main relational store (GABY_DB_URL) is Postgres either way.
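A quick preflight can catch a half-configured memory backend before you apply anything. This sketch assumes sqlite and postgres-age are the only accepted values (they are the only ones named here) and treats an unset variable as sqlite, matching the manifest default. The function name check_memory_backend is hypothetical.

```shell
# Hypothetical preflight: check that the memory-backend settings are consistent.
# Reads GABY_MEMORY_BACKEND and GABY_MEMORY_POSTGRES_DSN from the environment.
check_memory_backend() {
  case "${GABY_MEMORY_BACKEND:-sqlite}" in
    sqlite)
      return 0 ;;
    postgres-age)
      [ -n "${GABY_MEMORY_POSTGRES_DSN:-}" ] && return 0
      echo "GABY_MEMORY_BACKEND=postgres-age requires GABY_MEMORY_POSTGRES_DSN" >&2
      return 1 ;;
    *)
      echo "unrecognised GABY_MEMORY_BACKEND: $GABY_MEMORY_BACKEND" >&2
      return 1 ;;
  esac
}
```

Export the values you plan to use, then run check_memory_backend before editing the manifests.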

3. Create the namespace + secrets

kubectl apply -f ops/k8s/namespace.yaml

kubectl -n gaby create secret generic gaby-secrets \
  --from-literal=ANTHROPIC_API_KEY="sk-ant-..." \
  --from-literal=GABY_SESSION_SECRET="$(openssl rand -hex 32)" \
  --from-literal=GABY_DB_URL='postgresql+asyncpg://gaby:a-strong-password@postgres.db.svc.cluster.local:5432/gaby'

GABY_SESSION_SECRET is the envelope-encryption key for connector configs and escalation-channel credentials at rest. Set it once and keep it stable: rotating it invalidates every stored secret.

(Or copy ops/k8s/secret.example.yaml to secret.yaml, fill it in, and kubectl apply -f it. secret.yaml is gitignored.)
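Because GABY_SESSION_SECRET has to stay stable, a setup script that calls openssl rand on every run would rotate it silently. One way around that is to generate the value once and reuse it afterwards. The sketch below does this with a local key file; the helper name session_secret and the file path are hypothetical, and the file stands in for wherever you actually keep secrets.

```shell
# Hypothetical helper: print the session secret stored in a key file,
# generating it only if the file is missing or empty.
# Later runs reuse the same value, so re-running setup never rotates the key.
session_secret() {
  local keyfile="$1"
  if [ ! -s "$keyfile" ]; then
    # umask 077 keeps the new file readable by the current user only.
    ( umask 077 && openssl rand -hex 32 > "$keyfile" ) || return 1
  fi
  cat "$keyfile"
}

# Usage in the kubectl create secret command:
#   --from-literal=GABY_SESSION_SECRET="$(session_secret "$HOME/.gaby-session.key")"
```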

4. Edit config + apply

In ops/k8s/configmap.yaml, set GABY_PUBLIC_WEB_URL to your ingress host (so the first-run bootstrap link is reachable). In ops/k8s/ingress.yaml, set the host + TLS secret. Then:

kubectl apply -f ops/k8s/configmap.yaml
kubectl apply -f ops/k8s/backend.yaml
kubectl apply -f ops/k8s/web.yaml
kubectl apply -f ops/k8s/ingress.yaml
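Before moving on, it can help to wait until both Deployments have finished rolling out. A minimal sketch, assuming the Deployments are named gaby-backend and gaby-web in the gaby namespace; the function name wait_for_gaby and the 180s default are illustrative choices.

```shell
# Hypothetical helper: block until both Deployments report a completed rollout.
# Optional first argument is the per-Deployment timeout (default 180s).
wait_for_gaby() {
  local d
  for d in gaby-backend gaby-web; do
    kubectl -n gaby rollout status "deploy/$d" --timeout="${1:-180s}" || return 1
  done
}

# Usage:
#   wait_for_gaby 300s && echo "Gaby is up"
```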

5. Bootstrap the admin

kubectl -n gaby logs deploy/gaby-backend | grep first_run_bootstrap_minted

Open the printed URL (it uses GABY_PUBLIC_WEB_URL), create your admin account, and log in. Then confirm the consumer started:

kubectl -n gaby logs deploy/gaby-backend | grep investigation_consumer_started
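To pull just the bootstrap URL out of the log output, a small filter may be enough. The layout of the log line is an assumption here: the sketch relies only on the line containing first_run_bootstrap_minted and a URL that starts with http:// or https://. The function name bootstrap_url is hypothetical.

```shell
# Hypothetical helper: print the first URL found on a
# first_run_bootstrap_minted log line read from stdin.
bootstrap_url() {
  grep first_run_bootstrap_minted \
    | grep -oE 'https?://[^[:space:]"]+' \
    | head -n 1
}

# Usage:
#   kubectl -n gaby logs deploy/gaby-backend | bootstrap_url
```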

6. Connect in-cluster systems (the payoff)

Connectors can now point at cluster-internal hostnames, with no VPN. In Connectors → Add → PostgreSQL, use the in-cluster DSN:

postgresql://readonly:pw@postgres.db.svc.cluster.local:5432/app

Same for Keycloak (http://keycloak.identity.svc.cluster.local:8080), internal Redis, etc. Then follow "Your first investigation" to queue a ticket and watch Gaby investigate against your real systems.
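If a connector fails to connect, it is worth checking that the backend pod can reach the target over cluster DNS at all. The sketch below extracts host:port from a DSN or URL and opens a TCP connection from inside the pod. It assumes the backend image ships bash with /dev/tcp support, that credentials in the DSN are percent-encoded, and that the port is explicit. Both function names are hypothetical, and there is no timeout, so an unreachable host may take a while to fail.

```shell
# Hypothetical helper: reduce a DSN or URL to its host:port part.
dsn_hostport() {
  local rest="${1#*://}"   # drop the scheme
  rest="${rest%%/*}"       # drop /database (or path) and anything after it
  rest="${rest##*@}"       # drop user:password@ if present
  printf '%s\n' "$rest"
}

# Hypothetical helper: try a TCP connection from the backend pod to that host:port.
probe_from_backend() {
  local hp host port
  hp="$(dsn_hostport "$1")"
  host="${hp%:*}"
  port="${hp##*:}"
  kubectl -n gaby exec deploy/gaby-backend -- bash -c "exec 3<>/dev/tcp/$host/$port"
}

# Usage:
#   probe_from_backend 'postgresql://readonly:pw@postgres.db.svc.cluster.local:5432/app' \
#     && echo reachable
```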

Operations

| Task | Command |
|------|---------|
| Logs | kubectl -n gaby logs -f deploy/gaby-backend |
| Shell | kubectl -n gaby exec -it deploy/gaby-backend -- bash |
| Verify audit chain | kubectl -n gaby exec deploy/gaby-backend -- gaby audit verify |
| Restart backend | kubectl -n gaby rollout restart deploy/gaby-backend |
| Scale web | kubectl -n gaby scale deploy/gaby-web --replicas=3 |

Do not scale the backend

Running kubectl scale deploy/gaby-backend --replicas=N with N > 1 makes every replica poll your ticket sources independently (double-polling). Leave it at 1.

Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| Backend CrashLoopBackOff, logs mention DB | gaby database/role missing or DSN wrong | Create the DB (step 2); check the GABY_DB_URL secret |
| investigation_consumer_no_provider_disabled | ANTHROPIC_API_KEY not in the secret | Add it, then rollout restart deploy/gaby-backend |
| Bootstrap URL points at localhost | GABY_PUBLIC_WEB_URL not set | Set it in the ConfigMap to your ingress host, then restart |
| SPA loads but /api calls 502 | backend Service not named gaby-backend | The web image's nginx hardcodes gaby-backend:8080; keep the Service name |
| Pod stuck Pending on the PVC | No default StorageClass | Set storageClassName in backend.yaml's PVC |
| Inbox events arrive batched | Ingress buffering the SSE stream | nginx.ingress.kubernetes.io/proxy-buffering: "off" (already in the sample ingress) |
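The two consumer log markers mentioned in this guide can be checked in one pass. A minimal sketch that reads backend logs on stdin and reports which marker it saw; the function name consumer_status and the wording of its output are illustrative.

```shell
# Hypothetical helper: classify the investigation consumer's state from backend logs.
# Exit status: 0 started, 1 disabled (no provider key), 2 neither marker found.
consumer_status() {
  local logs
  logs="$(cat)"
  case "$logs" in
    *investigation_consumer_no_provider_disabled*)
      echo "disabled: ANTHROPIC_API_KEY missing from the secret"
      return 1 ;;
    *investigation_consumer_started*)
      echo "started" ;;
    *)
      echo "unknown: neither marker found in the logs"
      return 2 ;;
  esac
}

# Usage:
#   kubectl -n gaby logs deploy/gaby-backend | consumer_status
```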