Deploying Gaby on Kubernetes
Run Gaby inside your cluster so it reaches your in-cluster systems (Postgres, Keycloak, internal APIs) over cluster DNS: no VPN, no egress, and your data never leaves your perimeter. This is the intended production topology, not a workaround.
Raw manifests live in ops/k8s/.
This guide uses in-cluster Postgres as the main store and a
single-replica backend (API + ticket poller + investigation consumer
in one process). Helm packaging is a v0.4 item; raw YAML works today.
Architecture in-cluster
Ingress (gaby.internal.example.com)
   │
   ▼
gaby-web (Deployment, nginx + SPA, scalable)
   │  proxies /api, /health, /ready, /metrics
   ▼
gaby-backend (Deployment, replicas=1)
   ├─ API
   ├─ ticket poller ──▶ your help desk
   ├─ investigation consumer
   │    └─ RealToolDispatcher ─▶ MCP connectors ─▶ your Postgres / Keycloak (cluster DNS)
   ├─ PVC /var/lib/gaby (SQLite memory graph + bootstrap.url)
   └─ GABY_DB_URL ─▶ in-cluster Postgres (main store)
Single backend replica is deliberate
The ticket poller is not deduplicated across pods, so two backend replicas
would double-poll your ticket sources. The investigation consumer's claim
store is SELECT … FOR UPDATE SKIP LOCKED-safe, so the consumer side would
tolerate multiple replicas; the poller would not. Keep gaby-backend at
replicas: 1 and scale the stateless gaby-web freely. (A split API +
1-replica worker topology is a v0.4 enhancement.)
Prerequisites
- A Kubernetes cluster + kubectl context
- An ingress controller (examples assume ingress-nginx) + a way to issue TLS (cert-manager or a pre-made gaby-tls secret)
- In-cluster Postgres with a gaby database + role (below)
- A container registry you can push to (GHCR images aren't published until v0.3.0 is tagged)
- An ANTHROPIC_API_KEY
1. Build + push the images
Until v0.3.0 is tagged (which triggers release.yml to publish to
ghcr.io/sky-cloak/gaby/{backend,web}), build them yourself:
REG=registry.internal.example.com/gaby # your registry
docker build -f ops/docker/Dockerfile.backend \
  --build-arg GABY_VERSION=0.3.0 \
  -t "$REG/gaby-backend:0.3.0" .
docker build -f ops/docker/Dockerfile.web \
  -t "$REG/gaby-web:0.3.0" .
docker push "$REG/gaby-backend:0.3.0"
docker push "$REG/gaby-web:0.3.0"
Then set both image: fields in ops/k8s/backend.yaml and
ops/k8s/web.yaml to your registry path.
2. Provision the Postgres database
Gaby runs its own schema migrations at boot (schema_sync), but it won't
CREATE DATABASE. Create the database + role on your cluster Postgres:
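A minimal sketch of that step, assuming the role and the database are both named gaby (matching the DSN used in step 3) and that you can reach your Postgres with psql as a superuser. The helper name gaby_bootstrap_sql, the postgres superuser, and the host placeholder are illustrative, not part of Gaby.

```shell
# Hypothetical helper: prints the SQL that creates the gaby role and database.
# The password is supplied by the caller; single quotes in it are doubled so
# the SQL string literal stays valid.
gaby_bootstrap_sql() {
  pw=$(printf '%s' "$1" | sed "s/'/''/g")
  printf "CREATE ROLE gaby LOGIN PASSWORD '%s';\n" "$pw"
  printf "CREATE DATABASE gaby OWNER gaby;\n"
}

# Usage (superuser name and host are assumptions about your setup):
#   gaby_bootstrap_sql 'a-strong-password' \
#     | psql -h <postgres-host> -U postgres -v ON_ERROR_STOP=1
```

Piping the statements to psql runs each one in its own implicit transaction, which matters because CREATE DATABASE cannot run inside an explicit transaction block.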
Note the in-cluster host and port for the DSN, e.g. postgres.db.svc.cluster.local:5432. The
driver prefix must be postgresql+asyncpg://.
Memory graph backend
The manifests default to GABY_MEMORY_BACKEND=sqlite (memory graph on the
data PVC), so you don't need the Apache AGE extension. To run graph-native
memory in Postgres, set GABY_MEMORY_BACKEND=postgres-age and point
GABY_MEMORY_POSTGRES_DSN at an AGE-enabled database. The main
relational store (GABY_DB_URL) is Postgres either way.
3. Create the namespace + secrets
kubectl apply -f ops/k8s/namespace.yaml
kubectl -n gaby create secret generic gaby-secrets \
  --from-literal=ANTHROPIC_API_KEY="sk-ant-..." \
  --from-literal=GABY_SESSION_SECRET="$(openssl rand -hex 32)" \
  --from-literal=GABY_DB_URL='postgresql+asyncpg://gaby:a-strong-password@postgres.db.svc.cluster.local:5432/gaby'
GABY_SESSION_SECRET is the envelope-encryption key for connector configs
and escalation-channel credentials at rest. Set it once and keep it
stable: rotating it invalidates every stored secret.
(Or copy ops/k8s/secret.example.yaml → secret.yaml, fill it in, and
kubectl apply -f it. secret.yaml is gitignored.)
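Because a wrong driver prefix in GABY_DB_URL only shows up once the backend starts, a small pre-flight check before creating the secret can save a restart cycle. This is a sketch, not a Gaby tool; the function name gaby_check_dsn is hypothetical, and it checks only the postgresql+asyncpg:// prefix required in step 2.

```shell
# Hypothetical pre-flight check: succeeds only if the DSN starts with the
# postgresql+asyncpg:// driver prefix. It does not validate the rest of the URL.
gaby_check_dsn() {
  case "$1" in
    postgresql+asyncpg://*) return 0 ;;
    *) echo "GABY_DB_URL must start with postgresql+asyncpg://" >&2
       return 1 ;;
  esac
}

# Usage:
#   gaby_check_dsn "$GABY_DB_URL" && kubectl -n gaby create secret generic gaby-secrets ...
```

If the database password contains URL-reserved characters such as @ or /, percent-encode them before embedding the password in the DSN.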
4. Edit config + apply
In ops/k8s/configmap.yaml, set GABY_PUBLIC_WEB_URL to your ingress
host (so the first-run bootstrap link is reachable). In
ops/k8s/ingress.yaml, set the host + TLS secret. Then:
kubectl apply -f ops/k8s/configmap.yaml
kubectl apply -f ops/k8s/backend.yaml
kubectl apply -f ops/k8s/web.yaml
kubectl apply -f ops/k8s/ingress.yaml
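After applying, it helps to wait for both Deployments to finish rolling out before moving on to the bootstrap step. The sketch below assumes the Deployment names gaby-backend and gaby-web and the gaby namespace used elsewhere in this guide; the helper name and the 180-second timeout are arbitrary choices.

```shell
# KUBECTL is overridable so the helper can be dry-run (e.g. KUBECTL=echo).
KUBECTL="${KUBECTL:-kubectl}"

# Waits for both Deployments to finish rolling out; stops at the first failure
# or timeout and returns non-zero.
gaby_wait_rollout() {
  for d in gaby-backend gaby-web; do
    $KUBECTL rollout status "deploy/$d" -n gaby --timeout=180s || return 1
  done
}

# Usage:
#   gaby_wait_rollout && echo "Gaby is rolled out"
```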
5. Bootstrap the admin
Open the printed URL (it uses GABY_PUBLIC_WEB_URL), create your admin
account, log in. Confirm the consumer started:
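One way to check the consumer is to scan the backend logs. The exact log line that signals a healthy consumer isn't stated in this guide, so the sketch below only looks for the one consumer event the guide does name, investigation_consumer_no_provider_disabled (see Troubleshooting). Treating its absence as a good sign is an assumption, and the helper name is hypothetical.

```shell
# Reads backend logs on stdin and reports whether the "consumer disabled"
# event appears. Absence of the event is NOT proof the consumer is healthy;
# it only rules out the missing-API-key case.
gaby_consumer_status() {
  if grep -q 'investigation_consumer_no_provider_disabled'; then
    echo "disabled: ANTHROPIC_API_KEY missing from the secret"
    return 1
  fi
  echo "no disabled event found"
}

# Usage against a live cluster:
#   kubectl -n gaby logs deploy/gaby-backend | gaby_consumer_status
#
# Assumed location of the bootstrap link, based on the architecture diagram
# (the PVC at /var/lib/gaby holds bootstrap.url):
#   kubectl -n gaby exec deploy/gaby-backend -- cat /var/lib/gaby/bootstrap.url
```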
6. Connect in-cluster systems (the payoff)
Connectors can now point at cluster-internal hostnames, with no VPN. In Connectors → Add → PostgreSQL, use the in-cluster DSN:
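In-cluster hostnames follow the Kubernetes Service DNS pattern service.namespace.svc.cluster.local. As a sketch, the helper below (a hypothetical name, svc_fqdn) builds that hostname; it assumes the default cluster domain cluster.local, which your cluster may override.

```shell
# Builds the cluster-DNS name for a Service:
#   <service>.<namespace>.svc.cluster.local[:port]
# cluster.local is the Kubernetes default cluster domain; adjust if yours differs.
svc_fqdn() {
  printf '%s.%s.svc.cluster.local' "$1" "$2"
  [ -n "${3:-}" ] && printf ':%s' "$3"
  printf '\n'
}

# Example: svc_fqdn postgres db 5432
# Example DSN with placeholder credentials and database name (the URL scheme
# the connector form expects is an assumption):
#   postgresql://USER:PASSWORD@$(svc_fqdn postgres db 5432)/DBNAME
```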
Same for Keycloak (http://keycloak.identity.svc.cluster.local:8080),
internal Redis, etc. Then follow
Your first investigation to queue a ticket and
watch Gaby investigate against your real systems.
Operations
| Task | Command |
|---|---|
| Logs | kubectl -n gaby logs -f deploy/gaby-backend |
| Shell | kubectl -n gaby exec -it deploy/gaby-backend -- bash |
| Verify audit chain | kubectl -n gaby exec deploy/gaby-backend -- gaby audit verify |
| Restart backend | kubectl -n gaby rollout restart deploy/gaby-backend |
| Scale web | kubectl -n gaby scale deploy/gaby-web --replicas=3 |
Do not scale the backend
Running kubectl scale deploy/gaby-backend --replicas=N with N>1 makes the
backend double-poll your ticket sources. Leave it at 1.
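To catch an accidental scale-up (for example in a CI check or a cron job), you can assert the replica count. This is a sketch with a hypothetical helper name; it takes the count as an argument so the kubectl query stays separate from the check.

```shell
# Fails unless the given replica count is exactly 1.
gaby_assert_single_backend() {
  if [ "${1:-}" = "1" ]; then
    echo "ok: gaby-backend has 1 replica"
  else
    echo "gaby-backend has ${1:-?} replicas; expected 1 (poller would double-poll)" >&2
    return 1
  fi
}

# Usage:
#   gaby_assert_single_backend \
#     "$(kubectl -n gaby get deploy gaby-backend -o jsonpath='{.spec.replicas}')"
```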
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Backend CrashLoopBackOff, logs mention DB | gaby database/role missing or DSN wrong | Create the DB (step 2); check the GABY_DB_URL secret |
| investigation_consumer_no_provider_disabled | ANTHROPIC_API_KEY not in the secret | Add it, rollout restart deploy/gaby-backend |
| Bootstrap URL points at localhost | GABY_PUBLIC_WEB_URL not set | Set it in the ConfigMap to your ingress host, restart |
| SPA loads but /api calls 502 | backend Service not named gaby-backend | The web image's nginx hardcodes gaby-backend:8080; keep the Service name |
| Pod stuck Pending on the PVC | No default StorageClass | Set storageClassName in backend.yaml's PVC |
| Inbox events arrive batched | Ingress buffering the SSE stream | nginx.ingress.kubernetes.io/proxy-buffering: "off" (already in the sample ingress) |