SPEC¶
The product specification at SPEC.md is the source of truth for what
Gaby is for, who it serves, and what features each persona expects.
This page renders that file verbatim.
Product Spec — Gaby¶
Status: Draft v0.1 · Owner: Guilliano · Last updated: 2026-04-11
This document is the source of truth for what we're building. The existing landing page (index.html) is the marketing spec. The persona prototypes under personas/ are the visual / interaction spec. This document is the functional spec: what the product must do.
Name: Gaby. Decided. Used across the landing page, the persona prototypes, and all UI copy. This doc matches.
1. Vision — one sentence¶
An open-source AI teammate that installs into your own infrastructure, understands your systems and knowledge base, and resolves tier‑1 support tickets on its own — directly from your help desk, or through a human chat interface.
2. North star for every design decision¶
Every trade-off in this document is resolved against three rules, in order:
- Results for the user come first. If a feature sounds cool but doesn't shorten time‑to‑resolution for the person being helped, cut it.
- Plug and play, or it doesn't ship. The five‑minute install is the product. If a connector takes more than one screen of config, that's a bug.
- Flexible by default. No opinionated lock-in: any LLM, any help desk, any knowledge source, any deployment target. Opinionated defaults, yes; opinionated limits, no.
3. What the product actually does — end-to-end¶
┌─────────────────────────────────────────────────────────────────┐
│ 1. INSTALL Docker / Helm / single binary │
│ 2. CONNECT OAuth or API keys to your systems │
│ 3. INGEST Point it at your runbooks / KB / past │
│ tickets → it builds an index │
│ 4. AUTHORIZE Pick per-system scopes (read / write / │
│ dry-run) and per-team autonomy rules │
│ 5. ROUTE Plug into your help desk OR embed the │
│ human chat widget OR both │
│ 6. RUN Tickets come in → Gaby reads, investigates│
│ across connected systems, acts within │
│ its authorized scope, writes back to the │
│ ticket, and escalates when it can't │
│ 7. LEARN Every resolution becomes a new KB entry │
│ (with human review gate) │
└─────────────────────────────────────────────────────────────────┘
4. Brand & voice — what we already have¶
The name is Gaby, and it stays. The published docs site (docs/), all four persona prototypes, and the UI copy are already built around it. Changing the name would throw away real work and real intuition for zero gain — we'd be optimizing a brand instead of shipping a product.
Everything below is already locked in and must be respected by every future piece of work:
| Element | Source of truth | Notes |
|---|---|---|
| Product name | Gaby | Used across repo, docs, CLI (gaby …), Docker image, Helm chart, marketing, and in-product UI. |
| Tagline | "AI Support Agent That Actually Investigates" | From index.html. The word investigates is the core promise — Gaby doesn't just answer, it looks. |
| In-product voice | Warm, plainspoken, first-person-adjacent | "Gaby investigated this ticket…", "Ask Gaby…". Reference: the investigation detail in personas/support-lead/. |
| Logo / mark | Friendly circular face | icon('gaby', …) in shared/icons.js is canonical. Shown in every nav bar. |
| Persona color palette | Indigo (Founder), Violet (Support Lead), Emerald (SRE), Sky (MSP) | Already wired in shared/styles.css. Any new persona picks an unused Tailwind family. |
| Landing page | docs/index.md, deployed to https://gaby.skycloak.io | This is the canonical marketing surface as of v0.3.1. Informational, not sales-y: it explains what Gaby does and links to the rest of the docs. The previous repo-root index.html (Tailwind marketing prototype) was retired; its honest content moved here. If this spec ever contradicts the published landing page, the landing page wins until we explicitly decide otherwise. |
| Persona prototypes | personas/{founder,support-lead,sre,msp}/index.html | These are the visual and interaction spec. Engineering turns them into real code; product doesn't re-litigate the flows. |
What we will not spend time debating¶
- A different name.
- A different tagline.
- A different logo.
- A different voice.
- Whether the landing page needs a rebuild.
Done. Energy goes into shipping v0.1.
5. Licensing strategy — OSS first, enterprise commercially¶
5.1 Goals¶
- Maximum adoption: a small MSP or a solo founder must be able to run the full core product, for free, forever, in production, without a conversation with sales.
- Defensible business: hyperscalers cannot rehost the project as a managed service and eat our lunch.
- Clarity for enterprise procurement: legal, security, and vendor-risk teams need something they recognize.
5.2 Recommended model — open core + commercial Enterprise Edition¶
| Component | License | What's in it |
|---|---|---|
| Core | Apache 2.0 | Agent runtime, connector framework, MCP client/host, LLM abstraction, knowledge ingestion, CLI, web UI, all four persona workflows, every ticket-source adapter, human chat widget, playbooks. |
| Enterprise Edition (EE) | Commercial (per-seat / per-workspace) | SSO/SAML, SCIM provisioning, audit log export (SIEM), multi-workspace RBAC beyond the OSS defaults, air-gapped deployment bundle, SOC2 evidence pack, priority support, SLA guarantees, white-label branding (for MSPs). |
| Connectors | Apache 2.0 | All connectors live in Core. No "you need EE to connect to Salesforce" games — this is how Grafana lost goodwill and PostHog kept it. |
5.3 Why not AGPL?¶
AGPL solves the "AWS clone" problem but makes corporate legal teams nervous — which kills adoption in the exact segment we need (SMB IT and mid-market support). Apache 2.0 for the core, with the network-service threat handled by keeping the valuable EE features closed-source, is the GitLab / PostHog / Grafana pattern and it has been proven at scale.
5.4 Trademark policy¶
Project name is trademark-protected. Forks and derivatives are fine; calling your fork by the same name is not. Document this in TRADEMARK.md from day one.
5.5 Contribution model¶
- CLA: lightweight DCO (Developer Certificate of Origin, git commit -s), not a full CLA. Lower barrier, still legally sufficient for re-licensing if ever needed.
- Governance: BDFL → Steering Committee once we have 10+ external contributors shipping regularly.
6. Cross-cutting platform features — required in v1.0 for every persona¶
These are the features the product must have regardless of who is installing it. If any of these are missing, no persona can succeed.
6.1 Installation & deployment¶
| ID | Feature | Why it matters |
|---|---|---|
| INST-1 | Single docker compose up install with sane defaults (embedded SQLite, no external DB required to start). | The founder / small MSP needs to be running in 5 minutes, on a laptop if needed. |
| INST-2 | Helm chart with values for storage class, ingress, secrets provider, resource limits. | The SRE / enterprise install must feel native on Kubernetes. |
| INST-3 | Single static binary (Go or Rust) for air-gapped / edge installs. | Some SMB environments have no Docker or K8s. |
| INST-4 | Managed SaaS at gaby.cloud (or whatever we brand it) for users who want zero ops. | Most users eventually want the hosted option; offering it early keeps them in our ecosystem. |
| INST-5 | First-run wizard in the web UI mirroring the onboarding flows in the persona prototypes. | The wizards in personas/*/index.html are the spec — turn them into real code. |
| INST-6 | Upgrade path: gaby upgrade that handles schema migrations and connector compatibility checks. | Painless upgrades are the difference between a loved OSS project and a dead one. |
6.2 Connector framework (MCP-native)¶
| ID | Feature | Why it matters |
|---|---|---|
| CON-1 | MCP-first architecture. Every connector is an MCP server. Users can plug in any MCP server from the ecosystem without writing code. | Rides the open standard. No vendor lock-in on tools. |
| CON-2 | First-party connector catalog for: PostgreSQL, MySQL, MongoDB, Redis, Kubernetes, AWS, GCP, Azure, Microsoft 365, Entra ID, Intune, Active Directory, Keycloak, Auth0, Okta, Slack, Teams, Stripe, GitHub, GitLab, Datadog, Grafana, PagerDuty, Sentry, SharePoint, FortiGate, NinjaOne, Kaseya, Zendesk, HaloPSA, Autotask, ConnectWise, Jira Service Management, Linear, Freshdesk, Intercom, Zoho Desk. | Covers the stacks of all four personas out of the box. No "but does it work with my X?" conversations. |
| CON-3 | Per-connector scope controls (read_only, read_write, dry_run, custom RBAC scopes). | Safety. SREs will never install something that might DROP TABLE. |
| CON-4 | Secrets vault integration (HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, env-file). | No plaintext secrets in config. Enterprise procurement will reject otherwise. |
| CON-5 | Health checks + auto-reconnect per connector with visible status in the UI. | Users need to know at a glance whether Gaby can actually reach their systems. |
| CON-6 | Connector SDK in Python and TypeScript (and Go later) with a "hello world" connector in <100 lines. | Long-term connector growth is community-driven. Make it trivial to contribute. |
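The scope controls in CON-3 can be pictured as a single gate that every connector action passes through. The following is a minimal sketch, not the connector framework's actual API: the `Scope` enum, `run_action`, and the `apply` / `describe` callables are hypothetical names introduced here for illustration, and custom RBAC scopes are left out.

```python
from enum import Enum


class Scope(Enum):
    READ_ONLY = "read_only"
    READ_WRITE = "read_write"
    DRY_RUN = "dry_run"


class ScopeError(PermissionError):
    """Raised when an action exceeds the connector's configured scope."""


def run_action(scope: Scope, is_write: bool, apply, describe):
    """Gate one connector action by scope.

    `apply` performs the action; `describe` returns a human-readable diff.
    Both are hypothetical callables supplied by the connector.
    """
    if not is_write:
        # Reads are allowed under every scope.
        return {"applied": True, "result": apply()}
    if scope is Scope.READ_ONLY:
        raise ScopeError("write action blocked: connector is read_only")
    if scope is Scope.DRY_RUN:
        # Show what would change without touching the system.
        return {"applied": False, "diff": describe()}
    return {"applied": True, "result": apply()}
```

Under this sketch, a write attempted on a `dry_run` connector returns only the diff and never calls `apply`, which is the behavior that lets an operator watch proposed changes before granting `read_write`.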
6.3 Knowledge ingestion¶
| ID | Feature | Why it matters |
|---|---|---|
| KB-1 | Point-and-ingest from: Markdown directories, Git repos, Confluence, Notion, Google Docs, SharePoint, GitHub wikis, PDF folders, URL crawls. | The knowledge base is the product's IQ. Ingestion must be as frictionless as gaby ingest ./docs. |
| KB-2 | Past-ticket learning: pull the last N resolved tickets from the connected help desk and index them as "how we've solved this before". | Every company's best runbook is its own ticket history. Nobody else can offer this without being connected. |
| KB-3 | Incremental re-indexing — only re-embed what changed. | Cost and latency matter. Re-indexing 10k docs nightly is not acceptable. |
| KB-4 | Hybrid retrieval: BM25 + vector + (optional) graph, with per-query explainability ("why did you cite this?"). | Users distrust opaque RAG. Show your work. |
| KB-5 | Citations in every answer. Every claim in a resolution must link back to the source (runbook, ticket, doc). | Trust. Auditability. Enterprise compliance. |
| KB-6 | Stale-content detection. Flag KB entries that haven't been used or touched in 6+ months. | KBs rot. The tool should actively help keep them alive. |
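One common way to meet KB-3 (only re-embed what changed) is to store a content hash per document and compare it on each run. The sketch below assumes that approach; the spec does not say how change detection is implemented, and `plan_reindex` with its dictionary shapes is a hypothetical helper.

```python
import hashlib


def plan_reindex(docs: dict[str, str], index: dict[str, str]):
    """Compare current docs against stored content hashes.

    docs:  path -> current text
    index: path -> SHA-256 hex digest recorded at the last indexing run
    Returns (to_embed, to_delete, new_index).
    """
    new_index = {
        path: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for path, text in docs.items()
    }
    # New or modified documents need fresh embeddings.
    to_embed = sorted(p for p, h in new_index.items() if index.get(p) != h)
    # Documents that disappeared should be dropped from the index.
    to_delete = sorted(p for p in index if p not in new_index)
    return to_embed, to_delete, new_index
```

With this scheme a nightly run over an unchanged corpus produces an empty `to_embed` list, so embedding cost scales with the size of the change rather than the size of the knowledge base.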
6.4 LLM layer — fully pluggable¶
| ID | Feature | Why it matters |
|---|---|---|
| LLM-1 | Managed mode: "Gaby Cloud" provides the LLM, users bring zero keys. | The happiest path. No signup with OpenAI/Anthropic before getting value. |
| LLM-2 | BYOK: Anthropic, OpenAI, Azure OpenAI, Google Vertex, AWS Bedrock, any OpenAI-compatible endpoint. | Lots of users have existing contracts or preferred vendors. |
| LLM-3 | Local models: Ollama, vLLM, llama.cpp, any OpenAI-compatible local server. | Privacy-sensitive industries (legal, healthcare, defense, EU) must never send data out. |
| LLM-4 | Per-workflow model routing: use a cheap/fast model for classification, the flagship for the final answer. | Cost control without sacrificing quality. |
| LLM-5 | Token budget per investigation with a hard kill-switch. | Avoids runaway costs from a stuck loop. |
| LLM-6 | Prompt versioning & observability — every prompt that goes to an LLM is versioned and inspectable in the admin UI. | Debuggability and trust. |
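LLM-4 and LLM-5 describe two small mechanisms: a routing table from workflow step to model tier, and a per-investigation token cap that aborts when crossed. A minimal sketch follows, assuming hypothetical names (`TokenBudget`, `BudgetExceeded`, `ROUTES`, `pick_model`) and placeholder tier labels rather than real model identifiers.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an investigation spends more tokens than allowed."""


class TokenBudget:
    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> int:
        """Record usage; abort the investigation if the cap is crossed."""
        self.used += tokens
        if self.used > self.limit:
            raise BudgetExceeded(f"used {self.used} of {self.limit} tokens")
        return self.limit - self.used


# Hypothetical routing table: workflow step -> model tier.
ROUTES = {"classify": "fast", "final_answer": "flagship"}


def pick_model(step: str) -> str:
    # Unknown steps fall back to the cheaper tier.
    return ROUTES.get(step, "fast")
```

The agent loop would call `charge` after every LLM response; because the exception propagates out of the loop, a stuck investigation stops at the cap instead of retrying indefinitely.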
6.5 Safety, authorization, and "the machine will not wreck production"¶
| ID | Feature | Why it matters |
|---|---|---|
| SAF-1 | Three-mode autonomy per connector AND per ticket-source: investigate (read-only), propose (drafts a fix, a human applies), act (applies it itself, with rollback). | The MSP / SRE personas need this. Without it, nobody lets the agent near production. |
| SAF-2 | Dry-run mode: every write action has a --dry-run shadow execution that shows the diff without applying it. | Trust-building: users can watch the agent work for a week before turning on act. |
| SAF-3 | Approval queue — actions above a configurable "risk score" go to a human queue (Slack, Teams, email, or web UI). | Surgical safety. High-risk actions get eyes; low-risk don't. |
| SAF-4 | Allow/deny lists per connector, per action, per workspace. | "Gaby can restart pods in staging but never in prod". |
| SAF-5 | Append-only audit log with cryptographic chaining. Exportable to SIEM (EE feature). | Every action must be forensically reconstructible. Non-negotiable for enterprise. |
| SAF-6 | Rollback / undo where the underlying system supports it (Kubernetes restore, DB point-in-time, Entra ID session restore, etc.). | Mistakes happen. The product must make them cheap to reverse. |
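The "append-only audit log with cryptographic chaining" in SAF-5 is commonly built as a hash chain, where each entry's hash covers the previous entry's hash. The sketch below assumes SHA-256 over a JSON-serialized event; the entry layout and the function names are illustrative assumptions, not the product's storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute the chain; an edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits and mid-log deletions but not truncation of the tail; detecting that would need the latest hash to be anchored somewhere outside the log, for example in the SIEM export.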
6.6 Ticket sources & human chat interface¶
| ID | Feature | Why it matters |
|---|---|---|
| SRC-1 | Help desk adapters that listen for new tickets, post replies, update status, and log time entries: Zendesk, Freshdesk, Zoho Desk, Intercom, HaloPSA, Autotask, ConnectWise, Syncro, Jira Service Management, Linear, GitHub Issues. | This is how Gaby meets users where they already are. |
| SRC-2 | Inbound email: Gaby can read a mailbox (support@) and treat incoming messages as tickets. | For companies without a help desk yet. |
| SRC-3 | Human chat widget — drop-in JS snippet for any website that gives end-users a chat UI. Gaby answers directly; escalates to a human when needed. | This is the "direct to end-user" path. |
| SRC-4 | Slack / Teams app — end-users can DM the bot inside their workspace and get the same experience. | Internal IT use case (SMB IT helpdesk, MSP clients). |
| SRC-5 | Unified inbox UI in the Gaby web app — every conversation from every source in one view, with source badge. | For the human operator. One pane of glass. |
| SRC-6 | Webhook in/out — Gaby can be triggered by any webhook, and can fire webhooks on ticket events. | For the 10% of users with a custom stack. |
6.7 Observability & operations¶
| ID | Feature | Why it matters |
|---|---|---|
| OBS-1 | OpenTelemetry traces for every investigation — spans per tool call, per LLM call, per action. | SREs will insist. Also makes the product debuggable for us. |
| OBS-2 | Prometheus /metrics endpoint with rates, latencies, error counts, token usage. | Standard ops hygiene. |
| OBS-3 | Structured JSON logs to stdout. No log files on disk by default. | Twelve-factor. Friendly to every log aggregator. |
| OBS-4 | Built-in status page at /status showing connector health, LLM latency, queue depth. | For the team running Gaby in-house. |
| OBS-5 | Cost dashboard — per-connector, per-persona, per-client LLM + infra cost attribution. | "How much is Gaby costing me?" is the first question every founder asks after week 2. |
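OBS-3 asks for structured JSON logs on stdout with no log files. If the agent core ends up in Python (an open question in Section 10), the standard `logging` module can do this with a small formatter. This is a sketch under that assumption; `JsonFormatter`, `make_logger`, and the `fields` convention are hypothetical.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra={"fields": {...}}` are merged in.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def make_logger(name: str = "gaby") -> logging.Logger:
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(sys.stdout)  # stdout only, no files
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `make_logger().info("tool call finished", extra={"fields": {"connector": "postgres"}})` then emits one JSON object per line, which most log aggregators can ingest without a custom parser.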
6.8 Governance & multi-workspace¶
| ID | Feature | Why it matters |
|---|---|---|
| GOV-1 | Multi-workspace isolation — strong (cryptographic) separation of data between workspaces (for the MSP persona, workspaces = clients). | The MSP persona lives or dies by this. |
| GOV-2 | Role-based access control — Admin, Agent, Viewer, plus custom roles (EE). | Every persona needs at least the three defaults. |
| GOV-3 | Data residency controls — pin data to a region. | EU/UK/CA customers will ask. |
| GOV-4 | PII redaction on the way in — configurable patterns + named-entity recognition. | GDPR, HIPAA, and "don't send customer credit cards to the LLM". |
| GOV-5 | Consent logging — record who authorized what connector and when. | Audit, compliance, and accidental-revocation recovery. |
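The "configurable patterns" half of GOV-4 can be sketched as a table of regular expressions applied before any text reaches the LLM. The two patterns below are deliberately simple illustrations, not production-grade detectors, and the named-entity recognition half is out of scope here; `PATTERNS` and `redact` are hypothetical names.

```python
import re

# Illustrative patterns only; a real deployment would load these from config.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a typed placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Typed placeholders keep the redacted ticket readable for the model ("the customer at [EMAIL] was charged twice") while the raw values stay inside the workspace.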
7. Per-persona feature requirements¶
Each persona section says what that installer needs on top of the cross-cutting features above. These are sorted so that the smallest, simplest personas come first.
7.1 Technical Founder / CTO¶
Who: Solo or small founding team of a SaaS company. Tech-native, already runs Kubernetes or Docker, has their own database. Gets paged by customers at 3 a.m. for things that shouldn't page them.
Success metric: "I got a full night's sleep last week and no customer noticed."
| ID | Feature | Notes |
|---|---|---|
| FND-1 | docker compose up → running in 5 minutes with SQLite, no external dependencies. | Aligns with personas/founder/index.html step 1. |
| FND-2 | Pre-built Helm chart (helm install gaby gaby/agent) with a 10-line values.yaml that covers 90% of cases. | |
| FND-3 | One-command connect to the usual founder stack: gaby connect postgres, gaby connect keycloak, gaby connect stripe. | Each prompts for URL / API key, tests the connection, and is ready. |
| FND-4 | Nightly summary email: "While you slept: 8 resolved, 1 escalated, here's the one." | The emotional win of this persona is "I can sleep now". Surface it explicitly. |
| FND-5 | GitHub/GitLab source-code context: when a ticket references a bug, Gaby can read the relevant file and tell the founder "this is in billing/stripe_webhook.go:142". | The founder is their own L2, so help them get to the code faster. |
| FND-6 | Zero-config sensible defaults — the wizard should pick everything it can, and ask only what it cannot. | The founder is impatient. Every question that could be auto-detected must be. |
| FND-7 | Escalate mode: Slack DM to the founder, with a one-click "open ticket in Zoho Desk / Linear" button. | |
| FND-8 | Ticketing compatibility: Zoho Desk, Linear, GitHub Issues, Jira, plain email inbox. | |
7.2 Head of Support / Support Lead¶
Who: Leads an L1 support team at a mid-sized SaaS. Their team handles 1-2k tickets/month. Their pain is that L1 can't resolve without bouncing to engineering, which burns engineer time and slows the customer.
Success metric: "My L1 team resolves 75% of tickets without pinging an engineer."
| ID | Feature | Notes |
|---|---|---|
| SUP-1 | Zendesk native sidebar app — Gaby's investigation renders inside the Zendesk ticket view. Agents never leave the tool they already know. | The prototype in personas/support-lead/ shows this mockup. Make it real. |
| SUP-2 | Same for: Freshdesk, Intercom, HelpScout, Zoho Desk, Jira Service Management. | The persona's team may be on any of these. |
| SUP-3 | Plain-language mode — Gaby writes resolutions in customer-friendly language, not SRE jargon. Configurable tone (formal / friendly / technical). | The investigation detail in the prototype already demonstrates this; make it a first-class config. |
| SUP-4 | One-click apply buttons inside the investigation ("Reset this user's password", "Resend the email"), gated by the agent's role and the connector's scope. | |
| SUP-5 | Team analytics: per-agent resolution rate, avg time, customer satisfaction score, leaderboard. | Already mocked in the prototype's dashboard. |
| SUP-6 | Escalation rules engine: "if the customer is on an Enterprise plan AND the ticket touches billing, always escalate to a human, never auto-resolve". | Support leads need fine-grained control over where autonomy applies. |
| SUP-7 | Playbook library — YAML playbooks for recurring customer issues ("account merge", "bulk user import", "refund request"). Community-contributed. | Results-first: most L1 tickets are the same 30 problems. Make those 30 instant. |
| SUP-8 | SLA awareness — Gaby reads the ticket's SLA deadline and prioritizes accordingly. | |
| SUP-9 | CSAT feedback loop — when a customer rates a Gaby-resolved ticket poorly, it automatically enters a review queue. | Quality signal. Prevents drift. |
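The escalation rule quoted in SUP-6 ("Enterprise plan AND the ticket touches billing") maps naturally onto a first-match rules list. The following is a minimal sketch assuming flat ticket fields and equality-only conditions; `Rule`, `decide`, and the field names `plan` and `topic` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: dict  # field -> required value; all must match (AND)
    decision: str     # "escalate" or "auto_resolve"


def decide(ticket: dict, rules: list, default: str = "auto_resolve") -> str:
    """Return the decision of the first rule whose conditions all match."""
    for rule in rules:
        if all(ticket.get(k) == v for k, v in rule.conditions.items()):
            return rule.decision
    return default
```

For example, with `Rule("enterprise-billing", {"plan": "enterprise", "topic": "billing"}, "escalate")` in the list, an Enterprise billing ticket is always escalated, while an Enterprise login ticket falls through to the default.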
7.3 SRE / DevOps Lead¶
Who: Runs the infra team at a mid-market or larger company. Constantly paged for things that turn out not to be infra. Spends half their triage time proving the issue isn't theirs.
Success metric: "34 tickets this week were NOT infra, and nobody paged me."
| ID | Feature | Notes |
|---|---|---|
| SRE-1 | Read-only by default for every connector that touches production. act mode requires an explicit allowlist and a per-action approval. | SREs will not install a tool that can kubectl delete. |
| SRE-2 | "Is this infra?" classifier as a first-class output: infra / not-infra / sre-needed with an evidence bundle. | Already prototyped in personas/sre/. This is the product for this persona. |
| SRE-3 | Investigation depth selector (Quick / Standard / Deep) — trades cost and latency for thoroughness. Already in the prototype. | |
| SRE-4 | Deploy correlation: automatically check recent deploys (GitHub Actions, ArgoCD, Flux, Spinnaker) and correlate with incident start time. | |
| SRE-5 | Runbook execution — Gaby can read a Markdown runbook and execute its steps one at a time, with approval between each. | Turns every runbook into a semi-autonomous workflow for free. |
| SRE-6 | PagerDuty / Opsgenie ingestion: alerts become "tickets" that Gaby investigates before the human is paged. | Prevents needless 3 a.m. wake-ups. |
| SRE-7 | Handoff package: when Gaby escalates to a human, it produces a structured "evidence bundle" (metrics screenshots, logs, relevant dashboards, timeline) attached to the incident. | |
| SRE-8 | Connectors required: Kubernetes, Datadog, Grafana, Prometheus, PagerDuty, GitHub Actions, ArgoCD, Terraform Cloud. | |
| SRE-9 | Read-only mode for the LLM itself: option to use a self-hosted local model so no infra telemetry ever leaves the cluster. | Compliance + IP protection. |
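Deploy correlation (SRE-4) reduces to a time-window filter once deploy events have been fetched from the CI/CD connectors. The sketch below assumes each deploy is a dict with `service` and `finished_at` keys and a 60-minute default window; both the record shape and the window are illustrative assumptions.

```python
from datetime import datetime, timedelta


def correlate_deploys(incident_start: datetime, deploys: list, window_minutes: int = 60) -> list:
    """Return deploys that finished within the window before the incident,
    most recent first. Each deploy is a dict with 'service' and 'finished_at'."""
    window = timedelta(minutes=window_minutes)
    suspects = [
        d for d in deploys
        # Keep deploys at or before the incident start, no older than the window.
        if timedelta(0) <= incident_start - d["finished_at"] <= window
    ]
    return sorted(suspects, key=lambda d: d["finished_at"], reverse=True)
```

Sorting most recent first puts the likeliest culprit at the top of the evidence bundle; deploys that finished after the incident began are excluded because they cannot have caused it.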
7.4 MSP / Managed IT Helpdesk¶
Who: Runs a managed service provider serving 20-100 SMB clients (law firms, accounting, dental, small manufacturing). Their L1 techs drown in tickets across every client's workspace.
Success metric: "62% of tickets auto-resolved across 24 SMB clients — and every one of my techs got Fridays back."
| ID | Feature | Notes |
|---|---|---|
| MSP-1 | Multi-workspace isolation: one Gaby install serves all clients, but no data ever crosses workspaces. Cryptographic separation, not just a workspace_id column. | This is the #1 dealbreaker for MSPs. Get it right. |
| MSP-2 | Multi-workspace connector auth — one "Connect Microsoft 365" flow discovers all client workspaces via delegated admin or GDAP. Same for Google Workspace, NinjaOne, Autotask, HaloPSA. | A tech should not have to authenticate 24 times. |
| MSP-3 | Per-client autonomy level: Auto / Review / Investigate-only. Already prototyped in personas/msp/ step 5. | Different clients have different appetites for AI. Respect that per-client, not just per-install. |
| MSP-4 | PSA integration with billable time writeback: Gaby logs its own work against the client's contract, with configurable billing rules (e.g., "don't bill for <2 min"). | MSPs are time-billing businesses. This is table stakes. |
| MSP-5 | RMM script execution via NinjaOne, Kaseya VSA, Datto RMM, ConnectWise Automate — with the same safety gates as other connectors. | |
| MSP-6 | Per-client knowledge base isolation. Hartwell Law's runbooks must not leak into Dearborn CPA's answers. | Privacy, confidentiality, sometimes legally required. |
| MSP-7 | Per-client compliance profiles: HIPAA for dental/medical, SOC2 for finance, PIPEDA for Canadian clients. Profiles drive redaction rules, audit verbosity, and allowed LLM providers. | MSPs serve regulated industries and cannot manually configure this for every client. |
| MSP-8 | Client onboarding templates: "this is our Dental Office package — M365 Business Premium, Intune, these 6 playbooks, this KB bundle". Spin up a new client in minutes. | Scales the MSP's operational playbook. |
| MSP-9 | White-label / co-branding (EE feature): the MSP can brand Gaby as their own ("BlueMesa IT Assistant") in the client-facing chat widget. | Some MSPs want the AI to look like their own product. Let them. |
| MSP-10 | Per-client cost attribution — see exactly how much Gaby cost per client, per month, per user. | Essential for pricing and contract margins. |
| MSP-11 | Cross-client anomaly detection: "this same 'VPN cert expired' error is hitting 4 different clients this week — looks like a common update issue". | Force multiplier unique to the MSP context. Very differentiating. |
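Cross-client anomaly detection (MSP-11) can be sketched as counting how many distinct clients report the same error signature. The version below assumes tickets have already been reduced to a normalized `signature` string plus a `client` identifier, so no ticket content crosses workspaces; that reduction step, the field names, and the threshold of 3 are assumptions, and the client names in the usage note are illustrative.

```python
from collections import defaultdict


def cross_client_anomalies(tickets: list, min_clients: int = 3) -> dict:
    """Map each error signature to the clients it hit, keeping only
    signatures seen at `min_clients` or more distinct clients."""
    clients_by_signature = defaultdict(set)
    for ticket in tickets:
        clients_by_signature[ticket["signature"]].add(ticket["client"])
    return {
        sig: sorted(clients)
        for sig, clients in clients_by_signature.items()
        if len(clients) >= min_clients
    }
```

Using a set per signature means repeated tickets from one client count once, so a single noisy client (say, hartwell-law filing the same VPN error five times) does not trigger a cross-client alert.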
8. The "human chat interface" — design notes¶
The user asked for either direct ticket resolution or a human chat interface. Both must exist as first-class modes.
8.1 End-user-facing chat widget¶
- Drop-in JavaScript snippet for any website or web app.
- Theming: CSS variables + a JSON theme file. Works out of the box; white-labelable with one config file.
- Persona: defaults to "Gaby" but is fully renameable per deployment (so the MSP can call it "BlueMesa Helper", the SaaS company can call it "Acme Support").
- Conversation state: stored server-side, tied to the end-user's identity if the host provides one, anonymous otherwise.
- Escalation: when Gaby can't resolve, the widget seamlessly hands off to a human agent (who sees the full transcript + Gaby's investigation so far, not a cold start).
- Offline behavior: when no human is available, Gaby says so and offers to create a ticket.
8.2 Internal-team chat interface (Slack / Teams app)¶
- Same engine, different surface.
- Users DM the bot; the bot investigates and replies in-thread.
- @gaby mentions in shared channels work the same.
- Admins can configure which channels Gaby listens to.
8.3 Operator console (for the human behind the scenes)¶
- Unified inbox across all sources.
- "Shadow mode" — watch Gaby work in real time on a live conversation, take over with one click.
- Correction UI — "actually, the right answer was X" feeds back into the KB (with gating).
9. Already decided — don't re-open¶
These are closed. Re-opening any of them without new information is wasted energy.
- Name: Gaby. See Section 4.
- Tagline, logo, voice: as shipped on the landing page. See Section 4.
- Landing page: index.html is canonical. Don't rebuild it.
- Persona prototypes: the four pages under personas/ are canonical for visual and interaction design. Engineering turns them into real code; product doesn't re-litigate flows that are already drawn.
- Licensing model: Apache 2.0 core + commercial Enterprise Edition. See Section 5.
10. Open questions — still need a decision before v0.1¶
- Core language: Go (fastest single-binary story, SRE-friendly) or Python (larger AI ecosystem, faster iteration on the agent loop)? Recommendation: Python for the agent core, Go for the CLI and for connectors that benefit from a static binary.
- Primary storage: SQLite out of the box, with a one-line switch to Postgres for scale? Recommendation: yes.
- Vector store: embedded (sqlite-vec / LanceDB) or external (Postgres pgvector / Qdrant)? Recommendation: embedded by default, external opt-in.
- Managed-cloud strategy: do we launch the managed SaaS at the same time as the OSS, or wait 3–6 months? Recommendation: OSS first, managed 3 months later — OSS bugs are cheap, managed bugs are expensive.
- Which persona ships first? Recommendation: Founder first (smallest scope, most forgiving users, fastest to "wow"). Then MSP (biggest TAM and the trigger for this whole exercise). Then Support Lead. SRE last (highest safety bar).
- Plugin marketplace vs. monorepo for connectors. Recommendation: monorepo for v1.0, extract to a marketplace once we have 30+ community connectors.
- Pricing for EE. Per-seat? Per-workspace? Per-ticket? Recommendation: per-seat for SaaS-style personas, per-client-workspace for MSP, flat Enterprise for large deployments.
11. What this document is not¶
- Not a technical architecture doc. That comes next (ARCHITECTURE.md).
- Not a roadmap with dates. That comes next (ROADMAP.md).
- Not a UX spec: the persona prototypes under personas/ are the UX spec. Treat them as canonical for visual and interaction design.
- Not a marketing brief: the landing page (index.html) is the marketing spec.
- Not a naming debate: see Section 4.
12. Appendix — the four personas at a glance¶
| Persona | User | Core pain | Primary success metric | Ships when |
|---|---|---|---|---|
| Technical Founder | Solo/small SaaS CTO | 3 a.m. pages for things that shouldn't page | Full night's sleep, customers don't notice | v0.1 |
| MSP Helpdesk | Owner/tech lead at an MSP serving 20-100 SMB clients | L1 drowns in the same tickets across 24 workspaces | 62% auto-resolve, techs reclaim Fridays | v0.2 |
| Head of Support | L1 team lead at mid-market SaaS | L1 can't resolve without engineer escalation | 75% L1 resolution, 3x ticket throughput | v0.3 |
| SRE / DevOps Lead | Infra lead at larger company | Triage time wasted proving issues aren't infra | "Not infra" verdicts delivered without paging SRE | v0.4 |