Self-Hosting Guide
Deploy ZenSearch on your own infrastructure with full control over your data, AI models, and network configuration. The free Developer Edition is available for evaluation and small teams. Enterprise On-Premise deployment is available for production workloads.
:::tip Developer Edition
The Developer Edition is a free, self-contained Docker Compose deployment — no license key required. Jump to the quickstart below.

For production deployments with Kubernetes, SSO, and dedicated support, contact [email protected].
:::
Overview
ZenSearch's on-premise deployment gives you:
- Full data sovereignty — your data never leaves your network
- Air-gapped support — deploy without internet connectivity
- Bring your own LLM — use OpenAI, Anthropic, Cohere, Ollama, or any OpenAI-compatible endpoint
- Infrastructure control — deploy on AWS, GCP, Azure, or bare metal
- Custom networking — configure VPCs, firewalls, and service mesh as needed
Developer Edition Quickstart
The Developer Edition is a free, self-contained Docker Compose deployment for evaluation and development.
Prerequisites
- Docker Desktop (or Docker Engine + Docker Compose v2)
- 8 GB RAM minimum, 16 GB+ recommended
- linux/amd64 or linux/arm64 host — Apple Silicon (macOS Docker Desktop), AWS Graviton, Qualcomm Snapdragon X via WSL2, and Raspberry Pi 4/5 with a 64-bit OS are all supported natively. 32-bit ARM (armv7) is not supported.
- No API key required — the installer can auto-install Ollama and pull local models sized for your hardware. Bringing your own LLM (OpenAI, Anthropic, Groq, an existing Ollama install, LM Studio, or any OpenAI-compatible endpoint) is also supported.
Install
curl -fsSL https://releases.zensearch.ai/install.sh | bash
The installer downloads the latest release, prompts you to pick an AI provider, and starts all services. If you pick the default "Local Setup" option it auto-installs Ollama and pulls chat + embedding models sized for your available RAM (see Recommended Ollama models).
After a few minutes: Web UI at http://localhost:35173, API health at http://localhost:38080/health.
The Developer Edition binds the Web UI and API to 127.0.0.1 on high-numbered host ports (35173 and 38080) to avoid collisions with other dev tools (React dev servers on 5173, Rails on 3000, etc.). Override WEB_PORT / API_PORT in your .env if you want different ports. Set LITE_BIND_ADDR=0.0.0.0 only on a trusted network after enabling authentication.
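For example, to move the Web UI and API to other ports, add overrides to .env and restart (a sketch — the port values below are arbitrary examples):

```bash
# .env — example port overrides
WEB_PORT=45173   # Web UI (default 35173)
API_PORT=48080   # API    (default 38080)
# LITE_BIND_ADDR=0.0.0.0   # only after enabling authentication (see below)
```

Then restart with ./start.sh --down && ./start.sh.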
Recommended Ollama models (Local Setup)
The installer detects total RAM and GPU VRAM (via nvidia-smi or rocm-smi, independent of Docker NVIDIA toolkit) and picks a chat + embedding pair that leaves real headroom for the OS, Docker, and the ZenSearch stack. All chat picks come from the qwen3.5 family — the April 2026 default with a full size ladder (0.8b → 122b) and confirmed tools + thinking + vision support.
When a usable GPU is present, weights and KV cache live in VRAM, so a modest-RAM machine with a 16 GB GPU can run qwen3.5:9b @ 32K context comfortably. Without a GPU, the picker falls back to a RAM-only ladder — CPU inference is slow regardless of model size, so we pick what fits without forcing the user into swap.
GPU-first ladder (when ≥ 8 GB VRAM detected):
| GPU VRAM | Chat | Context | Embedding |
|---|---|---|---|
| ≥ 48 GB | qwen3.5:35b (22 GB) | 32K | mxbai-embed-large (670 MB) |
| ≥ 24 GB | qwen3.5:27b (17 GB) | 16K | mxbai-embed-large (670 MB) |
| ≥ 16 GB | qwen3.5:9b (5.5 GB) | 32K | mxbai-embed-large (670 MB) |
| ≥ 12 GB | qwen3.5:9b (5.5 GB) | 16K | mxbai-embed-large (670 MB) |
| ≥ 8 GB | qwen3.5:4b (2.5 GB) | 16K | nomic-embed-text (274 MB) |
RAM-only ladder (no GPU / Apple Silicon — unified memory):
| Total RAM | Chat | Context | Embedding |
|---|---|---|---|
| ≥ 64 GB | qwen3.5:27b (17 GB) | 16K | mxbai-embed-large (670 MB) |
| 32 – 64 GB | qwen3.5:9b (5.5 GB) | 16K | mxbai-embed-large (670 MB) |
| 16 – 32 GB | qwen3.5:4b (2.5 GB) | 16K | nomic-embed-text (274 MB) |
| 8 – 16 GB | qwen3.5:4b (2.5 GB) | 8K | nomic-embed-text (274 MB) |
| < 8 GB | qwen3.5:2b (1.4 GB) | 8K | granite-embedding:30m (63 MB) |
Thresholds are conservative on purpose. The ZenSearch stack (Postgres + Qdrant + RustFS + NATS + Redis + core-api + model-gw + vectorizer + parser + structure-analyzer + projector) plus Docker Desktop typically consumes 11 – 19 GB of memory before any model loads. The VRAM thresholds are also slightly below nominal GiB values (e.g. 16,000 MiB for the "16 GB" tier rather than 16,384) because modern NVIDIA drivers reserve 1–2% of VRAM — an RTX 5070 Ti advertised as 16 GB reports ~16,303 MiB, and an exact 2^N threshold would silently disqualify it.
You can override the picks at install time:
LLM_PROVIDER=ollama \
LLM_CHAT_MODEL=qwen3.5:9b \
LLM_EMBED_MODEL=nomic-embed-text \
./install.sh --yes
Or post-install by editing LLM_CHAT_MODEL / LLM_EMBED_MODEL in .env and restarting.
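A hedged sketch of the post-install path — pull the new weights into Ollama first (model names below are examples from the ladder above), then point .env at them and restart:

```bash
# Pull the replacement models into Ollama if they aren't present yet
ollama pull qwen3.5:9b
ollama pull mxbai-embed-large

# In .env:
#   LLM_CHAT_MODEL=qwen3.5:9b
#   LLM_EMBED_MODEL=mxbai-embed-large

# Restart so the stack picks up the change
./start.sh --down && ./start.sh
```

Note that with the Ollama Local Setup, LLM_CHAT_MODEL normally points at the zensearch-chat wrapping tag described below; if you switch base models, re-run the installer (or recreate the tag) so the bumped num_ctx follows the new model.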
Custom zensearch-chat Ollama tag
Ollama's default num_ctx is 4096 tokens regardless of the model's actual capability — too small for the agent path, where the system prompt + tool definitions + conversation history routinely exceed 4K within a few turns. Without intervention, the agent silently truncates older context.
The installer creates a custom Ollama tag named zensearch-chat that wraps the picked base model with a tier-appropriate num_ctx (8K / 16K / 32K). LLM_CHAT_MODEL and LLM_AGENT_MODEL in .env are written to point at this tag, not the base model. Running ollama run qwen3.5:9b directly outside ZenSearch still gets the 4K default — only the wrapping tag has the bumped context. This avoids inflating the KV cache for unrelated models the user may have pulled.
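To confirm what the installer produced (a quick check, assuming a local Ollama), ollama show can print the generated Modelfile:

```bash
# Inspect the wrapping tag — expect a FROM line naming the picked base model
# and a PARAMETER num_ctx matching your tier (8192 / 16384 / 32768)
ollama show zensearch-chat --modelfile
```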
Re-running the installer detects an existing zensearch-chat tag and:
- Skips recreation when both the FROM base model and num_ctx already match (idempotent — the common case for a --update run).
- Warns and recreates when either differs. A common trigger: a GPU upgrade lands you in a different VRAM tier whose CONTEXT_TOKENS happens to match but whose base model changed (e.g. 8 GB → 12 GB both pick 16K, but the chat model changes from qwen3.5:4b to qwen3.5:9b).
- Skips entirely when LLM_BASE_URL points at a remote Ollama host (anything other than host.docker.internal / localhost / 127.0.0.1). A local ollama create would run against the local CLI and put the tag on the wrong server. The installer prints a manual recipe to run on the remote host:
# On the remote Ollama host
printf 'FROM qwen3.5:9b\nPARAMETER num_ctx 32768\n' | ollama create zensearch-chat -f -
# Then in your ZenSearch .env
LLM_CHAT_MODEL=zensearch-chat
LLM_AGENT_MODEL=zensearch-chat
Chat performance flags
The installer classifies your AI backend into one of three tiers and writes sensible performance defaults to .env:
| Backend class | Triggered by | CHAT_QUERY_REWRITE_ENABLED | CHAT_FOLLOWUP_SUGGESTIONS_ENABLED | CHAT_CLASSIFICATION_ENABLED |
|---|---|---|---|---|
| hosted | LLM_PROVIDER = openai / anthropic / groq / openrouter | true | true | true |
| fast_local | NVIDIA GPU with nvidia-smi OR Apple Silicon with ≥ 32 GB unified memory | true | true | true |
| slow_local | Apple Silicon < 32 GB, CPU-only Linux, Ollama without GPU | false | false | false |
Each flag gates an optional LLM call on the chat critical path:
- CHAT_QUERY_REWRITE_ENABLED — rewrites conversational follow-ups ("and what about security?") into self-contained queries via a cheap LLM call before retrieval. On hosted backends this takes ~5 s and meaningfully improves follow-up answer quality. On memory-constrained Ollama it takes 60–180 s of silence before the synthesis stream even starts.
- CHAT_FOLLOWUP_SUGGESTIONS_ENABLED — after the main answer streams, fires a second LLM call (5 s timeout) to generate three "ask next" chips for the UI. Pure UX polish; costs a 5–30 s tail on slow backends.
- CHAT_CLASSIFICATION_ENABLED — runs the background classification worker that extracts department/category/topic metadata from every indexed document. On single-slot local inference, classification batches compete with user chat for the one Ollama slot. When disabled, search still works; only facet filtering gets less granular.
All three flags are independently toggleable. To re-enable a feature after install, edit the value in .env and restart with ./scripts/install.sh --update.
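For example, to turn the follow-up chips back on for a slow_local backend while leaving the heavier calls disabled (a sketch — flag names from the table above):

```bash
# .env
CHAT_FOLLOWUP_SUGGESTIONS_ENABLED=true   # cheap UX win, worth trying first
CHAT_QUERY_REWRITE_ENABLED=false         # keep off unless turn latency is acceptable
CHAT_CLASSIFICATION_ENABLED=false        # keep off on single-slot local inference
```

Then restart with ./scripts/install.sh --update.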
Apple Silicon MLX
Ollama 0.19+ ships an MLX backend that is ~2× faster than its Metal backend on Apple Silicon, but it only auto-activates on Macs with ≥ 32 GB unified memory. On smaller Macs (16 GB, 24 GB) Ollama falls back to the slower Metal path, and qwen3.5:4b chat completions routinely take 60–180s per turn. There is no environment variable to force-enable the MLX backend below the 32 GB threshold.
For best performance on < 32 GB Apple Silicon, run MLX directly instead of Ollama:
# Install mlx-lm or mlx-openai-server (Python)
uv tool install mlx-openai-server
# Pull a pre-quantized MLX weight (different format from Ollama GGUF)
# Community weights: https://huggingface.co/mlx-community
mlx-openai-server \
--model mlx-community/Qwen3-4B-Instruct-4bit \
--port 8080 \
--max-tokens 8192 \
--context-size 32768
Then configure ZenSearch to talk to the MLX endpoint:
# In .env
LLM_PROVIDER=custom
LLM_BASE_URL=http://host.docker.internal:8080/v1
LLM_CHAT_MODEL=Qwen3-4B-Instruct-4bit
# Keep Ollama (or another embedding server) for embeddings — MLX server
# embedding support varies by implementation; validate before relying on it.
LLM_EMBED_PROVIDER=ollama
LLM_EMBED_BASE_URL=http://host.docker.internal:11434
LLM_EMBED_MODEL=nomic-embed-text
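Before restarting ZenSearch, it's worth sanity-checking the MLX endpoint from the host (inside Docker the same server is reached via host.docker.internal). A minimal OpenAI-compatible request — the accepted model name can differ between MLX servers, so adjust it to whatever your server reports:

```bash
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mlx-community/Qwen3-4B-Instruct-4bit",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 16
      }'
```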
Recommended MLX servers (all expose OpenAI-compatible /v1/chat/completions):
- mlx-openai-server — qwen3 tool-call parser included; works with ZenSearch agent mode
- vllm-mlx — continuous batching, claims 400+ tok/s on Apple Silicon
- mlx_lm.server — the official reference implementation; simpler but fewer tool-call parsers
Tool-calling reliability on MLX follows the same model-family rules as Ollama: qwen3 / qwen3.5 work, gemma3 is broken (upstream parser bugs — mlx-lm#1096, ollama#14493). gemma4 may have fixed the template bugs but is unverified against the ZenSearch agent tool set, so qwen3.5 stays the safe default. Stick with qwen variants for agent mode regardless of backend.
ZenSearch is an enterprise platform. Running it on sub-32 GB Apple Silicon is supported for evaluation but is not the target deployment. Production workloads should use either a hosted LLM provider, an NVIDIA GPU host, or Apple Silicon with ≥ 32 GB unified memory (where Ollama's MLX backend auto-activates).
Manual Install
If you prefer not to pipe to bash:
curl -fsSL https://releases.zensearch.ai/developer-edition/latest \
-o zensearch-dev.tar.gz
tar xzf zensearch-dev.tar.gz
cd zensearch-dev-edition-*
cp .env.lite.example .env
# Edit .env — set LLM_PROVIDER and either LLM_API_KEY (for hosted
# providers like openai/anthropic/groq) or LLM_BASE_URL (for ollama/
# lmstudio/custom). See .env.lite.example for the full option list.
./start.sh
Management
./start.sh --down # Stop all services
./start.sh --update # Pull latest images and restart
./diagnose.sh # Health check and diagnostics
What's Included
| Component | Description |
|---|---|
| Core API | REST API, search, chat, agents |
| Model Gateway | AI model proxy |
| Web UI | React frontend |
| Parser (Lite) | Lightweight document parsing (PDF, DOCX, PPTX, XLSX — no OCR) |
| Structure Analyzer (Base) | Document structure extraction |
| Projector | Projection generation |
| Vectorizer | Embedding generation |
| PostgreSQL | Database |
| Redis | Cache |
| Qdrant | Vector search |
| RustFS | Object storage |
| NATS | Message broker |
| S3 Collector | Amazon S3 / S3-compatible storage connector |
| Web Crawler | Website crawling with headless Chrome |
GPU Acceleration
If you have an NVIDIA GPU with the NVIDIA Container Toolkit installed, the installer automatically detects it and uses GPU-accelerated document parsing. No manual configuration needed.
Configuration
Resource limits and other settings can be tuned in your .env file:
| Variable | Default | Description |
|---|---|---|
| PARSER_MEMORY_LIMIT | 4G | Memory limit for the document parser. Increase for large PDFs on CPU |
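For example, to give the parser more headroom for large CPU-parsed PDFs (the value below is illustrative):

```bash
# .env
PARSER_MEMORY_LIMIT=8G
```

Restart with ./start.sh --down && ./start.sh for the change to take effect.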
Not Included (Enterprise Features)
- Full parser with OCR/GPU support (lite edition uses lightweight CPU parsing — see Parser backends)
- Structure-Analyzer [full] extra (sentence-transformers + tree-sitter for advanced semantic extraction)
- Reranker, sparse embedder
- Additional data source connectors (Confluence, Slack, GitHub, Jira, Notion, Google Drive, SharePoint, Azure Blob, Salesforce, SAP, HubSpot)
- Monitoring stack (Prometheus, Grafana)
- SAML authentication (OIDC is supported in lite — see Authentication)
- Kubernetes / Helm chart deployment
- Dedicated support
The full edition's ML services (full parser with PyTorch/Docling, reranker, sparse embedder, and the structure-analyzer[full] extra) are published as linux/amd64 images only — they depend on PyTorch wheels that aren't available for arm64. If you need to run the enterprise features, use an amd64 host. The Developer/lite edition runs on both amd64 and arm64.
Enabling Authentication
By default, the Developer Edition runs with AUTH_MODE=none — no login required, all users share a single dev identity. The compose file binds the Web UI and API to 127.0.0.1 by default; enable authentication before changing LITE_BIND_ADDR for a shared or network-accessible deployment.
To enable authentication, configure any OIDC-compatible identity provider (Keycloak, Auth0, Okta, Google, Azure AD, etc.):
- Edit .env in your ZenSearch directory:
# Change auth mode
AUTH_MODE=oidc
# OIDC provider settings (example: Keycloak)
OIDC_ISSUER_URL=https://your-keycloak.example.com/realms/zensearch
OIDC_CLIENT_ID=zensearch-api
OIDC_CLIENT_SECRET=your-client-secret
- Restart:
./start.sh --down && ./start.sh
The Web UI will now show a login screen. Users are automatically provisioned on first login with their OIDC identity (email, name, roles).
Provider examples:
| Provider | Issuer URL |
|---|---|
| Keycloak | https://keycloak.example.com/realms/your-realm |
| Auth0 | https://your-tenant.auth0.com/ |
| Okta | https://your-org.okta.com/oauth2/default |
| Google | https://accounts.google.com |
| Azure AD | https://login.microsoftonline.com/YOUR_TENANT_ID/v2.0 |
When configuring your OIDC provider, set the redirect URI to http://localhost:35173/auth/callback (or your custom domain — substitute WEB_PORT if you overrode it). The client must support the Authorization Code flow with PKCE.
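A quick way to validate the issuer URL before restarting: every OIDC-compliant provider serves a discovery document at the well-known path below (issuer from the table above; the example uses the Keycloak entry):

```bash
# Expect a JSON document containing authorization_endpoint, token_endpoint, jwks_uri
curl -s https://your-keycloak.example.com/realms/zensearch/.well-known/openid-configuration
```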
Parser backends
ZenSearch ships with three parser backends. The right choice depends on the document mix you'll be ingesting and what hardware you can dedicate to parsing.
| Backend | Image size | OCR | Image extraction | GPU | Best for |
|---|---|---|---|---|---|
| Lite (PARSER_PARSER_BACKEND=lite) | ~200 MB | ✗ Scanned PDFs are flagged but not OCR'd | ✗ | ✗ | Developer Edition, lightweight CPU-only deployments, evaluation, air-gapped without GPU |
| Local (PARSER_PARSER_BACKEND=local) | ~5 GB | ✓ via Docling | ✓ | Optional (NVIDIA) | Production self-hosted with reasonable hardware; full feature set on your own infra |
| Modal (PARSER_PARSER_BACKEND=modal) | n/a (serverless) | ✓ | ✓ | ✓ (GPU) | Cloud deployments and hybrid setups where you don't want to operate the parsing infra |
Document-type support across backends:
| Format | Lite | Local | Modal |
|---|---|---|---|
| PDF (text-based) | ✓ pymupdf4llm | ✓ Docling | ✓ Docling |
| PDF (scanned) | ✗ flagged needs_ocr | ✓ OCR | ✓ OCR |
| DOCX | ✓ python-docx | ✓ Docling | ✓ Docling |
| PPTX | ✓ python-pptx | ✓ Docling | ✓ Docling |
| XLSX | ✓ openpyxl | ✓ Docling | ✓ Docling |
| Plain text / Markdown / HTML | ✓ shared fast path | ✓ shared fast path | ✓ shared fast path |
| Images / image-heavy PDFs | text only | ✓ vision-described | ✓ vision-described |
Picking a backend
- Just evaluating? Stay on Lite. The Developer Edition installer defaults to it.
- Self-hosted production with sub-100k documents? Local is the typical choice. Add an NVIDIA GPU if you have a lot of scanned PDFs or image-heavy slide decks.
- Hybrid or cloud-leaning deployment? Modal lets you keep the parsing burst capacity off your own hardware while keeping the rest of the stack on-prem.
To switch backend, set PARSER_PARSER_BACKEND in your .env and restart the parser container. No data migration required — the backend choice only affects future document parsing.
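For example, moving to the full local parser (a sketch — the local and modal backends require Enterprise images, see "Not Included" above):

```bash
# .env
PARSER_PARSER_BACKEND=local

# Restart so the parser picks up the new backend (restarting only the parser
# container also works if you prefer)
./start.sh --down && ./start.sh
```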
Vision model for image description
Image content search (see Multi-Modal Search) requires a vision-capable chat model on the backend. The default zen-mini mapping in the cloud offering uses one. Self-hosters who pick a chat provider without vision support (e.g. Groq) should set PARSER_IMAGE_DESCRIPTION_ENABLED=false to avoid noisy errors, or point the image describer at a vision-capable provider via the Model Gateway.
Architecture
ZenSearch deploys as a set of containerized services organized into three layers:
- Application Layer — The core platform services that handle search, chat, agents, document processing, AI model routing, and the web interface
- Infrastructure Layer — Databases, caching, object storage, and messaging used by the application services
- Monitoring Layer (optional) — Metrics, dashboards, log aggregation, and alerting
Data Source Connectors
Connectors are deployed selectively — only enable the ones for data sources your organization uses. Each connector runs independently and can be added or removed without affecting the rest of the platform.
ZenSearch supports 13 connector types: S3, GitHub, Confluence, Jira, Slack, Notion, Google Drive, SharePoint, Azure Blob, Web Crawler, Salesforce, SAP, and HubSpot.
Prerequisites
Hardware Requirements
Minimum (small team, < 10,000 documents):
- 8 CPU cores
- 16 GB RAM
- 100 GB SSD storage
- No GPU required (uses cloud LLM APIs)
Recommended (medium team, 10,000–100,000 documents):
- 16 CPU cores
- 32 GB RAM
- 500 GB SSD storage
- Optional: NVIDIA GPU for local document parsing
Large scale (100,000+ documents):
- 32+ CPU cores
- 64+ GB RAM
- 1 TB+ SSD storage
- NVIDIA GPU recommended for local parsing
- Consider running infrastructure on dedicated nodes
Software Requirements
- Docker 24.0+ with Docker Compose v2
- OpenSSL (for generating encryption keys and TLS certificates)
For Kubernetes deployments:
- Kubernetes 1.28+
- Helm 3.x
- kubectl configured for your cluster
Deployment Options
Docker Compose
Single-machine deployment for evaluation and small-to-medium teams. The platform starts with a single command and includes all application services, infrastructure, and your selected connectors.
Best for: Teams of up to ~50 users, evaluation, development, and staging environments.
Kubernetes
Production deployment with horizontal scaling, health checks, rolling updates, and high availability. Helm charts are provided.
Best for: Large teams, production workloads, organizations with existing Kubernetes infrastructure.
Air-Gapped Deployment
For environments with no internet access:
- All container images can be pre-loaded from an internet-connected staging machine (see the sketch after this list)
- Use a self-hosted LLM (Ollama, vLLM, or any OpenAI-compatible server) instead of cloud providers
- Pre-download ML models for local inference
- No internet access required after initial setup
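Pre-loading images typically uses docker save / docker load (a sketch — image names and tags are placeholders; use the image list that ships with your release bundle):

```bash
# On the internet-connected staging machine
docker pull <registry>/<image>:<tag>
docker save <registry>/<image>:<tag> -o zensearch-images.tar

# Transfer zensearch-images.tar to the air-gapped host, then:
docker load -i zensearch-images.tar
```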
Configuration
AI Models
ZenSearch supports multiple AI providers. You can configure which provider and models to use for chat, agents, and embeddings:
- Cloud providers — OpenAI, Anthropic, Cohere, Groq
- Self-hosted models — Ollama, vLLM, or any OpenAI-compatible API endpoint
- Mix and match — Use cloud models for some tasks and local models for others
Embedding models can be configured separately from chat models to optimize cost and performance.
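A sketch of a mixed setup in .env — hosted chat with local embeddings (variable names from the examples earlier in this guide; key and model values are placeholders):

```bash
# Hosted chat provider
LLM_PROVIDER=openai
LLM_API_KEY=<your-api-key>

# Local embeddings via Ollama
LLM_EMBED_PROVIDER=ollama
LLM_EMBED_BASE_URL=http://host.docker.internal:11434
LLM_EMBED_MODEL=nomic-embed-text
```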
Authentication
ZenSearch integrates with your existing identity provider:
- OIDC — Keycloak, Auth0, Okta, Azure AD, and other OIDC-compliant providers
- SAML — Enterprise SSO
- Clerk — Managed authentication service
Connectors
Deploy only the connectors your organization needs. Each connector is configured with credentials for the target data source and can be enabled, paused, or removed at any time through the dashboard.
Guardrails
Guardrails are configured per-team through the dashboard. Features include:
- Prompt injection detection
- PII detection and filtering
- Hallucination detection (lexical, semantic, hybrid)
- Toxicity filtering
- Content moderation
See the Guardrails documentation for configuration details.
Observability — Distributed Tracing
ZenSearch services are instrumented with OpenTelemetry and can export traces to any OTLP-compatible backend (Grafana Tempo, Jaeger, Honeycomb, Datadog, etc.). Enabling tracing gives you end-to-end visibility into a request as it flows through core-api, the Model Gateway, agents, and downstream providers — including per-span timing, model calls, tool invocations, and errors.
Enable on the Core API and Model Gateway:
OTEL_ENABLED=true
OTEL_EXPORTER_ENDPOINT=tempo:4318 # or your OTLP HTTP collector
OTEL_EXPORTER_TYPE=otlp # use "stdout" for local development
Both services auto-set their service.name attribute, so traces are pre-grouped in your backend. Use docker-compose.monitoring.prod.yml for a ready-made Prometheus + Grafana + Tempo stack.
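A hedged sketch of bringing the bundled monitoring stack up alongside the platform (the base compose file name may differ in your release bundle):

```bash
docker compose \
  -f docker-compose.yml \
  -f docker-compose.monitoring.prod.yml \
  up -d
```

With OTEL_EXPORTER_ENDPOINT=tempo:4318 as above, traces land in Tempo and can be browsed from Grafana.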
Security
Encryption
- All stored credentials (API keys, OAuth tokens) are encrypted at rest
- TLS is supported for all inter-service communication in production
- Database connections support SSL/TLS
Network Security
- AI model routing is internal-only and never exposed externally
- Use a reverse proxy (Nginx, Traefik, Caddy) to terminate TLS for the web UI and API
- Configure CORS to only allow your production domain
- Internal services communicate on an isolated network
Access Control
- Role-based access control (Owner, Admin, Editor, Viewer)
- Document-level permissions synced from source platforms
- Search-time permission enforcement — users only see content they're authorized to access
GPU Support
The Developer Edition installer automatically detects NVIDIA GPUs and uses GPU-accelerated document parsing when available. This significantly speeds up document processing for large volumes.
Requirements
- NVIDIA GPU with CUDA support and 4 GB+ VRAM (8 GB recommended)
- NVIDIA drivers installed (nvidia-smi should work)
- NVIDIA Container Toolkit configured for Docker
Installing NVIDIA Container Toolkit
If the installer shows "NVIDIA GPU found but Docker NVIDIA runtime not configured", install the toolkit:
Ubuntu / Debian:
# Add NVIDIA container toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install and configure
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
RHEL / CentOS / Fedora:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
| sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU Access
# Should show your GPU info
docker run --rm --gpus all nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi
Re-run the Installer
After installing the toolkit, re-run the installer. It will detect the GPU and automatically use the GPU-accelerated parser image:
./start.sh --down
rm .env
./start.sh
How It Works
When a compatible GPU is detected, the installer:
- Writes PARSER_GPU_ENABLED=true to .env
- Applies the docker-compose.lite.gpu.yml overlay, which switches to the parser-gpu image
- Adds an NVIDIA device reservation and increases parser memory to 4 GB
- Enables a persistent model cache volume (models are downloaded on first run)
Upgrading
ZenSearch releases are delivered as updated container images. The upgrade process:
- Pull the latest images
- Restart services — database migrations run automatically on startup
- Verify health via the dashboard
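On the Docker Compose Developer Edition, these steps collapse into the single update command from the Management section:

```bash
./start.sh --update   # pulls the latest images and restarts; migrations run on startup
```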
Zero-downtime upgrades are supported on Kubernetes deployments.
Enterprise Getting Started
Enterprise customers receive:
- License key for on-premise deployment
- Private deployment guide with step-by-step instructions
- Container registry access for all platform images
- Dedicated support from our engineering team
Contact [email protected] to discuss your on-premise deployment requirements.