Self-Hosting Guide

Deploy ZenSearch on your own infrastructure with full control over your data, AI models, and network configuration. The free Developer Edition is available for evaluation and small teams. Enterprise On-Premise deployment is available for production workloads.

Developer Edition

The Developer Edition is a free, self-contained Docker Compose deployment — no license key required. Jump to quickstart.

For production deployments with Kubernetes, SSO, and dedicated support, contact [email protected].

Overview

ZenSearch's on-premise deployment gives you:

  • Full data sovereignty — your data never leaves your network
  • Air-gapped support — deploy without internet connectivity
  • Bring your own LLM — use OpenAI, Anthropic, Cohere, Ollama, or any OpenAI-compatible endpoint
  • Infrastructure control — deploy on AWS, GCP, Azure, or bare metal
  • Custom networking — configure VPCs, firewalls, and service mesh as needed

Developer Edition Quickstart

The Developer Edition is a free, self-contained Docker Compose deployment for evaluation and development.

Prerequisites

  • Docker Desktop (or Docker Engine + Docker Compose v2)
  • 8 GB RAM available for Docker
  • An OpenAI API key

Install

curl -fsSL https://releases.zensearch.ai/install.sh | bash

The installer downloads the latest release, prompts for your OpenAI API key, and starts all services.

After a few minutes, the Web UI is available at http://localhost:5173 and the API health endpoint at http://localhost:8080/health.
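Once the installer finishes, you can watch for the API to come up from the shell. A minimal sketch, assuming the default port 8080:

```shell
# Poll the API health endpoint a few times (default port assumed)
status="down"
for attempt in 1 2 3; do
  if curl -fsS http://localhost:8080/health >/dev/null 2>&1; then
    status="up"
    break
  fi
  sleep 2
done
echo "API status: $status"
```

If the status stays "down" well past the startup window, `./diagnose.sh` (described below under Management) is the next step.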

Manual Install

If you prefer not to pipe to bash:

curl -fsSL https://releases.zensearch.ai/developer-edition/latest \
  -o zensearch-dev.tar.gz
tar xzf zensearch-dev.tar.gz
cd zensearch-dev-edition-*
cp .env.lite.example .env
# Edit .env and set OPENAI_API_KEY
./start.sh

Management

./start.sh --down     # Stop all services
./start.sh --update   # Pull latest images and restart
./diagnose.sh         # Health check and diagnostics

What's Included

Component            Description
Core API             REST API, search, chat, agents
Model Gateway        AI model proxy
Web UI               React frontend
Parser               Document parsing
Structure Analyzer   Document structure extraction
Projector            Projection generation
Vectorizer           Embedding generation
PostgreSQL           Database
Redis                Cache
Qdrant               Vector search
MinIO                Object storage
NATS                 Message broker
S3 Collector         Amazon S3 / S3-compatible storage connector
Web Crawler          Website crawling with headless Chrome

GPU Acceleration

If you have an NVIDIA GPU with the NVIDIA Container Toolkit installed, the installer automatically detects it and uses GPU-accelerated document parsing. No manual configuration needed.

Configuration

Resource limits and other settings can be tuned in your .env file:

Variable              Default   Description
PARSER_MEMORY_LIMIT   4G        Memory limit for the document parser; increase for large PDFs on CPU
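For example, to raise the parser's memory ceiling for large CPU-parsed PDFs, set the variable in `.env` and restart (the 8G value here is illustrative, not a recommendation):

```shell
# .env -- raise the parser memory limit (illustrative value)
PARSER_MEMORY_LIMIT=8G
```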

Not Included (Enterprise Features)

  • Reranker, sparse embedder
  • Additional data source connectors (Confluence, Slack, GitHub, Jira, Notion, Google Drive, SharePoint, Azure Blob, Salesforce, SAP, HubSpot)
  • Monitoring stack (Prometheus, Grafana)
  • SAML authentication (OIDC is supported — see Enabling Authentication)
  • Kubernetes / Helm chart deployment
  • Dedicated support

Enabling Authentication

By default, the Developer Edition runs with AUTH_MODE=none — no login required, all users share a single dev identity. This is convenient for evaluation but not suitable for shared or network-accessible deployments.

To enable authentication, configure any OIDC-compatible identity provider (Keycloak, Auth0, Okta, Google, Azure AD, etc.):

  1. Edit .env in your ZenSearch directory:

# Change auth mode
AUTH_MODE=oidc

# OIDC provider settings (example: Keycloak)
OIDC_ISSUER_URL=https://your-keycloak.example.com/realms/zensearch
OIDC_CLIENT_ID=zensearch-api
OIDC_CLIENT_SECRET=your-client-secret

  2. Restart:

./start.sh --down && ./start.sh

The Web UI will now show a login screen. Users are automatically provisioned on first login with their OIDC identity (email, name, roles).
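A quick way to sanity-check an issuer URL is to fetch its OIDC discovery document. The sketch below only builds the well-known URL (the `/.well-known/openid-configuration` path is standard OIDC Discovery), which you can then curl against your provider:

```shell
# Build the standard OIDC discovery URL from an issuer (one trailing slash tolerated)
issuer="https://your-keycloak.example.com/realms/zensearch"
discovery_url="${issuer%/}/.well-known/openid-configuration"
echo "$discovery_url"
# Then verify it returns JSON, e.g.:
#   curl -fsS "$discovery_url" | grep -q authorization_endpoint
```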

Provider examples:

Provider   Issuer URL
Keycloak   https://keycloak.example.com/realms/your-realm
Auth0      https://your-tenant.auth0.com/
Okta       https://your-org.okta.com/oauth2/default
Google     https://accounts.google.com
Azure AD   https://login.microsoftonline.com/{tenant-id}/v2.0
Caution: When configuring your OIDC provider, set the redirect URI to http://localhost:5173/auth/callback (or your custom domain). The client must support the Authorization Code flow with PKCE.
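If you want to confirm your client library's PKCE handling, the S256 code challenge is just the base64url-encoded SHA-256 of the code verifier (RFC 7636). A shell sketch using openssl, with the verifier from the RFC's Appendix B example:

```shell
# Derive an S256 code challenge from a code verifier (RFC 7636, Appendix B values)
code_verifier="dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
code_challenge=$(printf '%s' "$code_verifier" \
  | openssl dgst -sha256 -binary \
  | openssl base64 -A \
  | tr '+/' '-_' | tr -d '=')
echo "$code_challenge"
# -> E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM
```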

Architecture

ZenSearch deploys as a set of containerized services organized into three layers:

  • Application Layer — The core platform services that handle search, chat, agents, document processing, AI model routing, and the web interface
  • Infrastructure Layer — Databases, caching, object storage, and messaging used by the application services
  • Monitoring Layer (optional) — Metrics, dashboards, log aggregation, and alerting

Data Source Connectors

Connectors are deployed selectively — only enable the ones for data sources your organization uses. Each connector runs independently and can be added or removed without affecting the rest of the platform.

ZenSearch supports 13 connector types: S3, GitHub, Confluence, Jira, Slack, Notion, Google Drive, SharePoint, Azure Blob, Web Crawler, Salesforce, SAP, and HubSpot.

Prerequisites

Hardware Requirements

Minimum (small team, < 10,000 documents):

  • 8 CPU cores
  • 16 GB RAM
  • 100 GB SSD storage
  • No GPU required (uses cloud LLM APIs)

Recommended (medium team, 10,000–100,000 documents):

  • 16 CPU cores
  • 32 GB RAM
  • 500 GB SSD storage
  • Optional: NVIDIA GPU for local document parsing

Large scale (100,000+ documents):

  • 32+ CPU cores
  • 64+ GB RAM
  • 1 TB+ SSD storage
  • NVIDIA GPU recommended for local parsing
  • Consider running infrastructure on dedicated nodes

Software Requirements

  • Docker 24.0+ with Docker Compose v2
  • OpenSSL (for generating encryption keys and TLS certificates)

For Kubernetes deployments:

  • Kubernetes 1.28+
  • Helm 3.x
  • kubectl configured for your cluster

Deployment Options

Docker Compose

Single-machine deployment for evaluation and small-to-medium teams. The platform starts with a single command and includes all application services, infrastructure, and your selected connectors.

Best for: Teams of up to ~50 users, evaluation, development, and staging environments.

Kubernetes

Production deployment with horizontal scaling, health checks, rolling updates, and high availability. Helm charts are provided.

Best for: Large teams, production workloads, organizations with existing Kubernetes infrastructure.

Air-Gapped Deployment

For environments with no internet access:

  1. Pre-load all container images from an internet-connected staging machine
  2. Use a self-hosted LLM (Ollama, vLLM, or any OpenAI-compatible server) instead of cloud providers
  3. Pre-download ML models for local inference

After initial setup, no internet access is required.
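The image pre-load in step 1 typically uses `docker save` / `docker load`. The sketch below only assembles and prints the commands so nothing runs by accident; the image names are hypothetical, so substitute the ones from your docker-compose file:

```shell
# Hypothetical image list -- substitute the images from your docker-compose file
IMAGES="zensearch/core-api:latest zensearch/parser:latest postgres:16"
save_cmd="docker save $IMAGES -o zensearch-images.tar"
load_cmd="docker load -i zensearch-images.tar"
echo "$save_cmd"   # run on the connected staging machine
echo "$load_cmd"   # run on the air-gapped host after copying the tarball over
```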

Configuration

AI Models

ZenSearch supports multiple AI providers. You can configure which provider and models to use for chat, agents, and embeddings:

  • Cloud providers — OpenAI, Anthropic, Cohere, Groq
  • Self-hosted models — Ollama, vLLM, or any OpenAI-compatible API endpoint
  • Mix and match — Use cloud models for some tasks and local models for others

Embedding models can be configured separately from chat models to optimize cost and performance.
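As a sketch of what mix-and-match configuration might look like in `.env`, note that every variable name and model ID below is hypothetical; the real keys may differ in your release:

```shell
# Hypothetical .env keys -- check your deployment guide for the real variable names
CHAT_MODEL_PROVIDER=anthropic
CHAT_MODEL=claude-sonnet-4-5
EMBEDDING_MODEL_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
```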

Authentication

ZenSearch integrates with your existing identity provider:

  • OIDC — Keycloak, Auth0, Okta, Azure AD, and other OIDC-compliant providers
  • SAML — Enterprise SSO
  • Clerk — Managed authentication service

Connectors

Deploy only the connectors your organization needs. Each connector is configured with credentials for the target data source and can be enabled, paused, or removed at any time through the dashboard.

Guardrails

Guardrails are configured per-team through the dashboard. Features include:

  • Prompt injection detection
  • PII detection and filtering
  • Hallucination detection (lexical, semantic, hybrid)
  • Toxicity filtering
  • Content moderation

See the Guardrails documentation for configuration details.

Security

Encryption

  • All stored credentials (API keys, OAuth tokens) are encrypted at rest
  • TLS is supported for all inter-service communication in production
  • Database connections support SSL/TLS

Network Security

  • AI model routing is internal-only and never exposed externally
  • Use a reverse proxy (Nginx, Traefik, Caddy) to terminate TLS for the web UI and API
  • Configure CORS to only allow your production domain
  • Internal services communicate on an isolated network

Access Control

  • Role-based access control (Owner, Admin, Editor, Viewer)
  • Document-level permissions synced from source platforms
  • Search-time permission enforcement — users only see content they're authorized to access

GPU Support

The Developer Edition installer automatically detects NVIDIA GPUs and uses GPU-accelerated document parsing when available. This significantly speeds up document processing for large volumes.

Requirements

  • NVIDIA GPU with CUDA support and 4 GB+ VRAM (8 GB recommended)
  • NVIDIA drivers installed (nvidia-smi should work)
  • NVIDIA Container Toolkit configured for Docker

Installing NVIDIA Container Toolkit

If the installer shows "NVIDIA GPU found but Docker NVIDIA runtime not configured", install the toolkit:

Ubuntu / Debian:

# Add NVIDIA container toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install and configure
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

RHEL / CentOS / Fedora:

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
  | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Verify GPU Access

# Should show your GPU info
docker run --rm --gpus all nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi

Re-run the Installer

After installing the toolkit, re-run the installer. It will detect the GPU and automatically use the GPU-accelerated parser image:

./start.sh --down
rm .env
./start.sh

How It Works

When a compatible GPU is detected, the installer:

  • Writes PARSER_GPU_ENABLED=true to .env
  • Applies docker-compose.lite.gpu.yml overlay which switches to the parser-gpu image
  • Adds NVIDIA device reservation and increases parser memory to 4 GB
  • Enables a persistent model cache volume (models are downloaded on first run)

Upgrading

ZenSearch releases are delivered as updated container images. The upgrade process:

  1. Pull the latest images
  2. Database migrations run automatically on startup
  3. Restart services
  4. Verify health via the dashboard

Zero-downtime upgrades are supported on Kubernetes deployments.

Enterprise Getting Started

Enterprise customers receive:

  • License key for on-premise deployment
  • Private deployment guide with step-by-step instructions
  • Container registry access for all platform images
  • Dedicated support from our engineering team

Contact [email protected] to discuss your on-premise deployment requirements.