Self-Hosting Guide
Deploy ZenSearch on your own infrastructure with full control over your data, AI models, and network configuration. The free Developer Edition covers evaluation and small teams; Enterprise On-Premise deployment covers production workloads.
The Developer Edition is a free, self-contained Docker Compose deployment — no license key required. Jump to quickstart.
For production deployments with Kubernetes, SSO, and dedicated support, contact [email protected].
Overview
ZenSearch's on-premise deployment gives you:
- Full data sovereignty — your data never leaves your network
- Air-gapped support — deploy without internet connectivity
- Bring your own LLM — use OpenAI, Anthropic, Cohere, Ollama, or any OpenAI-compatible endpoint
- Infrastructure control — deploy on AWS, GCP, Azure, or bare metal
- Custom networking — configure VPCs, firewalls, and service mesh as needed
Developer Edition Quickstart
The Developer Edition is a free, self-contained Docker Compose deployment for evaluation and development.
Prerequisites
- Docker Desktop (or Docker Engine + Docker Compose v2)
- 8 GB RAM available for Docker
- An OpenAI API key
Install
curl -fsSL https://releases.zensearch.ai/install.sh | bash
The installer downloads the latest release, prompts for your OpenAI API key, and starts all services.
After a few minutes: Web UI at http://localhost:5173, API health at http://localhost:8080/health.
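If you would rather script the wait than watch the terminal, a small polling helper works. This is a sketch; the endpoint and port are the quickstart defaults above:

```shell
#!/usr/bin/env bash
# Poll a health endpoint until it responds, or give up after a timeout (seconds).
wait_healthy() {
  local url="$1" timeout="${2:-120}" elapsed=0
  until curl -fsS "$url" >/dev/null 2>&1; do
    sleep 2
    elapsed=$((elapsed + 2))
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "gave up after ${timeout}s" >&2
      return 1
    fi
  done
  echo "healthy: $url"
}

# Usage: wait_healthy http://localhost:8080/health
```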
Manual Install
If you prefer not to pipe to bash:
curl -fsSL https://releases.zensearch.ai/developer-edition/latest \
-o zensearch-dev.tar.gz
tar xzf zensearch-dev.tar.gz
cd zensearch-dev-edition-*
cp .env.lite.example .env
# Edit .env and set OPENAI_API_KEY
./start.sh
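A minimal .env needs only the variables documented on this page; everything else keeps its default (the key value below is a placeholder):

```shell
# .env - minimal Developer Edition configuration
OPENAI_API_KEY=sk-your-key-here   # required; the only mandatory setting
AUTH_MODE=none                    # default; see Enabling Authentication
PARSER_MEMORY_LIMIT=4G            # default; raise for large PDFs on CPU
```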
Management
./start.sh --down # Stop all services
./start.sh --update # Pull latest images and restart
./diagnose.sh # Health check and diagnostics
What's Included
| Component | Description |
|---|---|
| Core API | REST API, search, chat, agents |
| Model Gateway | AI model proxy |
| Web UI | React frontend |
| Parser | Document parsing |
| Structure Analyzer | Document structure extraction |
| Projector | Projection generation |
| Vectorizer | Embedding generation |
| PostgreSQL | Database |
| Redis | Cache |
| Qdrant | Vector search |
| MinIO | Object storage |
| NATS | Message broker |
| S3 Collector | Amazon S3 / S3-compatible storage connector |
| Web Crawler | Website crawling with headless Chrome |
GPU Acceleration
If you have an NVIDIA GPU with the NVIDIA Container Toolkit installed, the installer automatically detects it and uses GPU-accelerated document parsing. No manual configuration needed.
Configuration
Resource limits and other settings can be tuned in your .env file:
| Variable | Default | Description |
|---|---|---|
| PARSER_MEMORY_LIMIT | 4G | Memory limit for the document parser. Increase for large PDFs on CPU |
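For example, to give the parser more headroom when processing large PDFs on CPU:

```shell
# .env
PARSER_MEMORY_LIMIT=8G
```

Restart with ./start.sh --down && ./start.sh for the change to take effect.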
Not Included (Enterprise Features)
- Reranker, sparse embedder
- Additional data source connectors (Confluence, Slack, GitHub, Jira, Notion, Google Drive, SharePoint, Azure Blob, Salesforce, SAP, HubSpot)
- Monitoring stack (Prometheus, Grafana)
- SAML authentication (OIDC is supported — see Enabling Authentication)
- Kubernetes / Helm chart deployment
- Dedicated support
Enabling Authentication
By default, the Developer Edition runs with AUTH_MODE=none — no login required, all users share a single dev identity. This is convenient for evaluation but not suitable for shared or network-accessible deployments.
To enable authentication, configure any OIDC-compatible identity provider (Keycloak, Auth0, Okta, Google, Azure AD, etc.):
- Edit .env in your ZenSearch directory:
# Change auth mode
AUTH_MODE=oidc
# OIDC provider settings (example: Keycloak)
OIDC_ISSUER_URL=https://your-keycloak.example.com/realms/zensearch
OIDC_CLIENT_ID=zensearch-api
OIDC_CLIENT_SECRET=your-client-secret
- Restart:
./start.sh --down && ./start.sh
The Web UI will now show a login screen. Users are automatically provisioned on first login with their OIDC identity (email, name, roles).
Provider examples:
| Provider | Issuer URL |
|---|---|
| Keycloak | https://keycloak.example.com/realms/your-realm |
| Auth0 | https://your-tenant.auth0.com/ |
| Okta | https://your-org.okta.com/oauth2/default |
| Google | https://accounts.google.com |
| Azure AD | https://login.microsoftonline.com/{tenant-id}/v2.0 |
When configuring your OIDC provider, set the redirect URI to http://localhost:5173/auth/callback (or your custom domain). The client must support the Authorization Code flow with PKCE.
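Before restarting, it can save a debugging round-trip to confirm the issuer URL is correct: every OIDC-compliant provider serves a discovery document at a well-known path relative to the issuer. A small helper (a sketch; note that some issuers, such as Auth0's, end in a trailing slash that must not be doubled):

```shell
#!/usr/bin/env bash
# Build the OIDC discovery URL from an issuer, tolerating a trailing slash.
discovery_url() {
  local issuer="${1%/}"   # strip one trailing slash if present
  printf '%s/.well-known/openid-configuration' "$issuer"
}

# Then check that the provider answers, e.g.:
#   curl -fsS "$(discovery_url "$OIDC_ISSUER_URL")"
```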
Architecture
ZenSearch deploys as a set of containerized services organized into three layers:
- Application Layer — The core platform services that handle search, chat, agents, document processing, AI model routing, and the web interface
- Infrastructure Layer — Databases, caching, object storage, and messaging used by the application services
- Monitoring Layer (optional) — Metrics, dashboards, log aggregation, and alerting
Data Source Connectors
Connectors are deployed selectively — only enable the ones for data sources your organization uses. Each connector runs independently and can be added or removed without affecting the rest of the platform.
ZenSearch supports 13 connector types: S3, GitHub, Confluence, Jira, Slack, Notion, Google Drive, SharePoint, Azure Blob, Web Crawler, Salesforce, SAP, and HubSpot.
Prerequisites
Hardware Requirements
Minimum (small team, < 10,000 documents):
- 8 CPU cores
- 16 GB RAM
- 100 GB SSD storage
- No GPU required (uses cloud LLM APIs)
Recommended (medium team, 10,000–100,000 documents):
- 16 CPU cores
- 32 GB RAM
- 500 GB SSD storage
- Optional: NVIDIA GPU for local document parsing
Large scale (100,000+ documents):
- 32+ CPU cores
- 64+ GB RAM
- 1 TB+ SSD storage
- NVIDIA GPU recommended for local parsing
- Consider running infrastructure on dedicated nodes
Software Requirements
- Docker 24.0+ with Docker Compose v2
- OpenSSL (for generating encryption keys and TLS certificates)
For Kubernetes deployments:
- Kubernetes 1.28+
- Helm 3.x
- kubectl configured for your cluster
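The OpenSSL requirement is for generating secrets, for example a random key for encrypting stored credentials (which configuration variable consumes it depends on your deployment guide):

```shell
# 32 random bytes, hex-encoded (64 characters), suitable as an encryption key
openssl rand -hex 32
```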
Deployment Options
Docker Compose
Single-machine deployment for evaluation and small-to-medium teams. The platform starts with a single command and includes all application services, infrastructure, and your selected connectors.
Best for: Teams of up to ~50 users, evaluation, development, and staging environments.
Kubernetes
Production deployment with horizontal scaling, health checks, rolling updates, and high availability. Helm charts are provided.
Best for: Large teams, production workloads, organizations with existing Kubernetes infrastructure.
Air-Gapped Deployment
For environments with no internet access:
- All container images can be pre-loaded from an internet-connected staging machine
- Use a self-hosted LLM (Ollama, vLLM, or any OpenAI-compatible server) instead of cloud providers
- Pre-download ML models for local inference
- No internet access required after initial setup
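Pre-loading images is plain docker save / docker load; a checksum guards against a corrupt transfer. A sketch (file names are illustrative; use the image list from your release bundle):

```shell
#!/usr/bin/env bash
# On the internet-connected staging machine:
#   docker save -o zensearch-images.tar $(cat images.txt)
#   sha256sum zensearch-images.tar > zensearch-images.tar.sha256
#
# On the air-gapped host, verify the transfer before loading:
verify_and_load() {
  sha256sum -c "$1.sha256" || return 1   # refuse a corrupt or tampered transfer
  docker load -i "$1"
}
```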
Configuration
AI Models
ZenSearch supports multiple AI providers. You can configure which provider and models to use for chat, agents, and embeddings:
- Cloud providers — OpenAI, Anthropic, Cohere, Groq
- Self-hosted models — Ollama, vLLM, or any OpenAI-compatible API endpoint
- Mix and match — Use cloud models for some tasks and local models for others
Embedding models can be configured separately from chat models to optimize cost and performance.
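As an illustration only (the variable names below are hypothetical placeholders, not ZenSearch's actual keys; consult your deployment guide), the mix-and-match setup amounts to pointing chat and embeddings at different providers:

```shell
# Hypothetical .env keys, illustrative only
CHAT_PROVIDER=ollama
CHAT_BASE_URL=http://ollama:11434/v1   # any OpenAI-compatible endpoint works
EMBEDDING_PROVIDER=openai              # embeddings configured independently of chat
```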
Authentication
ZenSearch integrates with your existing identity provider:
- OIDC — Keycloak, Auth0, Okta, Azure AD, and other OIDC-compliant providers
- SAML — Enterprise SSO
- Clerk — Managed authentication service
Connectors
Deploy only the connectors your organization needs. Each connector is configured with credentials for the target data source and can be enabled, paused, or removed at any time through the dashboard.
Guardrails
Guardrails are configured per-team through the dashboard. Features include:
- Prompt injection detection
- PII detection and filtering
- Hallucination detection (lexical, semantic, hybrid)
- Toxicity filtering
- Content moderation
See the Guardrails documentation for configuration details.
Security
Encryption
- All stored credentials (API keys, OAuth tokens) are encrypted at rest
- TLS is supported for all inter-service communication in production
- Database connections support SSL/TLS
Network Security
- AI model routing is internal-only and never exposed externally
- Use a reverse proxy (Nginx, Traefik, Caddy) to terminate TLS for the web UI and API
- Configure CORS to only allow your production domain
- Internal services communicate on an isolated network
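A minimal reverse-proxy sketch, assuming the quickstart ports (Web UI on 5173, API on 8080) and a hypothetical /api/ path split; adapt the routing to your deployment:

```nginx
# /etc/nginx/conf.d/zensearch.conf (sketch)
server {
    listen 443 ssl;
    server_name zensearch.example.com;

    ssl_certificate     /etc/ssl/certs/zensearch.crt;
    ssl_certificate_key /etc/ssl/private/zensearch.key;

    # Hypothetical path split; check your deployment guide for actual routes
    location /api/ {
        proxy_pass http://127.0.0.1:8080/;   # Core API
        proxy_set_header Host $host;
    }
    location / {
        proxy_pass http://127.0.0.1:5173;    # Web UI
        proxy_set_header Host $host;
    }
}
```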
Access Control
- Role-based access control (Owner, Admin, Editor, Viewer)
- Document-level permissions synced from source platforms
- Search-time permission enforcement — users only see content they're authorized to access
GPU Support
The Developer Edition installer automatically detects NVIDIA GPUs and uses GPU-accelerated document parsing when available. This significantly speeds up document processing for large volumes.
Requirements
- NVIDIA GPU with CUDA support and 4 GB+ VRAM (8 GB recommended)
- NVIDIA drivers installed (nvidia-smi should work)
- NVIDIA Container Toolkit configured for Docker
Installing NVIDIA Container Toolkit
If the installer shows "NVIDIA GPU found but Docker NVIDIA runtime not configured", install the toolkit:
Ubuntu / Debian:
# Add NVIDIA container toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install and configure
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
RHEL / CentOS / Fedora:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
| sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU Access
# Should show your GPU info
docker run --rm --gpus all nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi
Re-run the Installer
After installing the toolkit, re-run the installer. It will detect the GPU and automatically use the GPU-accelerated parser image:
./start.sh --down
rm .env
./start.sh
How It Works
When a compatible GPU is detected, the installer:
- Writes PARSER_GPU_ENABLED=true to .env
- Applies the docker-compose.lite.gpu.yml overlay, which switches to the parser-gpu image
- Adds an NVIDIA device reservation and increases the parser memory limit to 4 GB
- Enables a persistent model cache volume (models are downloaded on first run)
Upgrading
ZenSearch releases are delivered as updated container images. The upgrade process:
- Pull the latest images
- Restart services (database migrations run automatically on startup)
- Verify health via the dashboard
Zero-downtime upgrades are supported on Kubernetes deployments.
Enterprise Getting Started
Enterprise customers receive:
- License key for on-premise deployment
- Private deployment guide with step-by-step instructions
- Container registry access for all platform images
- Dedicated support from our engineering team
Contact [email protected] to discuss your on-premise deployment requirements.