AI Agents
Agents are AI-powered assistants that can perform complex, multi-step tasks using a suite of tools. Create custom agents tailored to your specific workflows and research needs.
Overview
ZenSearch agents can:
- Execute multi-step research tasks autonomously
- Use specialized tools for search, database queries, and analysis
- Maintain context across conversation turns
- Provide comprehensive, well-researched answers
Agent Page
Accessing Agents
- Click Agents in the left sidebar
- Browse available templates and your custom agents
- Create new agents or edit existing ones
Interface Sections
Templates
Pre-built agent configurations for common use cases:
- View template details
- See enabled tools
- Preview system prompts
- Click Use Template to create an agent
My Agents
Your custom agent instances:
- Edit agent configuration
- Delete agents
- Start a chat session
- View tool configurations
Creating an Agent
Step 1: Start New Agent
Click Create Agent or Use Template to begin.
Step 2: Configure Basics
| Field | Description | Required |
|---|---|---|
| Name | Display name for the agent | Yes |
| Icon | Visual identifier (9 options) | Yes |
| Description | Brief summary of capabilities | No |
| System Prompt | Instructions defining agent behavior | Yes |
| Start Message | Initial greeting or prompt | No |
Step 3: Select Tools
Choose which tools the agent can use:
Document Tools
| Tool | Description |
|---|---|
| search_documents | Search across collections |
| get_document | Retrieve full document content |
| summarize_document | Generate document summaries |
Database Tools
| Tool | Description |
|---|---|
| search_database_schema | Discover database structure |
| query_database | Execute read-only SQL queries |
| get_table_info | Get table columns and types |
Knowledge Graph Tools
| Tool | Description |
|---|---|
| get_document_entities | Extract entities from documents |
| search_knowledge_graph | Find entity relationships |
| get_entity_details | Get detailed entity information |
Memory Tools
| Tool | Description |
|---|---|
| recall_memory | Retrieve previously saved knowledge during execution. Searches across memory types: fact, preference, insight, procedure, summary |
| save_memory | Store important information discovered during execution for future recall. Assign a memory type and importance score |
| search_sessions | Search across the user's past conversations by keyword. Returns matching messages grouped by conversation with titles and timestamps so the agent can pick up where a previous session left off |
| view_procedure | Fetch the full step-by-step body of a learned procedure listed in the system prompt's "Available Procedures" section. Returns ordered steps, tool sequence, common pitfalls, and verification criteria |
Memory tools allow agents to build persistent knowledge over time. For example, an agent can save a user's preferred report format as a preference, then recall it in future conversations without the user repeating themselves. search_sessions lets the agent answer "what did we decide last week about X?" by fetching the actual conversation history rather than relying on recall alone.
Memories are periodically consolidated: near-duplicate entries are merged, recent and frequently-accessed items are preferred, and memories are organized by scope (user, team, agent) so shared knowledge doesn't bleed between tenants. Consolidation runs as a background job by default and can be disabled with MEMORY_CONSOLIDATION_ENABLED=false if you want to inspect raw extraction history.
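As a rough illustration, a consolidation pass might look like the following Go sketch, assuming a simplified Memory record; the real schema, similarity scoring, and scheduling are internal to the background job:

```go
package memory

import (
	"strings"
	"time"
)

// Memory is a simplified stand-in for a stored agent memory.
type Memory struct {
	Scope       string // "user", "team", or "agent"
	Type        string // fact, preference, insight, procedure, summary
	Content     string
	AccessCount int
	UpdatedAt   time.Time
}

// consolidate merges near-duplicate memories within the same scope,
// keeping the entry that is more recent and more frequently accessed.
func consolidate(memories []Memory) []Memory {
	best := map[string]Memory{}
	for _, m := range memories {
		// Scope is part of the key so user/team/agent knowledge never merges.
		key := m.Scope + "|" + m.Type + "|" + normalize(m.Content)
		cur, ok := best[key]
		if !ok || prefer(m, cur) {
			best[key] = m
		}
	}
	out := make([]Memory, 0, len(best))
	for _, m := range best {
		out = append(out, m)
	}
	return out
}

// prefer favors frequently-accessed, then more recent, entries.
func prefer(a, b Memory) bool {
	if a.AccessCount != b.AccessCount {
		return a.AccessCount > b.AccessCount
	}
	return a.UpdatedAt.After(b.UpdatedAt)
}

// normalize is a crude near-duplicate key; production logic would be fuzzier.
func normalize(s string) string {
	return strings.Join(strings.Fields(strings.ToLower(s)), " ")
}
```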
Utility Tools
| Tool | Description |
|---|---|
| calculate | Perform mathematical calculations |
| get_datetime | Get current date and time |
Step 4: Set Knowledge Scope
Optionally filter which collections the agent can access:
- Click Knowledge Base dropdown
- Select specific collections
- Leave empty for all collections
Step 5: Save
Click Save to create your agent.
Writing System Prompts
The system prompt defines your agent's personality and behavior.
Best Practices
- Define the role clearly: "You are a technical documentation assistant specializing in helping developers understand our codebase."
- Specify capabilities: "You can search documentation, retrieve code examples, and explain technical concepts."
- Set boundaries: "Focus only on technical questions. For HR or policy questions, direct users to the appropriate resources."
- Define response style: "Provide concise, accurate answers with code examples when relevant. Always cite your sources."
Example System Prompts
Research Assistant
You are a research assistant helping users find and
synthesize information from company documents.
Capabilities:
- Search across all connected data sources
- Summarize long documents
- Compare information from multiple sources
- Provide cited answers
Guidelines:
- Always cite your sources with specific document names
- If information conflicts, highlight the discrepancy
- Ask clarifying questions when queries are ambiguous
- Provide confidence levels for your answers
Sales Intelligence Agent
You are a sales intelligence agent helping the sales team
understand prospects, deals, and market trends.
You have access to:
- Salesforce CRM data
- HubSpot marketing data
- Company presentations and proposals
When answering questions:
- Provide specific data points with dates
- Compare current metrics to historical data
- Identify trends and patterns
- Suggest actionable insights
Code Expert
You are a code expert for our engineering team.
Capabilities:
- Search code repositories
- Explain complex code patterns
- Find usage examples
- Identify dependencies
Guidelines:
- Include code snippets in your responses
- Link to relevant documentation
- Explain the "why" behind code decisions
- Suggest best practices when appropriate
Using Agents
Starting a Chat
- Go to Agents page
- Click Chat on any agent
- You'll be taken to the Ask page with the agent activated
Agent Indicator
When an agent is active, you'll see:
- Agent banner with name and icon
- Agent badge on messages
- Different response behavior
Switching Agents
- Click the agent selector on the Ask page
- Choose a different agent or disable
- Agent context switches immediately
Agent Execution
How Agents Work
- Receive Query: Agent receives your question
- Plan (if enabled): Creates a strategy
- Execute: Runs tools iteratively
- Synthesize: Combines results into answer
Execution Flow
Query → Planning → Tool Execution → Synthesis → Response
            ↓              ↓
        Strategy     Iteration Loop
                     (up to max_iterations)
Parallel Tool Execution
When an agent needs to call multiple tools that don't depend on each other, it executes them concurrently rather than sequentially. This significantly reduces response time for multi-tool tasks.
For example, if an agent needs to search two different collections and query a database, all three calls run at the same time instead of one after another.
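As a rough sketch, with generic ToolCall and ToolResult types standing in for the real internal ones, the concurrent fan-out looks like this:

```go
package agent

import (
	"context"
	"sync"
)

type ToolCall struct {
	Name string
	Args map[string]any
}

type ToolResult struct {
	Call ToolCall
	Out  string
	Err  error
}

// runParallel executes tool calls that have no dependencies on each other
// at the same time and collects results in the original order.
func runParallel(ctx context.Context, calls []ToolCall,
	exec func(context.Context, ToolCall) (string, error)) []ToolResult {

	results := make([]ToolResult, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c ToolCall) {
			defer wg.Done()
			out, err := exec(ctx, c)
			results[i] = ToolResult{Call: c, Out: out, Err: err}
		}(i, c)
	}
	wg.Wait()
	return results
}
```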
Reasoning Trace
Each agent step includes a collapsible "thinking" block that shows the agent's internal reasoning. This transparency lets you understand why the agent chose a particular tool, what it expected to find, and how it interpreted results.
Reasoning traces are visible in the progress display:
[Iteration 1/5]
├── Thinking: The user wants Q4 revenue broken down by region.
│ I'll search financial reports first, then query the sales
│ database for exact figures.
├── Calling: search_documents
│ Query: "Q4 2024 revenue by region"
└── Result: Found 6 documents
Click any thinking block to expand or collapse it. Traces are preserved in conversation history for later review.
Progressive Retrieval
When an agent detects that its initial search results have low confidence or insufficient coverage, it autonomously fetches additional context. The agent re-queries with refined terms, broader scope, or alternative phrasings until it reaches a satisfactory confidence level or exhausts its iteration budget.
This means the agent self-corrects rather than producing a low-quality answer from limited sources.
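A simplified sketch of the loop, assuming a hypothetical search function that reports a confidence score alongside its results and a refine step that rewrites the query:

```go
package agent

import "context"

type SearchResult struct {
	Documents  []string
	Confidence float64 // 0.0 to 1.0, how well the results cover the query
}

// progressiveRetrieve keeps refining the query until confidence is high
// enough or the iteration budget is spent.
func progressiveRetrieve(ctx context.Context, query string, maxIterations int,
	search func(context.Context, string) (SearchResult, error),
	refine func(query string, last SearchResult) string) (SearchResult, error) {

	var last SearchResult
	q := query
	for i := 0; i < maxIterations; i++ {
		res, err := search(ctx, q)
		if err != nil {
			return last, err
		}
		last = res
		if res.Confidence >= 0.7 { // satisfactory coverage, stop early
			return res, nil
		}
		// Low confidence: try broader scope or alternative phrasing.
		q = refine(q, res)
	}
	return last, nil // best effort within the iteration budget
}
```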
Error Recovery
When a tool call fails due to a transient error (timeout, rate limit, upstream service error), the agent automatically retries the call once before reporting the failure. If the retry succeeds, execution continues seamlessly. If it fails, the agent receives an enriched error message explaining that an automatic retry was already attempted, helping it decide whether to try a different approach.
Recovery is transparent — automatic retries do not count against the tool call limit, and successful recoveries are logged in the reasoning trace for full visibility.
Enabled by default via AGENT_RECOVERY_ENABLED=true. Set to false if you prefer transient failures to surface immediately for debugging.
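Conceptually, the retry wrapper behaves like this sketch; the real error classification and message wording are internal details:

```go
package agent

import (
	"context"
	"errors"
	"fmt"
)

// ErrTransient marks timeouts, rate limits, and upstream service failures.
var ErrTransient = errors.New("transient tool error")

// callWithRecovery retries a failed tool call once when the error looks
// transient. If the retry also fails, the error is enriched so the LLM
// knows a retry was already attempted and can try a different approach.
func callWithRecovery(ctx context.Context, name string,
	call func(context.Context) (string, error)) (string, error) {

	out, err := call(ctx)
	if err == nil || !errors.Is(err, ErrTransient) {
		return out, err
	}
	// One automatic retry; it does not count against the tool call limit.
	out, retryErr := call(ctx)
	if retryErr == nil {
		return out, nil
	}
	return "", fmt.Errorf("%s failed after an automatic retry (%w); consider a different tool or query", name, retryErr)
}
```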
Context Compaction
For long-running conversations with many tool calls, the agent's message history can grow large. Context compaction summarizes older messages into a compact summary that preserves:
- Search queries that were executed
- Key findings from tool results
- Names of tools that were used
- The last user request before compaction
Recent messages are always kept verbatim. When compaction runs multiple times in a conversation, prior summaries are chained so earlier context is never fully lost.
Compaction can trigger on either a message-count threshold or a token-aware threshold — when the conversation exceeds a fraction of the model's context window (default 50%), older content is pruned and summarized before it crowds out room for the answer.
Enabled by default via AGENT_COMPACTION_ENABLED=true. Tuning knobs:
| Variable | Default | Purpose |
|---|---|---|
| AGENT_COMPACTION_THRESHOLD | 30 | Message count that triggers compaction (fallback when token threshold is 0) |
| AGENT_COMPACTION_PRESERVE | 10 | Number of recent messages always kept verbatim |
| AGENT_COMPACTION_TOKEN_THRESHOLD | 0.5 | Trigger fraction of context window (0 to disable token-aware path) |
When observational memory (below) is enabled, the LLM-generated observation prefix replaces heuristic compaction — the two compression layers are mutually exclusive so the agent never double-summarizes the same conversation segment.
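The trigger decision can be pictured with this sketch, with parameters mirroring the defaults above (the summarization itself is not shown):

```go
package agent

// shouldCompact decides whether older messages get summarized, using either
// a message-count threshold or a token-aware fraction of the context window.
func shouldCompact(messageCount, usedTokens, contextWindow int,
	countThreshold int, tokenThreshold float64) bool {

	if tokenThreshold > 0 && contextWindow > 0 {
		// Token-aware path: compact before the history crowds out the answer.
		return float64(usedTokens) >= tokenThreshold*float64(contextWindow)
	}
	// Fallback path when the token threshold is disabled (set to 0).
	return messageCount >= countThreshold
}
```

With the defaults, for example, a 42-message conversation that has used 70,000 tokens of a 128,000-token window compacts via the token-aware path.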
Observational Memory
After every long conversation turn (AGENT_OBSERVATION_THRESHOLD, default 10000 tokens), a lightweight zen-mini call runs asynchronously in the background to extract a structured Observation from the raw messages:
- Key findings: factual statements the agent established
- Tool summary: which tools fired, with their purpose and outcome
- User intent: what the user is trying to accomplish
- Pending items: anything the agent planned but hasn't finished
Observations are persisted in the conversation_observations table. On subsequent agent runs in the same conversation, the orchestrator loads the most recent observation and injects it as a cacheable system-prompt prefix instead of replaying the raw history. This typically achieves 80%+ token compression on 50+ message conversations while keeping the prefix byte-stable so provider prompt caches hit.
When observations themselves accumulate past a second threshold (AGENT_REFLECTION_THRESHOLD, default 20000 tokens), a Reflector consolidates many observations into a single higher-level "session summary" that captures recurring topics, established facts, and user patterns. Reflections take precedence over raw observations in the prefix.
The observer call is rate-limited to once per conversation per 5 minutes so even a chatty user can't drive runaway token spend, and the entire path is fail-open: if the LLM call fails or returns malformed JSON, the run falls back to deterministic compaction without surfacing the error.
Enabled by default via AGENT_OBSERVATIONAL_MEMORY_ENABLED=true. Set to false if model spend is tight or your conversations are short.
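In outline, the background trigger looks roughly like this sketch; the extraction call itself and the persistence layer are elided:

```go
package agent

import (
	"context"
	"log"
	"time"
)

// maybeObserve runs observation extraction in the background after a long
// turn. The path is fail-open: any error falls back to deterministic
// compaction instead of surfacing to the user.
func maybeObserve(ctx context.Context, conversationID string, turnTokens int,
	lastObservedAt time.Time,
	extract func(context.Context, string) error) {

	const threshold = 10000             // AGENT_OBSERVATION_THRESHOLD default
	const minInterval = 5 * time.Minute // rate limit per conversation

	if turnTokens < threshold || time.Since(lastObservedAt) < minInterval {
		return
	}
	go func() {
		if err := extract(ctx, conversationID); err != nil {
			// A failed LLM call or malformed JSON is logged, never fatal.
			log.Printf("observation skipped for %s: %v", conversationID, err)
		}
	}()
}
```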
Procedural Memory
After successful multi-tool sessions (≥5 tool calls AND ≥0.7 synthesis confidence), the same async trigger that handles observations also runs a ProcedureExtractor — a separate zen-mini call that distills the workflow into a structured ProcedureMemory:
- Name (snake_case identifier)
- Trigger pattern (one sentence: "when asked to debug pipeline health")
- Steps (ordered, with optional tool name per step)
- Tool sequence (distinct tools in first-use order)
- Pitfalls (mistakes the transcript demonstrates)
- Verification (how to confirm success)
Procedures are stored in agent_memories with memory_type='procedure'. On future agent runs, the orchestrator:
- Loads every procedure for the team and renders only the name + trigger pattern in the system prompt's "Available Procedures" section (~50 tokens per procedure — flat cost regardless of how many workflows the agent has learned)
- Keyword-matches the incoming user query against each procedure's trigger pattern. If a match is found, the agent gets a one-line system-message hint suggesting it call view_procedure(name="...") to load the full step body
- The LLM decides whether to follow the suggested procedure or deviate based on the specific query
- If a matched procedure leads to a successful run (SynthesisConfidence ≥ 0.7), its SuccessCount is incremented post-execution. Procedures with SuccessCount ≥ 3 are prefixed with a ★ in the system prompt list so the LLM prefers battle-tested workflows over new ones
This is a self-improving skill loop — agents learn reusable workflows from their own successful runs without operator intervention. Enabled by default via AGENT_PROCEDURAL_MEMORY_ENABLED=true. Requires AGENT_OBSERVATIONAL_MEMORY_ENABLED=true because the procedure extraction path piggybacks on the observation trigger's async goroutine.
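The prompt rendering and trigger matching can be sketched as follows; the real matcher is more forgiving than this keyword check:

```go
package agent

import (
	"fmt"
	"strings"
)

type Procedure struct {
	Name         string // snake_case identifier
	Trigger      string // e.g. "when asked to debug pipeline health"
	SuccessCount int
}

// renderProcedureList builds the "Available Procedures" section: name and
// trigger only (roughly 50 tokens each), with a star on proven entries.
func renderProcedureList(procs []Procedure) string {
	var b strings.Builder
	for _, p := range procs {
		star := ""
		if p.SuccessCount >= 3 {
			star = "★ "
		}
		fmt.Fprintf(&b, "- %s%s: %s\n", star, p.Name, p.Trigger)
	}
	return b.String()
}

// matchProcedure does a crude keyword match of the query against trigger
// patterns and returns a hint suggesting view_procedure for the first match.
func matchProcedure(query string, procs []Procedure) (string, bool) {
	q := strings.ToLower(query)
	for _, p := range procs {
		for _, word := range strings.Fields(strings.ToLower(p.Trigger)) {
			if len(word) > 3 && strings.Contains(q, word) {
				hint := fmt.Sprintf("A learned procedure may apply: call view_procedure(name=%q)", p.Name)
				return hint, true
			}
		}
	}
	return "", false
}
```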
Multi-Agent Delegation
When an agent encounters a subtask that's better handled by a specialist, it can delegate to another agent in the same team. Two tools enable this:
| Tool | Description |
|---|---|
discover_agents | List specialist agents available for delegation. Returns each agent's name and description so the LLM can pick the right one for the subtask |
delegate_to_agent | Hand off a subtask to a named specialist. The specialist runs in an isolated execution context with its own tools and knowledge base, and returns its answer + sources back to the parent agent |
Delegation is depth-limited at 2 levels deep (MaxDelegationDepth = 2) to prevent infinite recursion — sub-agents at the maximum depth do not get the delegation tools registered, so they can't delegate further.
Use cases:
- A general research agent delegates SQL-heavy questions to a database specialist
- A support triage agent hands off technical deep-dives to a code expert
- A briefing agent asks a CRM specialist for customer-specific data before composing the final answer
Delegation is wired automatically — the orchestrator's AgentLookup adapter pulls all active agents in the team from the AgentService.List endpoint, so any agent you create in the UI immediately becomes a valid delegation target. There is no env flag to enable; the tools register whenever an AgentLookup is configured (which is always in production wiring). Sub-agent depth, system prompt, tool list, and knowledge base are all inherited from the specialist's stored configuration.
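The depth guard amounts to something like this sketch:

```go
package agent

const MaxDelegationDepth = 2

// registerDelegationTools adds discover_agents and delegate_to_agent only
// when the current depth still allows another hop, so a sub-agent at the
// maximum depth can never delegate further.
func registerDelegationTools(depth int, register func(name string)) {
	if depth >= MaxDelegationDepth {
		return
	}
	register("discover_agents")
	register("delegate_to_agent")
}
```

A top-level agent runs at depth 0, its specialist at depth 1, and that specialist's sub-agent at depth 2 receives no delegation tools at all.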
Large Result Caching
Tool calls that return very large payloads (big search result sets, long documents, wide database query results) are persisted to a result cache rather than stuffed back into the prompt. The tool output message includes a short preview plus a handle, and the agent uses a retrieve_full_result tool to pull specific slices on demand. This lets agents work with results far larger than the context window without truncation.
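In outline, with a hypothetical store function and an illustrative inline size limit:

```go
package agent

import "fmt"

// cacheLargeResult persists an oversized tool output and hands the LLM a
// short preview plus a handle it can pass to retrieve_full_result later.
func cacheLargeResult(raw string, maxInline int,
	store func(payload string) (handle string)) string {

	if len(raw) <= maxInline {
		return raw // small enough to inline in the prompt
	}
	handle := store(raw)
	preview := raw[:maxInline]
	return fmt.Sprintf(
		"%s…\n[truncated: %d bytes total; call retrieve_full_result with handle %q for specific slices]",
		preview, len(raw), handle)
}
```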
Weighted Tool Budgets
Not all tool calls should count equally against the agent's budget. Read-only tools (search, get_document, query_database) are weighted lighter than write tools (create_artifact, send_email, update_ticket), so an agent can comfortably explore a topic with many searches before bumping into its tool-call limit — while write operations remain tightly capped for safety.
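A sketch of the idea, with illustrative weights rather than the real configuration:

```go
package agent

// toolWeights is an illustrative weighting: read-only exploration is cheap,
// write operations consume the budget quickly.
var toolWeights = map[string]float64{
	"search_documents": 0.5,
	"get_document":     0.5,
	"query_database":   0.5,
	"create_artifact":  2.0,
	"send_email":       3.0,
}

// chargeTool returns the remaining budget after a call, using weight 1.0
// for any tool not listed above.
func chargeTool(remaining float64, tool string) float64 {
	w, ok := toolWeights[tool]
	if !ok {
		w = 1.0
	}
	return remaining - w
}
```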
Tool-Level Output Guardrails
In addition to input and final-response guardrails, individual tool results can be run through guardrails before the agent sees them. This catches cases where a database query, search result, or external API response contains PII, secrets, or prohibited content and redacts or blocks it before the agent can incorporate it into its answer.
Dynamic Budget Tiers
Not every question deserves the same budget. A factoid lookup ("what was last quarter's revenue?") doesn't need the same iteration allowance as a deep comparative analysis. ZenSearch classifies each query and applies a complexity-aware tier before the first LLM call fires:
| Tier | When it fires | Iterations | Tool Calls | Tokens | Wall-clock |
|---|---|---|---|---|---|
| Factoid | Simple lookup questions | 10 | 20 | 50k | 120s |
| Procedural | How-to / step-by-step | 15 | 30 | 80k | 180s |
| Exploratory | Open-ended research (default) | 25 | 50 | 150k | 300s |
| Comparative | N-way comparison, tradeoffs | 30 | 60 | 180k | 360s |
| Automation | Scheduled / event-triggered runs | 50 | 100 | 300k | 600s |
Tiers only bump budgets upward — a deployment with a generous operator floor is never shrunk just because a query looks simple. Automation runs always get the deep-research automation tier because they're non-interactive and the answer is the deliverable.
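The bump-upward rule can be sketched like this, with the tier table above encoded directly:

```go
package agent

type Budget struct {
	Iterations, ToolCalls, Tokens, TimeoutSeconds int
}

// tiers mirrors the table above.
var tiers = map[string]Budget{
	"factoid":     {10, 20, 50_000, 120},
	"procedural":  {15, 30, 80_000, 180},
	"exploratory": {25, 50, 150_000, 300},
	"comparative": {30, 60, 180_000, 360},
	"automation":  {50, 100, 300_000, 600},
}

// applyTier only ever raises limits above the operator-configured floor;
// a simple-looking query never shrinks a generous deployment's budget.
func applyTier(floor Budget, tier string) Budget {
	t, ok := tiers[tier]
	if !ok {
		t = tiers["exploratory"] // default tier
	}
	return Budget{
		Iterations:     maxInt(floor.Iterations, t.Iterations),
		ToolCalls:      maxInt(floor.ToolCalls, t.ToolCalls),
		Tokens:         maxInt(floor.Tokens, t.Tokens),
		TimeoutSeconds: maxInt(floor.TimeoutSeconds, t.TimeoutSeconds),
	}
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}
```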
Cost Ceiling & Live Cost Meter
Budgets protect against runaway loops; the cost ceiling protects against runaway dollars. Every agent run is gated by a dollar cap:
- Pre-flight gate: Before any LLM tokens are spent, the agent estimates the worst-case cost (max tokens × model input/output rates × the tier's iteration count). If the estimate exceeds AGENT_MAX_COST_PER_RUN_USD (default $2.00), the run is rejected immediately. If AGENT_MAX_COST_PER_TEAM_DAY_USD is set, the gate also sums the team's rolling 24h spend and rejects runs that would push the total over the team-daily cap.
- Mid-run gate: If actual cost crosses the cap during execution, the agent finalizes a partial answer with truncation_reason="cost" rather than burning more budget.
The chat UI renders a live cost meter next to the agent's steps (e.g. $0.43 / $2.00), updating every few iterations. Users see at a glance when they're close to the cap and can decide whether to extend the budget and continue.
Every run writes a row to the agent_cost_usage table at start (with the worst-case estimate) and updates it on completion (with actuals). The observability dashboard reads this to show actual-vs-estimate variance, pre-flight rejection rate, and per-model spend.
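The pre-flight gate reduces to a check along these lines; the per-token rates are placeholders, and real prices come from the configured model:

```go
package agent

import "fmt"

// preflightCostGate estimates worst-case spend before any tokens are used
// and rejects runs that would exceed the per-run or team-daily dollar caps.
func preflightCostGate(maxTokensPerCall, iterations int,
	inputRate, outputRate, maxPerRun, teamDayCap, teamSpent24h float64) error {

	worstCase := float64(maxTokensPerCall*iterations) * (inputRate + outputRate)
	if maxPerRun > 0 && worstCase > maxPerRun {
		return fmt.Errorf("estimated cost $%.2f exceeds AGENT_MAX_COST_PER_RUN_USD $%.2f", worstCase, maxPerRun)
	}
	if teamDayCap > 0 && teamSpent24h+worstCase > teamDayCap {
		return fmt.Errorf("run would push rolling 24h team spend past $%.2f", teamDayCap)
	}
	return nil
}
```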
Pause & Resume on Budget Exhaustion
When an agent hits any soft-limit — iterations, tool calls, tokens, wall-clock, or cost — it doesn't just fail. It:
- Synthesizes a partial answer from whatever research it completed, clearly marked "Based on partial research: …"
- Saves a resumable checkpoint to Redis with the full conversation, tool history, and running budget state (7-day TTL by default)
- Renders a Continue button in the chat UI showing which budget was exhausted (e.g. "Cost limit reached — $2.00 / $2.00 used")
Clicking Continue extends the budget by a fresh config delta, preserves everything the agent has already learned, and re-enters the execution graph where it left off — no re-research, no repeated work. Users who triggered a deep-research run Friday afternoon can resume it Monday morning.
Paused runs are user-scoped: you can only resume your own runs, and cross-team access is rejected with a 404 so valid run IDs can't be probed. Authorization is double-checked both in the checkpoint key (softlimit:{teamID}:{userID}:{runID}) and in the inner state.
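The key scoping and ownership double-check can be sketched as:

```go
package agent

import (
	"errors"
	"fmt"
)

// ErrNotFound maps to a 404 so valid run IDs can't be probed across teams.
var ErrNotFound = errors.New("paused run not found")

// checkpointKey scopes paused runs to a team and user.
func checkpointKey(teamID, userID, runID string) string {
	return fmt.Sprintf("softlimit:%s:%s:%s", teamID, userID, runID)
}

// authorizeResume re-checks ownership against the state stored inside the
// checkpoint, not just the key, before a Continue is honored.
func authorizeResume(reqTeam, reqUser, stateTeam, stateUser string) error {
	if reqTeam != stateTeam || reqUser != stateUser {
		return ErrNotFound // deliberately indistinguishable from a missing run
	}
	return nil
}
```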
Iteration Limits
Agents have configurable limits that act as the safety net beneath the dynamic tiers:
| Setting | Env var | Default | Description |
|---|---|---|---|
| Max Iterations | AGENT_MAX_ITERATIONS | 25 | Floor for the tool-calling iteration budget. PlanNode may bump this upward based on the planned step count |
| Max Iterations Hard Cap | AGENT_MAX_ITERATIONS_HARD_CAP | 50 | Absolute upper bound after plan-driven bumping. Safety net for pathological plans |
| Max Tool Calls | AGENT_MAX_TOOL_CALLS | 50 | Maximum total tool invocations per request |
| Wall-clock Timeout | AGENT_TIMEOUT_SECONDS | 300 | Primary cost/safety backstop — the agent always halts within this window |
| Max Tokens Per Run | AGENT_MAX_TOKENS_PER_RUN | 150,000 | Cumulative LLM token budget across every Chat call. Exceeded → partial answer with truncation_reason="tokens" |
| Per-Tool Timeout | AGENT_PER_TOOL_TIMEOUT_SECONDS | 60 | Upper bound on a single tool call's wall-clock duration. Prevents one slow connector from starving faster siblings in multi-connector queries |
| Max Cost Per Run | AGENT_MAX_COST_PER_RUN_USD | 2.00 | Per-run dollar ceiling. Enforced both pre-flight and mid-run. Set to 0 to disable |
| Max Cost Per Team-Day | AGENT_MAX_COST_PER_TEAM_DAY_USD | 0 | Rolling 24h team-wide dollar cap. 0 disables |
| Soft-limit Pause TTL | AGENT_SOFT_LIMIT_PAUSE_TTL_HOURS | 168 | How long a paused run stays resumable (default 7 days) |
Observing Progress
During execution, you'll see real-time updates:
[Planning]
Creating a strategy to answer your question...
[Iteration 1/5]
├── Thinking: I need to find revenue data first
├── Calling: search_documents
│ Query: "quarterly revenue 2024"
└── Result: Found 6 documents
[Iteration 2/5]
├── Thinking: Now I need specific Q4 numbers
├── Calling: query_database
│ SQL: "SELECT quarter, revenue FROM sales..."
└── Result: 4 rows returned
[Synthesizing]
Combining information from 6 documents and database query...
Canvas Artifacts Creation
Agents can create canvas artifacts — persistent, versioned content objects such as reports, analyses, code files, or structured documents. When an agent produces a substantial piece of content, it can save it as an artifact that you can revisit, edit, and iterate on.
Artifacts are created automatically when the agent determines the output warrants a persistent document. You can also prompt the agent explicitly:
"Create a report summarizing our Q4 performance"
"Write an onboarding checklist for new engineers"
"Draft a project proposal based on these requirements"
See the Canvas & Artifacts page for full details on versioning, diff view, and editing.
Agent Templates
Available Templates
ZenSearch provides templates for common use cases:
| Template | Description | Tools |
|---|---|---|
| Research Assistant | General research and synthesis | search, summarize |
| Data Analyst | Database queries and analysis | database tools, calculate |
| Code Expert | Code search and explanation | search, get_document |
| Knowledge Navigator | Entity and relationship discovery | knowledge graph tools |
Customizing Templates
- Click Use Template
- Modify name, description, or prompt
- Add or remove tools
- Save as your own agent
Agent Instructions
Agent instructions provide contextual guidance that shapes how agents behave. Instructions are scoped to specific teams, collections, or individual agents, and can be filtered by user role.
How Instructions Work
Instructions are injected into the agent's system prompt based on the current context:
- Team-scoped: Apply to all agents within a team (e.g., "Always use formal language")
- Collection-scoped: Apply when the agent searches a specific collection (e.g., "Financial data is in USD unless stated otherwise")
- Agent-scoped: Apply to a single agent instance (e.g., "Focus on technical accuracy over brevity")
Role-Based Filtering
Instructions can be restricted to specific roles. For example, an instruction like "Include cost data in reports" might only apply to users with the Admin or Editor role, while Viewers see a simplified version.
Agent Automations
Automations let you trigger agents on external events without manual intervention. An automation connects a trigger (the event that starts the agent) to a delivery method (where the result goes).
Triggers
| Trigger | Description |
|---|---|
| Cron Schedule | Run on a recurring schedule (e.g., daily summary at 9 AM) |
| Slack Message | Activate when a message matches a pattern in a Slack channel |
| Webhook | Activate when an external system sends a webhook event |
| Inbound Email | Triggered by emails delivered to a configured address |
| Event Subscription | Activated by Jira / Zendesk / GitHub / Confluence / Salesforce / HubSpot events |
| Meeting | Fired when a Zoom / Google Meet / Microsoft Teams meeting completes |
Delivery Methods
| Method | Description |
|---|---|
| Webhook | POST the agent's response to an endpoint |
| Slack | Send the response to a Slack channel or thread |
| Email | Send the response via email |
Acceptance Criteria
Automations can define quality thresholds that the agent's output must meet. When configured, the agent evaluates its own response against these criteria before completing:
| Criterion | Description |
|---|---|
| Min Confidence | Minimum synthesis confidence score (0.0–1.0) |
| Min Sources | Minimum number of source documents cited |
| Require Answer | Response must contain a non-empty answer |
The verification result (satisfied, partial, or unsatisfied) is recorded with the automation run for audit and monitoring. Each criterion includes the actual vs. expected values so you can diagnose why a run failed verification.
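The self-evaluation reduces to a check like this sketch:

```go
package automation

type Criteria struct {
	MinConfidence float64
	MinSources    int
	RequireAnswer bool
}

type RunResult struct {
	Confidence float64
	Sources    int
	Answer     string
}

// verify returns "satisfied", "partial", or "unsatisfied" depending on how
// many of the configured criteria the agent's output meets.
func verify(c Criteria, r RunResult) string {
	checks := []bool{
		r.Confidence >= c.MinConfidence,
		r.Sources >= c.MinSources,
		!c.RequireAnswer || r.Answer != "",
	}
	passed := 0
	for _, ok := range checks {
		if ok {
			passed++
		}
	}
	switch {
	case passed == len(checks):
		return "satisfied"
	case passed > 0:
		return "partial"
	default:
		return "unsatisfied"
	}
}
```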
Stale State Detection
Before an automation runs, the system can perform pre-flight checks to verify that the data it depends on is fresh:
- KB Sync Freshness: Checks whether the knowledge base collections have been synced within a configurable time window (default: 24 hours)
- Agent Config Drift: Checks whether the agent's configuration was modified after the automation was last updated
Two policies are available:
| Policy | Behavior |
|---|---|
| Warn | Log a warning and proceed with execution |
| Block | Fail the run immediately with a descriptive reason |
This prevents automations from silently producing stale or incorrect results when underlying data sources haven't been refreshed.
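The pre-flight freshness check amounts to something like this sketch:

```go
package automation

import (
	"fmt"
	"time"
)

type Policy string

const (
	Warn  Policy = "warn"
	Block Policy = "block"
)

// checkFreshness runs the pre-flight staleness checks and applies the
// configured policy: warn logs and proceeds, block fails the run.
func checkFreshness(lastKBSync, agentUpdatedAt, automationUpdatedAt time.Time,
	maxAge time.Duration, policy Policy, warn func(string)) error {

	var reasons []string
	if time.Since(lastKBSync) > maxAge {
		reasons = append(reasons, fmt.Sprintf("knowledge base last synced %s ago", time.Since(lastKBSync).Round(time.Hour)))
	}
	if agentUpdatedAt.After(automationUpdatedAt) {
		reasons = append(reasons, "agent configuration changed after the automation was last updated")
	}
	if len(reasons) == 0 {
		return nil
	}
	if policy == Block {
		return fmt.Errorf("stale state: %v", reasons)
	}
	for _, r := range reasons {
		warn(r)
	}
	return nil
}
```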
Auto-Resume on Soft-Limit Pause
Long-running automations (deep-research, competitive analysis, batch ticket triage) can exhaust budgets mid-run — especially on the deep automation tier with its larger per-run allowances. Rather than failing those runs and losing the partial research, ZenSearch auto-resumes paused automation runs on the next scheduler tick.
When an automation run hits a soft-limit, it writes a resumable checkpoint (same mechanism as user-triggered pause/resume) and transitions to paused. On the next cron tick, the scheduler picks up the paused run, rehydrates its state, extends the budgets, and re-enters the execution graph where it left off. The automation's run_count isn't inflated by paused intermediates — only fully terminal runs (success / failed / timeout / abandoned) count.
Each automation can configure a max_resume_attempts in its config JSONB (default 3). Once a pause/resume chain exceeds the cap, the run transitions to abandoned and the next tick starts a fresh run from scratch instead. Chains are linked via a parent_run_id back-pointer so operators can render the full audit trail in the run list.
The status lifecycle:
pending → running → { success | failed | timeout | paused }
paused → superseded (parent marked once a child resumes)
paused → abandoned (resume_attempts exceeded)
superseded is the terminal state for a parent run once a child takes over; the child's terminal status is the real outcome of the continuation. Dashboards showing "successful runs" count only the tip of each chain (the final child run), not superseded parents.
Shadow mode
Wiring a new automation against production traffic is risky — a half-tuned prompt can spam Slack, write the wrong record, or send an embarrassing email reply. Shadow mode lets you observe what an automation would do without actually delivering the result.
When shadow mode is enabled on an automation:
- The trigger still fires every time it would in production.
- The agent runs end-to-end and produces a complete result.
- Delivery is suppressed — Slack messages aren't posted, emails aren't sent, webhooks aren't called.
- The full run, including the unsent result, is captured in the run trace for review.
Promote to live in one click once you're satisfied the agent is producing the right output. Until then, you can iterate on the prompt, tools, and acceptance criteria with zero blast radius.
Shadow mode is especially useful when:
- Rolling out an automation to a new team for the first time.
- Iterating on a prompt against real production events you can't easily replay.
- Validating that acceptance criteria are tuned correctly before production traffic depends on them.
Example Automations
- Daily Digest: Cron trigger at 8 AM, agent summarizes new documents from the past 24 hours, delivers via Slack
- Ticket Triage: Webhook trigger from Zendesk, agent classifies and routes the ticket, delivers via webhook back to Zendesk
- Weekly Report: Cron trigger every Monday, agent compiles metrics from connected databases, delivers via email
- Deep Research (auto-resume): Daily cron triggers a multi-step competitive analysis. If the run exhausts its budget mid-research, it pauses; the next tick resumes it with a fresh budget delta, preserving everything already gathered
Best Practices
Agent Design
- Single purpose: One agent per major use case
- Minimal tools: Only enable needed tools
- Clear prompts: Specific, actionable instructions
- Scoped access: Limit to relevant collections
Using Agents Effectively
- Be specific: Detailed questions get better results
- Provide context: Mention relevant timeframes, projects
- Review sources: Verify agent findings in cited documents
- Iterate: Ask follow-up questions for depth
When to Use Agents vs Chat
| Use Agent | Use Direct Chat |
|---|---|
| Multi-step research | Simple fact lookup |
| Data analysis | Quick questions |
| Comparative studies | Definition queries |
| Report generation | Document retrieval |
Troubleshooting
Agent Not Responding
- Check iteration/timeout limits
- Verify tool permissions
- Ensure collections have content
- Try simplifying the question
Incorrect Tool Usage
- Review system prompt clarity
- Check tool selection
- Verify collection scope
- Adjust prompt instructions
Slow Execution
- Complex queries take longer
- Database queries may be slow
- Large collections need more search time
- Consider narrowing scope
Next Steps
- Canvas & Artifacts - Persistent content objects
- Integrations - Agent tool integrations
- Knowledge - Manage data sources
- Search - Advanced search features
- API - Agent API reference