RAG Configuration

Retrieval Augmented Generation (RAG) is the core technology that allows AI agents to leverage your organization’s knowledge & database. Prisme.ai provides two powerful approaches to customize RAG behavior: YAML-based tools and webhooks. These methods give you granular control over every aspect of the RAG pipeline.

Configuration Approaches

AI Knowledge - No-Code UI
AI Builder - YAML Tools
AI Builder - Webhooks

For basic configurations, you can use the built-in UI settings:

Instructions
- Keep clear and precise instructions with Titles and Sections.
  - Specify language (Reply in the user language, reply in English only, …)
  - Tone (Friendly, empathic, formal, …)
  - Output instructions (concise, detailled, …)
- Documents and tools results placed in the $ keyword
- Date placed in the $ keyword. Specify your users timezone.
Text Splitter Configuration
- Chunk Size
- Chunk Overlap
- Enable override by document
Embeddings Settings
- Number of chunks to retrieve
Self-Query
- Enable filtering on tags set on your documents
- Configure from AI Store input
- Used:
  - Automatically, in AI > Self Query > Enabled, so the AI dynamically chooses which tags to use for a query
  - Set by the user, In AI > Self Query > Enabled by the user. Users will see a ”+” button in the AI Store to add tags.
  - Set mandatory tags for user or groups, in the User sharing page.
Query Enhancement
- Select model
- Add instructions and definitions
Post-Processing
- Show suggested questions
- Filter displayed sources

For advanced customization, create YAML-based tools to override specific parts of the RAG pipeline:

# Example tool definition
slug: query-reformulation
name: AIK/Tools/QueryReformulation
do:
  - Knowledge Client.chat-completion:
      messages:
        - role: assistant
          content: 'Given the following user question, generate 3 alternative phrasings that express the same intent. Return the result as a JSON object with a "reformulations" key.'
        - role: user
          content: 'Question: "{{user_question}}"'
      output: reformulations
  - set:
      name: output
      value:
        value: '{{reformulations}}'
        description: 'alternative phrasings of user question'

YAML tools give you complete control with the full power of AI Builder.To read the filtered tags of the self-query feature in a custom tool, access body.metadata.aikTags from your tool’s automation.

For external processing or integration with existing systems, configure webhooks to intercept and modify RAG behavior:

// Example webhook response to override retrieved chunks
{
  "chunks": [
    {
      "value": {
        "content": "Custom content to be injected into the LLM prompt",
        "knowledgeId": "document-id-123"
      }
    }
  ]
}

Webhooks provide maximum flexibility and integration capabilities.

RAG Pipeline Components

The RAG pipeline in Prisme.ai consists of several stages, each of which can be customized using YAML tools or webhooks:

Query Processing

Transform and enhance the user’s question before retrieval

Retrieval

Find relevant documents in your knowledge base

Context Assembly

Organize retrieved documents into a coherent context

Prompt Generation

Create the prompt that will be sent to the LLM

Response Generation

Generate the final answer using the LLM

Post-Processing

Enhance the response with additional information or formatting

Documents Tags

Documents can have tags assigned to them. These tags can be used to reduce the load on the embedding model when retrieving documents by narrowing the search scope. They can also help avoid conflicts when multiple documents express opposing statements depending on the context. You have three ways to use tags:

Automatic:
- Enabled in AI > Self Query > Enabled. The AI automatically determines which tags to use for each query.
User-defined:
- Enabled in AI > Self Query > Enabled by the user. Users can manually add tags using the “+” button available in the AI Store.
Mandatory (by user or group):
- Configured in the User sharing page, where you can assign required tags to specific users or groups.

Automatic
Set by the user
Mandatory for a user or group

Enable the self-query option in AI > Self Query > Enabled to let the LLM automatically determine the relevant tags for each question.
The system will then filter documents dynamically based on the inferred tags.

When multiple tag sources are combined, the logic follows this order:

All tags defined in the User panel are combined with OR logic.
All tags selected in the AI Store (chat interface) are also combined with OR logic.
Then, both resulting sets are combined using AND logic.

This means: A user with mandatory tags [A, B] and selected tags [C, D] will retrieve documents that contain at least one tag from each set, resulting in the following combinations: [A, C], [A, D], [B, C], [B, D].

YAML Tool Examples

Query Reformulation

This tool enhances user queries by generating alternative phrasings:

slug: query-reformulator
name: AIK/Tools/QueryReformulator
do:
  - Knowledge Client.chat-completion:
      messages:
        - role: assistant
          content: 'Given the following user question, generate 3 alternative phrasings that express the same intent. Return the result as a JSON object with a "reformulations" key.'
        - role: user
          content: 'Question: "{{user_question}}"'
      output: reformulations
  - set:
      name: output
      value:
        value: '{{reformulations}}'
        description: 'alternative phrasings of user question'

Usage scenario: Improve retrieval quality for ambiguous or tersely worded queries.

Web Search Tool

This tool augments the knowledge base with real-time web search results:

slug: Web-browsing
name: AIK/Tools/Web browsing
do:
  - conditions:
      '{{body.projectId}} != {{config.agentCredentials.projectId}} || !{{run.sourceWorkspaceId}} || {{run.sourceWorkspaceId}} != {{global.workspacesRegistry["ai-knowledge"].id}}':
        - set:
            name: output
            value:
              error: Forbidden
        - break: {}
  - Serper.run:
      type: '{{body.arguments.type}}'
      q: '{{body.arguments.query}}'
      ql: '{{body.arguments.country}}'
      location: '{{body.arguments.location}}'
      num: 20
      output: search
      hl: '{{body.arguments.lg}}'
      tbs: '{{body.arguments.tbs}}'
  - set:
      name: currentDate
      value: '{% date({{run.date}}).iso %}'
  - set:
      name: output
      value:
        value: '{{search}}'
        description: List of search results
validateArguments: true
arguments:
  body:
    type: object
    properties:
      arguments:
        type: object
        properties:
          type:
            type: string
            enum:
              - search
              - news
              - images
              - videos
              - places
              - reviews
              - scholar
              - patents
              - shopping
          query:
            type: string
            title: User query
          country:
            type: string
            title: Country
          lg:
            type: string
            title: lg
          location:
            type: string
            title: location
          tbs:
            title: Date range
            type: string
            enum:
              - ''
              - qdr:h
              - qdr:d
              - qdr:w
              - qdr:m
              - qdr:y
            enumNames:
              - Any time
              - Past hours
              - Past 24 hours
              - Past week
              - Past month
              - Past year
description: Return list of web browsing results.
output:
  type: tool_result
  output: '{{output}}'
when:
  endpoint: true
labels:
  - tools

Usage scenario: Supplement your knowledge base with up-to-date information from the web.

Semantic Chunking

This tool implements intelligent document chunking based on semantic boundaries:

slug: semantic-chunker
name: AIK/Tools/SemanticChunker
do:
  - Knowledge Client.chat-completion:
      messages:
        - role: system
          content: "You are a document processing assistant. Your task is to split the document into semantic chunks maintaining coherent topics and context. Each chunk should be between 200-500 tokens."
        - role: user
          content: "Please split the following document into semantic chunks:\n\n{{document_text}}"
      output: chunks_result
  - set:
      name: output
      value:
        value: '{{chunks_result}}'
        description: 'Document split into semantic chunks'

Usage scenario: Improve retrieval quality for documents with complex structure or mixed topics.

Dynamic RAG Parameters

This tool dynamically adjusts retrieval parameters based on query complexity:

slug: dynamic-rag-params
name: AIK/Tools/DynamicRAGParams
do:
  - Knowledge Client.chat-completion:
      messages:
        - role: system
          content: "Analyze the user query and determine optimal RAG parameters. Return a JSON object with: {\"chunks\": number of chunks to retrieve (1-10), \"threshold\": minimum similarity score (0.1-0.9), \"diversity\": diversity factor (0.1-0.9)}"
        - role: user
          content: "Query: {{user_question}}"
      output: rag_params
  - set:
      name: output
      value:
        value: '{{rag_params}}'
        description: 'Dynamically optimized RAG parameters'

Usage scenario: Automatically optimize retrieval for different query types (simple vs. complex, factual vs. exploratory).

Context Fusion

This tool combines information from multiple retrieved chunks into a coherent context:

slug: context-fusion
name: AIK/Tools/ContextFusion
do:
  - Knowledge Client.chat-completion:
      messages:
        - role: system
          content: "You are a document synthesis expert. Your task is to combine the following document chunks into a coherent, non-redundant context that preserves all relevant information. Focus on removing duplicated information and creating logical transitions between topics."
        - role: user
          content: "User query: {{user_question}}\n\nDocument chunks:\n{{retrieved_chunks}}"
      output: fused_context
  - set:
      name: output
      value:
        value: '{{fused_context}}'
        description: 'Synthesized context from multiple sources'

Usage scenario: Create more coherent context for complex queries that require information from multiple documents.

Webhook Integration

Webhooks provide an alternative approach to customizing the RAG pipeline by intercepting key events and modifying the behavior via external HTTP endpoints.

Webhook Configuration

To configure a webhook for your AI Knowledge agent:

Go to your agent’s settings
Navigate to the “Webhooks” section
Enter your HTTPS webhook URL
Select which events to subscribe to

All webhook endpoints must use HTTPS and respond within 30 seconds to avoid timeouts.

Document Events

Webhooks can intercept document creation, update, and deletion events:Request Example (Document Creation):

{
  "type": "documents_created",
  "projectId": "project-id-123",
  "sessionId": "admin-session-id",
  "project": {},
  "payload": {
    "id": "doc-id-123",
    "document": {
      "name": "Sample Document",
      "text": "Document content...",
      "source": {
        "url": "https://example.com/doc"
      },
      "tags": ["product", "technical"],
      "textSplitter": {},
      "status": "published"
    }
  }
}

Response Example (Modify Document):

{
  "status": "published",
  "text": "Modified document content...",
  "name": "Processed Document",
  "tags": ["product", "technical", "verified"],
  "textSplitter": {
    "chunkSize": 300,
    "chunkOverlap": 50
  }
}

This allows you to:

Preprocess documents before indexing
Add or modify metadata
Override chunking settings per document
Reject documents that don’t meet criteria

Query Interception

Webhooks can intercept user queries and modify various aspects of the RAG process:Request Example:

{
  "type": "queries",
  "projectId": "project-id-123",
  "sessionId": "user-session-id",
  "project": {},
  "payload": {
    "input": "How do I reset my password?",
    "messageId": "msg-id-123",
    "history": []
  }
}

Response Options:

Override Retrieved Context:

{
  "chunks": [
    {
      "value": {
        "content": "To reset your password, go to Settings > Security > Reset Password. You'll receive an email with instructions.",
        "knowledgeId": "doc-id-456"
      }
    }
  ]
}

Override Prompt Generation:

{
  "prompt": [
    {
      "role": "system",
      "content": "You are a helpful assistant specializing in account security. Use the following context to answer the user's question: [Context: Password resets require verification via email. The reset link expires in 24 hours.]"
    },
    {
      "role": "user",
      "content": "How do I reset my password?"
    }
  ]
}

Override Answer Generation:

{
  "answer": "To reset your password:\n1. Go to Settings > Security\n2. Click 'Reset Password'\n3. Check your email for instructions\n4. Follow the link within 24 hours\n\nIf you need further assistance, contact support@example.com."
}

Override AI Parameters:

{
  "aiParameters": {
    "model": "gpt-4",
    "prompt": "Custom prompt template...",
    "max_tokens": 5000,
    "temperature": 0.9,
    "history": false
  }
}

Override Search Results:

{
  "searchResults": [
    {
      "url": "https://help.example.com/password-reset",
      "content": "Learn how to reset your password and recover access to your account.",
      "name": "Password Reset Guide"
    }
  ]
}

Test Results Webhook

Webhooks can also intercept test results for analysis and evaluation:Request Example:

{
  "type": "tests_results",
  "projectId": "project-id-123",
  "sessionId": "admin-session-id",
  "project": {},
  "payload": {
    "test": {
      "question": "How do I update my email preferences?",
      "referenceAnswer": "Go to your profile settings and select 'Email Preferences' to update your subscription settings.",
      "answer": "To update your email preferences, navigate to your profile settings and click on the 'Email Preferences' tab where you can modify your subscription settings.",
      "context": "Generated context content..."
    }
  }
}

Response Example:

{
  "analysis": "The answer is correct and comprehensive, covering how to find and update email preferences. The content aligns well with the reference answer.",
  "score": 2,
  "context": 1.5
}

This allows you to:

Implement custom evaluation metrics
Track test results in external systems
Apply domain-specific scoring criteria

Combining YAML Tools and Webhooks

For the most sophisticated RAG configurations, you can combine YAML tools and webhooks:

Sequential Pipeline

Chain multiple YAML tools to create a sequential processing pipeline, with webhooks for external integration at key points.Example: YAML tool for query reformulation → webhook for sensitive query detection → YAML tool for retrieval customization

Fallback Mechanisms

Configure webhooks as fallbacks when YAML tools don’t produce satisfactory results.Example: Try native retrieval first, but if no good matches are found, call webhook to query external knowledge bases

A/B Testing

Use different YAML tools or webhooks based on query characteristics or for experimentation.Example: Route technical questions through one pipeline and customer support questions through another

Hybrid Processing

Let webhooks handle some RAG components while YAML tools handle others.Example: Webhook handles retrieval from proprietary databases, YAML tool handles context optimization

Best Practices

YAML Tool Development

Start with simple tools and incrementally add complexity
Use a consistent naming convention for tools
Thoroughly test tools with varied inputs
Document each tool’s purpose and expected inputs/outputs
Consider performance implications for complex processing

Webhook Implementation

Ensure webhooks are hosted on reliable, low-latency infrastructure
Implement proper error handling and fallbacks
Cache results when appropriate to improve response times
Use secure authentication to protect sensitive data
Monitor webhook performance and error rates

RAG Pipeline Design

Clearly define which components require customization
Choose the appropriate approach (UI, YAML, webhook) based on complexity
Test changes incrementally to isolate effects
Monitor key metrics before and after changes
Document your RAG pipeline configuration for maintainability

Next Steps

Tools Integration

Learn more about integrating specialized tools with AI Knowledge

Agent Testing

Test and validate your RAG configuration

Advanced RAG

Explore sophisticated RAG architectures

Analytics

Monitor your RAG pipeline’s performance

Overview

AI SecureChat

AI Store

AI Knowledge

AI Builder

AI Governance

AI Collection (beta)

AI Insights (beta)

Configuration Approaches

RAG Pipeline Components

Documents Tags

YAML Tool Examples

Webhook Integration

Combining YAML Tools and Webhooks

Sequential Pipeline

Fallback Mechanisms

A/B Testing

Hybrid Processing

Best Practices

Next Steps

Tools Integration

Agent Testing

Advanced RAG

Analytics

Overview

AI SecureChat

AI Store

AI Knowledge

AI Builder

AI Governance

AI Collection (beta)

AI Insights (beta)

​Configuration Approaches

​RAG Pipeline Components

​Documents Tags

​YAML Tool Examples

​Webhook Integration

​Combining YAML Tools and Webhooks

Sequential Pipeline

Fallback Mechanisms

A/B Testing

Hybrid Processing

​Best Practices

​Next Steps

Tools Integration

Agent Testing

Advanced RAG

Analytics

Configuration Approaches

RAG Pipeline Components

Documents Tags

YAML Tool Examples

Webhook Integration

Combining YAML Tools and Webhooks

Best Practices

Next Steps