Knowledge Base

The Knowledge Base allows your agent to access documents, tables, and internal websites using RAG (Retrieval-Augmented Generation). Instead of relying solely on prompts, the agent performs semantic searches on vectors representing your content, retrieving relevant snippets and inserting them into the conversation context.

What is RAG?
RAG combines document retrieval (via semantic similarity) with text generation. You store your materials as vectors; when a question is received, the agent searches for the most similar vectors and, using those snippets, generates more accurate and grounded answers.
🔗 Learn more about RAG (Wikipedia)
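
The retrieval step described above can be sketched with toy data. This is only an illustration: the three-dimensional vectors are made up, and cosine distance is an assumed metric (this page does not state which distance the platform uses). The names `cosine_distance` and `retrieve` are hypothetical helpers, not part of any API.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two vectors: 0 means same direction, larger means less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def retrieve(query_vector, rows, n=2):
    """Return the n rows whose vectors are closest to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query_vector, row["vector"]))
    return ranked[:n]

# Made-up 3-dimensional vectors; real embeddings have far more dimensions.
rows = [
    {"text": "Support is available by WhatsApp during business hours.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Quotes can be requested through the website form.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Our office is closed on public holidays.", "vector": [0.0, 0.1, 0.9]},
]

# Stands in for the embedding of a question such as "How do I contact support?"
query_vector = [0.8, 0.2, 0.1]

snippets = retrieve(query_vector, rows, n=2)

# The retrieved snippets are placed in the context that the generator receives.
prompt = "Answer using only this context:\n" + "\n".join(r["text"] for r in snippets)
print(prompt)
```

The generation step is omitted; in a real setup the prompt would be sent to a language model together with the user's question.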

Usage Examples

  • Internal manuals and policies: PDFs, DOCX, Markdown.
  • FAQs and support bases: CSV or JSON exported from legacy systems.
  • Online documentation: site crawler for intranet or portals.
  • Metrics dashboards: spreadsheets with indicators the agent can query.

Available Embedding Models

The choice of embedding model directly impacts accuracy, cost, and vectorization time:

  • Large Models

    text-embedding-3-large

    • Pros: captures complex semantic nuances, ideal for long or highly technical texts.
    • Cons: high cost (R$1.20/1M tokens) and slower vectorization.
  • Medium Models

    bge-en-icl

    • Pros: good balance between accuracy and cost (R$0.185/1M tokens).
    • Cons: slight loss of accuracy in extremely specific cases.
  • Small Models

    text-embedding-3-small

    • Pros: low cost (R$0.185/1M tokens) and fast vectorization, ideal for very large volumes.
    • Cons: lower fidelity in detailed semantic similarity.
| Model | Cost | Recommended Use |
| --- | --- | --- |
| text-embedding-3-large | R$1.20 / 1M tokens | Long documents, high semantic sensitivity |
| bge-en-icl | R$0.185 / 1M tokens | General applications, good cost/accuracy trade-off |
| text-embedding-3-small | R$0.185 / 1M tokens | Huge volumes, minimal cost |

Attention: changing the model recreates all table vectors, generating additional costs. Check usage in the Dashboard after synchronization.
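
To get a feel for what a first vectorization or a model change might cost, the per-token prices listed above can be turned into a quick estimate. This is a rough sketch: the row count and average row size are hypothetical, and the actual billing shown in the Dashboard is the reference.

```python
# Prices in R$ per 1M tokens, copied from the model list above.
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-large": 1.20,
    "bge-en-icl": 0.185,
    "text-embedding-3-small": 0.185,
}

def vectorization_cost(total_tokens: int, model: str) -> float:
    """Estimated cost in R$ of vectorizing total_tokens with the given model."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]

# Hypothetical table: 5,000 rows averaging 400 tokens each.
tokens = 5_000 * 400

for model in PRICE_PER_MILLION_TOKENS:
    print(f"{model}: R$ {vectorization_cost(tokens, model):.2f}")
```

Because changing the model recreates every vector, the estimate for a switch is the whole table's token count priced at the new model's rate.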

Creating a Table

Why “table” and not “document”?

Each table is a set of vectorized rows (or “chunks”); think of it as an internal CSV:

  • Flexibility: you can merge multiple files, spreadsheets, and pages into a single source.
  • Granularity: each row is a text fragment, preventing the agent from bringing irrelevant snippets or having to read an entire document to find an answer.
  • Editing: edit, remove, or add individual rows without reimporting everything.
  • Visualization: the table view shows the fragments exactly as the agent will query and use them.

Click + Create Table (🔝 top right corner) and choose:

Empty Table

  • Name: unique identifier (snake_case).
  • Description: guides the agent when deciding to use this table.
  • Embedding Model: select one of the models above.
  • Save: the table appears empty; proceed to import or insert manually.

Upload File

  • Formats: PDF, DOCX, TXT, MD, JSON, CSV.
  • Set Name, Description, and Model.
  • Upload or drag the file.
  • Split by page (applies only to PDF and DOCX): splits the document page by page into chunks of up to 20,000 characters, with a 100-character overlap between consecutive chunks by default.
  • Save: populates the table with these chunks.
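
The splitting behavior described for Split by page can be approximated as follows. This is a minimal sketch of fixed-size chunking with overlap, assuming each page is split independently and the overlap repeats the last characters of the previous chunk; the platform's actual splitter may behave differently. `split_into_chunks` and `split_by_page` are hypothetical names.

```python
def split_into_chunks(text: str, max_chars: int = 20_000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Each chunk after the first starts `overlap` characters before the
    previous chunk ended, so neighbouring chunks share a little context.
    """
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap
    return chunks

def split_by_page(pages: list[str], max_chars: int = 20_000, overlap: int = 100) -> list[str]:
    """Chunk each page separately so that no row mixes text from two pages."""
    rows = []
    for page in pages:
        rows.extend(split_into_chunks(page, max_chars, overlap))
    return rows

# Hypothetical document: one very long page and one short page.
pages = ["a" * 45_000, "short page"]
rows = split_by_page(pages)
print([len(row) for row in rows])  # [20000, 20000, 5200, 10]
```

The long page yields three rows because each new chunk restarts 100 characters before the previous one ended.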

Import Website

  • URL: starting address.
  • Content Selectors: e.g. body, article, #main. Extracts only the text within that selector.
  • Depth: how many levels of internal links to follow (e.g. 2 = homepage + links from those pages).
  • Max Links: limit of URLs per page (e.g. 50) to avoid infinite crawling.
  • Set Name, Description, Model, and Save.
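
How Depth and Max Links bound a crawl can be illustrated with a breadth-first traversal over an in-memory site. This is a sketch under assumptions: it treats Depth as the number of link levels followed from the starting URL and Max Links as a cap on links taken from each page, which may not match the platform's exact counting. `plan_crawl` and the `SITE` map are hypothetical.

```python
from collections import deque

def plan_crawl(start_url, links_on_page, depth=2, max_links=50):
    """Return the URLs that would be imported, breadth-first from start_url.

    links_on_page: function mapping a URL to the internal links found on it.
    depth: number of link levels followed from the starting page (assumed meaning).
    max_links: maximum number of links taken from any single page.
    """
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, level = queue.popleft()
        if level >= depth:
            continue  # pages at the last level are imported, but their links are not followed
        for link in links_on_page(url)[:max_links]:
            if link not in seen:
                seen.add(link)
                visited.append(link)
                queue.append((link, level + 1))
    return visited

# Hypothetical site structure used instead of real HTTP requests.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/install", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/install": ["/docs/install/linux"],
}

def site_links(url):
    return SITE.get(url, [])

print(plan_crawl("/", site_links, depth=1))
print(plan_crawl("/", site_links, depth=2))
print(plan_crawl("/", site_links, depth=2, max_links=1))
```

The `seen` set is what keeps pages that link back to each other from being crawled forever; the two limits only bound how far and how wide the crawl spreads.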

Slow Import

Bulk upload and web-scraping may take several minutes, as each file or page is split, vectorized, and sent for storage.

⚠️ Warning
If the site changes, the imported content does not update automatically. Reimport and synchronize to reflect changes.

Viewing and Editing

Click the table name to open the management interface:

Overview

  • Table Title and ✏️ icon to edit Name, Description, and Model.
  • Document (row) counter and model in use.

Action Bar

  • 🔁 Restore: undoes local changes and returns to the last synchronized state.
  • 🔍 Query: opens the semantic search modal (see Testing Semantic Search below).
  • ⬇️ Download: exports the entire table as CSV.
  • Import: add more files or websites without leaving the page.
  • 🔄 Synchronize: reprocesses all fragments and updates vectors.

Rows (Chunks)

  • Each row displays a text fragment (+ optional metadata).
  • 🗑️ Remove: deletes irrelevant fragments.
  • ✏️ Edit: manually adjust text or identification.
  • + Row: manually create new fragments.

Testing Semantic Search (Prototyping)

  1. Click Query.
  2. Enter your Query (question or term).
  3. Set Number of Results (N).
  4. Click Query and review:
    • Returned snippets with semantic distance (the lower, the closer).
    • Context: check if the agent will have enough information.

💡 Tips on Chunks and N

  • Maximum Chunk Size: ⚠️ Do not exceed 20,000 characters — embedding models have token limits and will fail to vectorize larger snippets.
  • Smaller Chunks: reduce token consumption but may lack context.
  • Larger Chunks: provide more context but cost more tokens.
  • N: using a low value (1–3) saves tokens; higher values improve coverage but increase cost.
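
The trade-off between chunk size and N can be made concrete with a back-of-the-envelope estimate of how many tokens the retrieved snippets add to the conversation. The ratio of 4 characters per token is a rough rule of thumb for English text, not a platform value, and `context_tokens` is a hypothetical helper.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb (assumption); real tokenizers vary by language and content

def context_tokens(chunk_chars: int, n: int) -> int:
    """Approximate tokens added to the context by n chunks of chunk_chars characters each."""
    return n * chunk_chars // CHARS_PER_TOKEN

for chunk_chars in (1_000, 5_000, 20_000):
    for n in (1, 3, 10):
        tokens = context_tokens(chunk_chars, n)
        print(f"chunk={chunk_chars:>6} chars, N={n:>2} -> ~{tokens:>6} tokens")
```

Under this assumption, three chunks at the 20,000-character maximum already add about 15,000 tokens per query, which is why smaller chunks with a modest N are often the cheaper starting point.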

Practical Example: “Vibe Criativa” FAQ

To illustrate the entire flow, use this FAQ CSV file (fictitious company Vibe Criativa):

🔗 Download the CSV: faq_vibe_criativa.csv

Step by Step

  1. Import the CSV

    • In Knowledge Base, click + Create Table, then Upload File.
    • Name: faq_vibe_criativa
    • Description: “Frequently asked questions about Vibe Criativa’s services”
    • Model: text-embedding-3-large
    • Upload faq_vibe_criativa.csv (the file downloaded above) and click Save.
  2. Prototype a Query

    • Click Query, enter contact, set 3 results, and confirm.
    • Check if the returned snippets match the contact information in the FAQ.

Tip: when using text-embedding-3-large, you’ll get more accurate answers for business questions, but vectorization and token cost will be higher. Adjust according to your volume and budget.

Querying via API

You can incorporate this query into any workflow in your system — CRM automations, Slack bots, webhook triggers, scheduled routines, dashboard integrations, etc. Just call the endpoint whenever you need data from your knowledge base.

Usage Example

Imagine a scheduled routine that sends an email with answers to the most frequently asked questions:

  1. Routine is executed
  2. Backend calls POST /v1/knowledge/faq_vibe_criativa/query for each question
  3. Email is sent with the answers from the returned JSON
typescript
// Semantic query via API
async function queryKB(
  tableName: string,
  query: string,
  n = 3
): Promise<any> {
  const res = await fetch(
    `https://api.sippulse.ai/v1/knowledge/${tableName}/query`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'api-key': process.env.SIPPULSE_API_KEY!
      },
      body: JSON.stringify({ query, n })
    }
  );
  if (!res.ok) throw new Error(`Error ${res.status}`);
  return res.json();
}

// Example usage
(async () => {
  const tableName = 'faq_vibe_criativa';
  const result = await queryKB(tableName, 'contact', 3);
  console.log(result);
})();
python
import os
import requests

def query_kb(table_name: str, query: str, n: int = 3) -> dict:
    """
    Semantic query via API.
    - table_name: knowledge table name
    - query: search text
    - n: number of results
    """
    url = f"https://api.sippulse.ai/v1/knowledge/{table_name}/query"
    resp = requests.post(
        url,
        headers={
            'Content-Type': 'application/json',
            'api-key': os.getenv('SIPPULSE_API_KEY')
        },
        json={'query': query, 'n': n},
        timeout=30  # fail instead of hanging if the API does not respond
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == '__main__':
    table_name = 'faq_vibe_criativa'
    result = query_kb(table_name, 'contact', 3)
    print(result)

Example Response

Expected response (simplified):

json
[
    {
        "content": "{\"question\":\"Does VibeCriativa have a WhatsApp support channel?\",\"answer\":\"Yes, our WhatsApp is (11) 98765-4321, available during business hours.\"}",
        "distance": 1.1357711969228341
    },
    {
        "content": "{\"question\":\"Does VibeCriativa have success stories available?\",\"answer\":\"Yes, we have success stories available on the website and can send examples by email upon request.\"}",
        "distance": 1.200524422602051
    },
    {
        "content": "{\"question\":\"How can I request a personalized quote?\",\"answer\":\"You can request a quote through the form on our website or by sending an email to orcamento@vibecriativa.com.\"}",
        "distance": 1.2048347495936975
    }
]
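
In the example response, each `content` value is itself a JSON-encoded string holding the original CSV row, so it needs a second decoding step before the answer can be used, for instance in the email routine described earlier. The sketch below assumes that shape, based only on this example; the distance threshold of 1.2 is an arbitrary illustration, and `decode_content` and `extract_answers` are hypothetical helpers.

```python
import json

def decode_content(content: str) -> dict:
    """Decode a row's content; fall back to plain text if it is not a JSON object."""
    try:
        row = json.loads(content)
    except json.JSONDecodeError:
        row = None
    return row if isinstance(row, dict) else {"text": content}

def extract_answers(results, max_distance=None):
    """Decode each result and optionally drop matches farther than max_distance."""
    answers = []
    for item in results:
        if max_distance is not None and item["distance"] > max_distance:
            continue
        answers.append({"distance": item["distance"], **decode_content(item["content"])})
    return answers

# First two results copied from the example response.
results = [
    {
        "content": '{"question":"Does VibeCriativa have a WhatsApp support channel?","answer":"Yes, our WhatsApp is (11) 98765-4321, available during business hours."}',
        "distance": 1.1357711969228341,
    },
    {
        "content": '{"question":"Does VibeCriativa have success stories available?","answer":"Yes, we have success stories available on the website and can send examples by email upon request."}',
        "distance": 1.200524422602051,
    },
]

for answer in extract_answers(results, max_distance=1.2):
    print(f"{answer['question']}\n  {answer['answer']}")
```

A sensible threshold depends on the embedding model and on your data, so it is worth calibrating it in the Query modal before hard-coding a value.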

Cost Control

  • Dashboard: monitor token usage and expenses per import/synchronization.
  • Auto-Sync: disable to avoid vectorization on every small adjustment.

Best Practices

  • Chunk size: chunks of at most 20,000 characters with a 100-character overlap usually work well.
  • Clear descriptions: help the agent filter the correct tables.
  • Deliberate synchronization: reserve Auto-Sync for bases that rarely change; for frequently edited bases, synchronize manually after a batch of edits.
  • Periodic validation: test key queries before releasing.
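
Periodic validation can be automated with a small regression check: a list of key queries, each paired with a phrase that should appear in the top results. This is a sketch only; `validate_queries` is a hypothetical helper, and `fake_query` stands in for a real call to the query endpoint so the example runs offline.

```python
def validate_queries(query_fn, checks, n=3):
    """Run each key query and report whether the expected phrase appears in the top n snippets.

    query_fn: function (query, n) -> list of results shaped like the API response.
    checks: list of (query, expected_phrase) pairs.
    """
    report = []
    for query, expected in checks:
        results = query_fn(query, n)
        found = any(expected.lower() in r["content"].lower() for r in results)
        report.append((query, expected, found))
    return report

# Offline stand-in for the real API call (hypothetical data).
def fake_query(query, n):
    data = {
        "contact": [
            {"content": "Yes, our WhatsApp is (11) 98765-4321.", "distance": 1.13},
        ],
    }
    return data.get(query, [])[:n]

checks = [
    ("contact", "WhatsApp"),
    ("refund", "refund policy"),
]

report = validate_queries(fake_query, checks)
for query, expected, found in report:
    status = "OK" if found else "MISSING"
    print(f"[{status}] query={query!r} expected phrase={expected!r}")
```

To run it against a real table, pass a function that wraps the API call instead, for example `lambda q, n: query_kb('faq_vibe_criativa', q, n)` using the Python function from Querying via API.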

Frequently Asked Questions

Can I group multiple files into a single table?

Yes — import as many as you want and synchronize everything at once.


How do I know if a chunk is too large?

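If the snippet does not fit in the LLM’s context window, reduce the chunk size. Also keep every chunk under the 20,000-character limit so the embedding model can vectorize it.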


What happens if a site changes?

Imported content does not update automatically: reimport and synchronize.


What is the impact of Auto-Sync?

Each edit triggers immediate vectorization, generating extra costs.