Knowledge Base
The Knowledge Base allows your agent to access documents, tables, and internal websites using RAG (Retrieval-Augmented Generation). Instead of relying solely on prompts, the agent performs semantic searches on vectors representing your content, retrieving relevant snippets and inserting them into the conversation context.
What is RAG?
RAG combines document retrieval (via semantic similarity) with text generation. You store your materials as vectors; when a question is received, the agent searches for the most similar vectors and, using those snippets, generates more accurate and grounded answers.
🔗 Learn more about RAG (Wikipedia)
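The retrieval step can be sketched in a few lines. This is only an illustration, not the platform's implementation: the three-dimensional vectors are hand-picked stand-ins for real embeddings, and cosine distance is an assumption (this page does not state which metric is used).

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; lower means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_n(query_vec, rows, n=3):
    """Rank stored rows by distance to the query vector and keep the n closest."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0])
    return scored[:n]

# Toy "embeddings" (hypothetical values, chosen by hand for illustration).
ROWS = [
    ("Support is available by WhatsApp during business hours.", [0.9, 0.1, 0.0]),
    ("Quotes can be requested through the website form.", [0.1, 0.9, 0.1]),
    ("Success stories are published on the website.", [0.0, 0.2, 0.9]),
]

query_vector = [0.8, 0.2, 0.1]  # pretend embedding of "how do I contact you?"
results = top_n(query_vector, ROWS, n=2)
for distance, text in results:
    print(f"{distance:.3f}  {text}")
```

The snippets returned by `top_n` are what would be placed in the conversation context before the model generates its answer.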
Project Scoping
Knowledge bases belong to the active project. When you create a table, it is associated with the project currently selected in the Project Switcher. Members can only see and manage knowledge bases within their assigned projects.
The knowledge base list displays a Created by column showing which team member created each table. Non-admin users with the knowledge:read:own permission can filter the list to show only their own tables using the Created by me filter.
TIP
If you need the same knowledge base across multiple projects, you can export it (CSV) and re-import it into another project.
Usage Examples
- Internal manuals and policies: PDFs, DOCX, Markdown.
- FAQs and support bases: CSV or JSON exported from legacy systems.
- Metrics dashboards: spreadsheets with indicators the agent can query.
Available Embedding Models
The choice of embedding model directly impacts accuracy, cost, and vectorization time:
Large Models
text-embedding-3-large
- Pros: captures complex semantic nuances, ideal for long or highly technical texts.
- Cons: high cost (R$1.20/1M tokens) and slower vectorization.
Medium Models
bge-en-icl
- Pros: good balance between accuracy and cost (R$0.185/1M tokens).
- Cons: slight loss of accuracy in extremely specific cases.
Small Models
text-embedding-3-small
- Pros: low cost (R$0.185/1M tokens) and fast vectorization, ideal for very large volumes.
- Cons: lower fidelity in detailed semantic similarity.
| Model | Cost | Recommended Use |
|---|---|---|
| text-embedding-3-large | R$1.20 / 1M tok. | Long documents, high semantic sensitivity |
| bge-en-icl | R$0.185 / 1M tok. | General applications, good cost/accuracy trade-off |
| text-embedding-3-small | R$0.185 / 1M tok. | Huge volumes, minimal cost |
Attention: changing the model recreates all table vectors, generating additional costs. Check usage in the Dashboard after synchronization.
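Because a model change re-vectorizes the whole table, a rough estimate before switching can help. The sketch below uses the prices from the table above; the 4-characters-per-token ratio is an assumption (a common rule of thumb that varies by language and tokenizer), so treat the result as an order of magnitude and confirm actual usage in the Dashboard.

```python
# Prices in R$ per 1M tokens, copied from the table above.
PRICE_PER_MILLION = {
    "text-embedding-3-large": 1.20,
    "bge-en-icl": 0.185,
    "text-embedding-3-small": 0.185,
}

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the platform's tokenizer

def estimate_vectorization_cost(total_chars, model):
    """Estimate the cost (in R$) of vectorizing total_chars characters once."""
    tokens = total_chars / CHARS_PER_TOKEN
    return tokens / 1_000_000 * PRICE_PER_MILLION[model]

# Example: a table holding 8 million characters of text.
for model in PRICE_PER_MILLION:
    cost = estimate_vectorization_cost(8_000_000, model)
    print(f"{model}: R$ {cost:.2f}")
```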
Creating a Table
Why “table” and not “document”?
Each table is a set of vectorized rows (or “chunks”); think of it as an internal CSV:
- Flexibility: you can merge multiple files, spreadsheets, and pages into a single source.
- Granularity: each row is a text fragment, so the agent avoids retrieving irrelevant snippets and does not have to read an entire document to find an answer.
- Editing: edit, remove, or add individual rows without reimporting everything.
- Visualization: The table representation allows you to intuitively understand how data will be queried and used by the agent.
Click + Create Table (🔝 top right corner) and choose:
Empty Table
- Name: unique identifier (snake_case).
- Description: guides the agent when deciding to use this table.
- Embedding Model: select one of the models above.
- Save: the table appears empty; proceed to import or insert manually.
Upload File
- Formats: PDF, DOCX, TXT, MD, JSON, CSV.
- Set Name, Description, and Model.
- Upload or drag the file.
- Split by page (applies only to PDF and DOCX): splits the document by page into chunks of up to 20,000 characters, with a 100-character overlap by default.
- Save: populates the table with these chunks.
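To make the chunk size and overlap settings concrete, here is a minimal character-based splitter. It is a sketch under assumptions, not the platform's splitter: it ignores page boundaries and simply shows how a 20,000-character limit and a 100-character overlap interact.

```python
def split_with_overlap(text, max_chars=20_000, overlap=100):
    """Split text into chunks of at most max_chars characters, each sharing
    `overlap` characters with the previous chunk."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so consecutive chunks overlap
    return chunks

sample = "x" * 45_000
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

The overlap repeats the tail of one chunk at the head of the next, which reduces the chance that a sentence cut at a boundary is lost to both chunks.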
Slow Import
Bulk file uploads may take several minutes, as each file is split, vectorized, and sent for storage.
Viewing and Editing
Click the table name to open the management interface:
Overview
- Table Title and ✏️ icon to edit Name, Description, and Model.
- Document (row) counter and model in use.
Action Bar
- 🔁 Restore: undoes local changes and returns to the last synchronized state.
- 🔍 Query: opens the semantic search modal (see Testing Semantic Search below).
- ⬇️ Download: exports the entire table as CSV.
- ➕ Import: add more files without leaving the page.
- 🔄 Synchronize: reprocesses all fragments and updates vectors.
Rows (Chunks)
- Each row displays a text fragment (+ optional metadata).
- 🗑️ Remove: deletes irrelevant fragments.
- ✏️ Edit: manually adjust text or identification.
- ➕ + Row: manually create new fragments.
Testing Semantic Search (Prototyping)
- Click Query.
- Enter your Query (question or term).
- Set Number of Results (N).
- Click Query and review:
  - Returned snippets with semantic distance (the lower, the closer).
  - Context: check if the agent will have enough information.
💡 Tips on Chunks and N
- Maximum Chunk Size: ⚠️ Do not exceed 20,000 characters — embedding models have token limits and will fail to vectorize larger snippets.
- Smaller Chunks: reduce token consumption but may lack context.
- Larger Chunks: provide more context but cost more tokens.
- N: using a low value (1–3) saves tokens; higher values improve coverage but increase cost.
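The trade-off between chunk size and N is simple arithmetic: the context added to each answer grows with their product. The sketch below assumes roughly 4 characters per token, which is a heuristic and not the platform's tokenizer.

```python
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, varies by language and model

def context_tokens(n_results, avg_chunk_chars):
    """Approximate tokens added to the prompt by n_results retrieved chunks."""
    return n_results * avg_chunk_chars // CHARS_PER_TOKEN

for n in (1, 3, 10):
    for size in (1_000, 20_000):
        tokens = context_tokens(n, size)
        print(f"N={n:>2}, chunk={size:>6} chars -> ~{tokens:>6} tokens")
```

Under this assumption, three maximum-size chunks already add about 15,000 tokens to every answer, which is why smaller chunks combined with a moderate N are often the cheaper option.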
Practical Example: “Vibe Criativa” FAQ
To illustrate the entire flow, use this FAQ CSV file (fictitious company Vibe Criativa):
🔗 Download the CSV: faq_vibe_criativa.csv
Step by Step
Import the CSV
- In Knowledge Base → + Create Table → Upload File.
- Name: faq_vibe_criativa
- Description: “Frequently asked questions about Vibe Criativa’s services”
- Model: text-embedding-3-large
- Upload faq_vibe_criativa.csv and click Save.
Prototype a Query
- Click Query, enter "contact" as the query, set 3 results, and confirm.
- Check if the returned snippets match the contact information in the FAQ.
Tip: when using text-embedding-3-large, you'll get more accurate answers for business questions, but vectorization and token cost will be higher. Adjust according to your volume and budget.
Integrating with Agents
The most powerful way to use the Knowledge Base is to connect it directly to an Agent. When configured as a tool, the agent can automatically query the knowledge base during a conversation, retrieving relevant information without any manual intervention.
Adding a Knowledge Base Tool
- Navigate to Agents and select or create an agent
- Go to the Tools tab
- Click + New Tool
- Select Knowledge Base (RAG) as the module
- Configure the parameters:
| Parameter | Description |
|---|---|
| Knowledge Base | Select the table you want to connect |
| Number of Results | How many snippets to return (1–20). More results = more context, but higher cost |
| Additional Instructions | Guidance for the agent on when and how to use this tool |
- Click Save
How the Agent Uses the Knowledge Base
When a user asks a question:
- The agent evaluates whether the question can be answered using the knowledge base
- If so, it formulates an optimized semantic query
- The query returns the N most relevant snippets
- The agent uses those snippets as context to formulate an accurate response
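The four steps above can be sketched as a small function. Everything here is hypothetical: the callables stand in for the LLM and the knowledge base, and the real agent makes these decisions internally rather than through code you write.

```python
def answer_with_kb(question, should_use_kb, rewrite_query, search, generate, n=3):
    """Run the four steps described above using caller-supplied callables."""
    if not should_use_kb(question):        # step 1: decide whether to use the tool
        return generate(question, [])
    query = rewrite_query(question)        # step 2: build a search-friendly query
    snippets = search(query, n)            # step 3: retrieve the N closest snippets
    return generate(question, snippets)    # step 4: answer using the snippets

# Stand-ins for the LLM and the knowledge base (all hypothetical).
def should_use_kb(question):
    return "hours" in question.lower()

def rewrite_query(question):
    return "business hours opening times support"

def search(query, n):
    kb = ["Support is open Monday to Friday.", "We are closed on holidays."]
    return kb[:n]

def generate(question, snippets):
    context = " ".join(snippets) if snippets else "(no context)"
    return f"Q: {question} | context: {context}"

reply = answer_with_kb("What are your business hours?",
                       should_use_kb, rewrite_query, search, generate)
print(reply)
```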
User: "What are your business hours?"
↓
Agent formulates query: "business hours opening times support"
↓
Knowledge Base returns: 3 snippets about operating hours
↓
Agent responds: "Our support hours are..."
Writing Effective Additional Instructions
Use the Additional Instructions field to guide the agent:
Instruction Examples
- "Use this tool whenever the user asks about company policies, procedures, or internal information."
- "Consult this knowledge base before answering questions about products, pricing, or availability."
- "If the information is not found in the knowledge base, let the user know that you do not have that specific information."
Multiple Knowledge Bases per Agent
An agent can have multiple knowledge base tools configured, each connected to a different table:
| Table | Recommended Use |
|---|---|
| general_faq | Frequently asked questions about the company |
| product_catalog | Product and pricing information |
| hr_policies | Internal HR policies (for HR agents) |
| technical_manual | Technical documentation (for support agents) |
The agent automatically selects which knowledge base to query based on the context of the question.
Integration Best Practices
- Clear table descriptions: The table description helps the agent decide when to use it
- Well-sized chunks: Chunks that are too large consume tokens; too small and they lose context
- Balanced N: Start with 3–5 results and adjust as needed
- Test in the Playground: Validate responses before deploying to production
- Monitor costs: Every query consumes embedding and LLM tokens
Querying via API
You can incorporate this query into any workflow in your system — CRM automations, Slack bots, webhook triggers, scheduled routines, dashboard integrations, etc. Just call the endpoint whenever you need data from your knowledge base.
Usage Example
Imagine a scheduled routine that sends an email with answers to the most frequently asked questions:
- Routine is executed
- Backend calls POST /v1/knowledge/faq_vibe_criativa/query for each question
- Email is sent with the answers from the returned JSON
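The routine above could be assembled roughly as follows. This is a sketch under assumptions: `query_fn` is a hypothetical callable wrapping the API request documented on this page, and each result is assumed to have the `content` and `distance` fields shown in the example response, with `content` holding a JSON-encoded row when the table was imported from CSV.

```python
import json

def build_faq_email(questions, query_fn, n=1):
    """Query the knowledge base for each question and format a plain-text email body."""
    lines = []
    for question in questions:
        results = query_fn(question, n)
        if not results:
            lines.append(f"{question}\n  (no answer found)")
            continue
        best = min(results, key=lambda r: r["distance"])  # lowest distance = closest match
        try:
            row = json.loads(best["content"])  # CSV rows are assumed to be JSON-encoded
            answer = row.get("answer", best["content"])
        except (json.JSONDecodeError, AttributeError):
            answer = best["content"]           # plain-text chunk: use it as-is
        lines.append(f"{question}\n  {answer}")
    return "\n\n".join(lines)

# Stub standing in for the real API call (hypothetical data).
def fake_query(question, n):
    return [{
        "content": json.dumps({"question": "How can I request a quote?",
                               "answer": "Use the form on our website."}),
        "distance": 1.2,
    }][:n]

body = build_faq_email(["How do I get a quote?"], fake_query)
print(body)
```

Passing the query function as a parameter keeps the formatting logic testable without network access; in production it would wrap the HTTP request itself.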
// Semantic query via API
async function queryKB(
  tableName: string,
  query: string,
  n = 3
): Promise<any> {
  const res = await fetch(
    `https://api.sippulse.ai/v1/knowledge/${tableName}/query`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'api-key': process.env.SIPPULSE_API_KEY!
      },
      body: JSON.stringify({ query, n })
    }
  );
  if (!res.ok) throw new Error(`Error ${res.status}`);
  return res.json();
}
// Example usage
(async () => {
  const tableName = 'faq_vibe_criativa';
  const result = await queryKB(tableName, 'contact', 3);
  console.log(result);
})();

import os
import requests

def query_kb(table_name: str, query: str, n: int = 3) -> list:
    """
    Semantic query via API.
    - table_name: knowledge table name
    - query: search text
    - n: number of results
    """
    url = f"https://api.sippulse.ai/v1/knowledge/{table_name}/query"
    resp = requests.post(
        url,
        headers={
            'Content-Type': 'application/json',
            'api-key': os.getenv('SIPPULSE_API_KEY')
        },
        json={'query': query, 'n': n}
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == '__main__':
    table_name = 'faq_vibe_criativa'
    result = query_kb(table_name, 'contact', 3)
    print(result)
Example Response
// Expected response (simplified)
[
  {
    "content": "{\"question\":\"Does VibeCriativa have a WhatsApp support channel?\",\"answer\":\"Yes, our WhatsApp is (11) 98765-4321, available during business hours.\"}",
    "distance": 1.1357711969228341
  },
  {
    "content": "{\"question\":\"Does VibeCriativa have success stories available?\",\"answer\":\"Yes, we have success stories available on the website and can send examples by email upon request.\"}",
    "distance": 1.200524422602051
  },
  {
    "content": "{\"question\":\"How can I request a personalized quote?\",\"answer\":\"You can request a quote through the form on our website or by sending an email to orcamento@vibecriativa.com.\"}",
    "distance": 1.2048347495936975
  }
]
Cost Control
- Dashboard: monitor token usage and expenses per import/synchronization.
- Auto-Sync: disable to avoid vectorization on every small adjustment.
Best Practices
- Chunk size: up to 20,000 characters (the maximum) with a 100-character overlap usually works well.
- Clear descriptions: help the agent filter the correct tables.
- Deliberate synchronization: enable Auto-Sync only for bases that change rarely; for frequently edited bases, synchronize manually after a batch of edits.
- Periodic validation: test key queries before releasing.
Frequently Asked Questions
Can I group multiple files into a single table?
Yes — import as many as you want and synchronize everything at once.
How do I know if a chunk is too large?
If a snippet does not fit in the LLM’s context window, reduce the chunk size. Chunks must also stay within the 20,000-character limit, or vectorization will fail.
What is the impact of Auto-Sync?
Each edit triggers immediate vectorization, generating extra costs.
