[Project] Agent Knowledge v1 - Budibase AI first approach #18294

# Knowledge (v1) – Product Requirements

## Goal
Enable Budibase agents to answer questions using company knowledge stored in common systems, with the simplest possible setup for SME teams.

Users should be able to connect their knowledge sources, select what the agent can access, and immediately ask questions grounded in their documents.

The system should be simple to configure, reliable, and transparent, without exposing complex AI or retrieval infrastructure.

---

## Problem

Operations teams store critical knowledge across multiple systems such as:

- SharePoint
- Google Drive
- Confluence
- PDFs and documents

Today:

- Agents do not have access to this knowledge.
- Users must manually copy information into instructions or prompts.
- There is no reliable way for agents to reference internal documents.

Building a full custom RAG infrastructure (embeddings, vector DB, retrieval pipelines) would add significant engineering complexity and slow delivery.

We need a simple knowledge layer that allows agents to retrieve relevant information from documents while keeping the system easy to operate and maintain.

---

## Solution

Provide a Knowledge system for agents built on:

### 1. Gemini Flash for reasoning
Agents use Gemini Flash to interpret questions and generate answers.

### 2. Google File Search for retrieval
Google File Search will manage:

- document chunking
- embeddings
- indexing
- semantic retrieval

Budibase uploads synced documents to File Search and retrieves relevant context during agent responses.

### 3. First-party knowledge connectors

Support three initial connectors:

- SharePoint
- Google Drive
- Confluence

Each connector follows the same pattern:

**Connection**
- OAuth connection to the system

**Scope**
- User selects which folders, spaces, or libraries to include

**Sync**
- Manual sync button
- Automatic refresh on a schedule

**Access**
- User selects which agents can access the knowledge source
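
The shared connector pattern above could be captured in one record per knowledge source. The following is a minimal sketch, not a committed schema: the class and field names (`KnowledgeSource`, `connection_id`, `refresh_hours`, and so on) and the example values are hypothetical, and the 24-hour default is only one of the intervals under discussion.

```python
from dataclasses import dataclass, field
from enum import Enum


class Provider(Enum):
    """The three initial connectors named in this document."""
    SHAREPOINT = "sharepoint"
    GOOGLE_DRIVE = "google_drive"
    CONFLUENCE = "confluence"


@dataclass
class KnowledgeSource:
    """One connected source. All field names are hypothetical."""
    provider: Provider
    connection_id: str        # Connection: reference to the stored OAuth connection
    scope: list[str]          # Scope: selected folders, spaces, or libraries
    refresh_hours: int = 24   # Sync: scheduled refresh interval (assumed value)
    agent_ids: set[str] = field(default_factory=set)  # Access: agents allowed to use it


# Example values are made up for illustration.
hr_policies = KnowledgeSource(
    provider=Provider.SHAREPOINT,
    connection_id="conn_123",
    scope=["HR/Policies"],
    agent_ids={"hr_agent"},
)
```

Keeping all four concerns on one record means every connector can share the same settings UI, with only the scope picker varying by provider.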

### 4. Manual + scheduled sync (not live sync)

For v1:

- user triggers **Sync now**
- automatic refresh runs on a fixed schedule (interval to be decided: every 12 or 24 hours)
- only changed files are reindexed

This avoids building complex real-time sync infrastructure.
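
Reindexing only changed files amounts to comparing what the source system reports against what was indexed last time. The sketch below assumes each file exposes some version token (a modified timestamp, ETag, or content hash); the function name `plan_reindex` is hypothetical, and treating files that disappeared upstream as deletions is an assumption this document does not spell out.

```python
def plan_reindex(
    remote: dict[str, str], indexed: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare file id -> version token maps and return (to_upload, to_delete)."""
    # New files and files whose version token changed need (re)uploading.
    to_upload = sorted(fid for fid, ver in remote.items() if indexed.get(fid) != ver)
    # Assumption: files no longer present upstream are removed from the index.
    to_delete = sorted(fid for fid in indexed if fid not in remote)
    return to_upload, to_delete


# "a" changed, "b" is unchanged, "c" was removed upstream, "d" is new.
remote = {"a": "v2", "b": "v1", "d": "v1"}
indexed = {"a": "v1", "b": "v1", "c": "v1"}
to_upload, to_delete = plan_reindex(remote, indexed)
```

The same function can serve both **Sync now** and the scheduled refresh, since neither needs to know why a file changed.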

---

## No-gos

The following will not be included in v1:

- Live or real-time document sync
- Webhook-driven updates
- Fine-grained upstream permission mirroring
- Custom vector databases
- Retrieval configuration such as top-k, chunk size, or embeddings
- User-facing search UI
- Per-document access control

The system should prioritize simplicity over completeness.

---

## Prior art

Examples of similar systems:

- OpenAI File Search – Managed RAG layer for assistants
- Glean – Enterprise knowledge retrieval platform
- Dust – AI assistants with connected knowledge sources
- ChatGPT Enterprise – Connectors for Google Drive, SharePoint, etc.

Most modern AI platforms follow the same pattern:

Knowledge sources  
↓  
Connector + sync  
↓  
Managed retrieval system  
↓  
LLM reasoning  

Budibase follows this model while integrating knowledge directly into agents.

---

## User flow

### 1. Add knowledge source

User navigates to:

Agents → Knowledge → Add source

Select provider:

- SharePoint
- Google Drive
- Confluence
- Upload files

---

### 2. Connect account

User completes OAuth connection.

Example:

Connect SharePoint  
→ Microsoft login  
→ grant access  

Connection is stored for the workspace.

---

### 3. Choose scope

User selects what the agent can access.

**SharePoint**
- site
- library
- folder

**Google Drive**
- specific folders

**Confluence**
- spaces

---

### 4. Sync knowledge

User runs:

**Sync now**

System:

- fetches documents
- uploads them to File Search
- indexes them

UI shows:

- Files indexed
- Last synced
- Sync status
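
The three values shown in the UI map naturally onto a small status record updated by each sync run. This is a sketch under assumptions: `SyncStatus`, `SyncState`, and `run_sync` are hypothetical names, the upload step is passed in as a plain callable rather than a real File Search call, and stopping at the first failed upload is one possible policy, not a decided one.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Callable, Iterable, Optional


class SyncState(Enum):
    IDLE = "idle"
    SYNCING = "syncing"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class SyncStatus:
    files_indexed: int = 0                  # "Files indexed"
    last_synced: Optional[datetime] = None  # "Last synced"
    state: SyncState = SyncState.IDLE       # "Sync status"
    error: Optional[str] = None             # reason shown when a sync fails


def run_sync(
    documents: Iterable[str], upload: Callable[[str], None], now: datetime
) -> SyncStatus:
    """Upload each document and record the outcome for display in the UI."""
    status = SyncStatus(state=SyncState.SYNCING)
    try:
        for doc in documents:
            upload(doc)  # stand-in for the real upload to the retrieval service
            status.files_indexed += 1
    except Exception as exc:
        # Assumed policy: stop at the first failure and surface the message.
        status.state = SyncState.FAILED
        status.error = str(exc)
        return status
    status.state = SyncState.SUCCEEDED
    status.last_synced = now
    return status


uploaded: list[str] = []
ok = run_sync(["a.pdf", "b.docx"], uploaded.append, datetime(2025, 1, 1, tzinfo=timezone.utc))


def failing_upload(doc: str) -> None:
    raise RuntimeError("upload rejected")


failed = run_sync(["a.pdf"], failing_upload, datetime(2025, 1, 1, tzinfo=timezone.utc))
```

Recording the error text alongside the state is what lets the UI explain a failed sync instead of only flagging it.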

---

### 5. Grant agent access

User selects which agents can use the knowledge source.

Example:

HR Agent → HR policies  
IT Agent → IT runbooks  
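
Because access is granted per source rather than per document, the check at query time reduces to a lookup. The sketch below uses the example mapping above; the structure and the function name `sources_for_agent` are hypothetical.

```python
# Source name -> agents allowed to use it (mirrors the example above).
SOURCE_ACCESS: dict[str, set[str]] = {
    "HR policies": {"HR Agent"},
    "IT runbooks": {"IT Agent"},
}


def sources_for_agent(agent: str, access: dict[str, set[str]]) -> list[str]:
    """Return the knowledge sources an agent may retrieve from."""
    return sorted(name for name, agents in access.items() if agent in agents)


hr_sources = sources_for_agent("HR Agent", SOURCE_ACCESS)
unknown_sources = sources_for_agent("Sales Agent", SOURCE_ACCESS)
```

An agent with no granted sources gets an empty list, so retrieval can be skipped entirely for it.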

---

### 6. Agent answers questions

User asks:

What is our vacation policy?

System flow:

User question  
↓  
Agent query  
↓  
Google File Search retrieves relevant document chunks  
↓  
Gemini Flash generates answer  
↓  
Agent responds with grounded information  
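
The flow above can be sketched as a single function with retrieval and generation injected as callables. This keeps the example independent of any real SDK: `retrieve` stands in for Google File Search and `generate` for Gemini Flash, and neither reflects an actual API signature. The prompt wording, the fallback message when nothing is retrieved, and the stub document text are all assumptions made for illustration.

```python
from typing import Callable, Sequence

# Hypothetical signatures for the two external services.
Retriever = Callable[[str, Sequence[str]], list[str]]  # (question, source ids) -> chunks
Generator = Callable[[str], str]                       # prompt -> answer text

NO_CONTEXT_REPLY = "I could not find this in the connected knowledge sources."


def answer(
    question: str, source_ids: Sequence[str], retrieve: Retriever, generate: Generator
) -> str:
    """Retrieve relevant chunks, then ask the model to answer from them only."""
    chunks = retrieve(question, source_ids)
    if not chunks:
        # Assumed behavior: decline rather than answer without grounding.
        return NO_CONTEXT_REPLY
    context = "\n\n".join(chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)


# Stubs with made-up content, standing in for the real services.
def fake_retrieve(question: str, source_ids: Sequence[str]) -> list[str]:
    return ["Employees receive 25 days of vacation per year."]


def fake_generate(prompt: str) -> str:
    return "Stub answer based on: " + prompt


reply = answer("What is our vacation policy?", ["HR policies"], fake_retrieve, fake_generate)
empty_reply = answer("What is our vacation policy?", [], lambda q, s: [], fake_generate)
```

Passing only the sources granted to the asking agent as `source_ids` is where the access step and the retrieval step meet.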

---

## Success criteria

- Users can connect a knowledge source in under 3 minutes
- Agents can answer questions grounded in documents
- Sync process is transparent and reliable
- No AI infrastructure configuration required from users
