[feature] cross-pipe indexes SDK: enabling context-aware AI tools #1311

### context

AI is only as good as the context you provide it, regardless of its architecture, weights, or training: **gpt-42 will be as good as the context you provide it**.

screenpipe builds layers of abstractions on top of raw recordings. pipes create valuable contextual data that could be indexed and queried by other pipes, similar to how AI assistants use tools to access different knowledge bases. this proposal aims to standardize how pipes share and consume these contextual indexes.

![Image](https://github.com/user-attachments/assets/a5892a8c-01ab-4827-8754-e6e8d4f3eaaf)


### problem
- valuable contextual data is siloed within individual pipes
- no standardized way to index and query cross-pipe data
- missing opportunities for AI-driven context enrichment
- current sharing methods are hacky
- pipes reinvent the wheel for common patterns

### proposed solution
create an indexes SDK that allows pipes to:
1. publish local indexes (abstracted data)
2. let AI autonomously query other pipes' indexes via tools / best AI engineering practices
3. subscribe to index updates
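
one way to read item 2 is that every published index gets exposed to the model as a callable tool. the sketch below assumes a generic function-calling style descriptor and is not tied to any provider; `ToolDefinition`, `QueryFn`, `toToolDefinition` and `dispatchToolCall` are hypothetical names that do not exist in screenpipe today.

```typescript
// hypothetical shapes: none of these names exist in screenpipe today
type ToolDefinition = {
  name: string
  description: string
  parameters: {
    type: 'object'
    properties: Record<string, { type: string; description: string }>
  }
}

type QueryFn = (indexName: string, filters: Record<string, unknown>) => unknown[]

// turn an index name into a tool descriptor the model can call
function toToolDefinition(indexName: string, description: string): ToolDefinition {
  return {
    // many function-calling APIs reject dots in tool names,
    // so 'task.item' becomes 'query_task_item'
    name: `query_${indexName.replace(/\./g, '_')}`,
    description,
    parameters: {
      type: 'object',
      properties: {
        tags: { type: 'array', description: 'only return entries carrying all of these tags' },
        fullText: { type: 'string', description: 'free-text search string' }
      }
    }
  }
}

// route a tool call emitted by the model back to the matching index
function dispatchToolCall(
  toolName: string,
  args: Record<string, unknown>,
  indexNames: string[],
  query: QueryFn
): unknown[] {
  const indexName = indexNames.find(n => toToolDefinition(n, '').name === toolName)
  if (!indexName) throw new Error(`unknown tool: ${toolName}`)
  return query(indexName, args)
}
```

the pipe would hand the list of `ToolDefinition`s to the model, then pass whatever call comes back through `dispatchToolCall`, so the model decides which index to consult rather than the pipe author hard-coding it.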

### core indexes examples

```typescript
// common index types that pipes can publish and consume
type IndexTypes = {
  // activity patterns
  'activity.summary': {
    interval: '5min' | '15min' | '1hour',
    timestamp: number,
    data: {
      tags: string[],
      summary: string,
      apps: string[],
      focus_level: number // 0-1
    }
  },
  
  // knowledge/notes
  'knowledge.chunk': {
    timestamp: number,
    data: {
      content: string,
      tags: string[],
      source: string,
      type: 'note' | 'document' | 'chat' | 'email'
    }
  },

  // communication style
  'communication.style': {
    timestamp: number,
    data: {
      tone: string[],  // ['formal', 'casual', 'technical']
      common_phrases: string[],
      writing_patterns: {
        avg_sentence_length: number,
        vocabulary_level: string
      }
    }
  },

  // task context
  'task.item': {
    timestamp: number,
    data: {
      title: string,
      status: 'todo' | 'in_progress' | 'done',
      priority: number,
      context: string,
      source: string,
      tags: string[] // the example pipes below filter tasks by tags
    }
  }
}
```
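
because these entries cross pipe boundaries, a consumer can't rely on compile-time types alone. below is a minimal, dependency-free sketch of a runtime guard for the `task.item` fields listed above. a real SDK might use a schema library instead; `TaskItem`, `isRecord` and `isTaskItem` are illustrative names, not part of any existing API.

```typescript
// mirrors the 'task.item' fields listed above; names are illustrative
type TaskItem = {
  timestamp: number
  data: {
    title: string
    status: 'todo' | 'in_progress' | 'done'
    priority: number
    context: string
    source: string
  }
}

const TASK_STATUSES: string[] = ['todo', 'in_progress', 'done']

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// true only if `value` has every checked field with the expected type
function isTaskItem(value: unknown): value is TaskItem {
  if (!isRecord(value) || typeof value.timestamp !== 'number') return false
  const data = value.data
  if (!isRecord(data)) return false
  const status = data.status
  return (
    typeof data.title === 'string' &&
    typeof status === 'string' &&
    TASK_STATUSES.indexOf(status) !== -1 &&
    typeof data.priority === 'number' &&
    typeof data.context === 'string' &&
    typeof data.source === 'string'
  )
}
```

a consuming pipe could run query results through `isTaskItem` and drop (or log) anything that fails, instead of crashing on a malformed entry from another pipe.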

### example pipes & integrations

1. **engineering assistant pipe**
```typescript
const engineeringPipe = {
  async suggestIssueComment(issueUrl: string) {
    // get relevant technical context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical', 'architecture']
    })
    
    // get user's communication style
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastMonth
    })
    
    // get related tasks
    const tasks = await pipe.indexes.query('task.item', {
      status: 'in_progress',
      tags: ['engineering']
    })
    
    return generateTechnicalComment(issueUrl, {
      context: techContext,
      style,
      relatedTasks: tasks
    })
  }
}
```

2. **task extraction pipe**
```typescript
const taskPipe = {
  async extractTasks() {
    // analyze recent activity
    const activities = await pipe.indexes.query('activity.summary', {
      timeRange: lastHour
    })
    
    // get communication context
    const communications = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastHour,
      type: ['chat', 'email']
    })
    
    const tasks = identifyTasks(activities, communications)
    
    // publish new tasks
    await pipe.indexes.publish('task.item', tasks.map(t => ({
      timestamp: Date.now(),
      data: t
    })))
  }
}
```

3. **sales assistant pipe**
```typescript
const salesPipe = {
  async enhanceSalesCall(transcript: string) {
    // get customer interaction history
    const history = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['customer', 'sales']
    })
    
    // get product knowledge
    const productKnowledge = await pipe.indexes.query('knowledge.chunk', {
      tags: ['product', 'features', 'pricing']
    })
    
    // get communication patterns
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastWeek
    })
    
    return generateSalesInsights(transcript, {
      history,
      productKnowledge,
      style
    })
  }
}
```

4. **linear.app integration pipe**
```typescript
const linearPipe = {
  async enhanceTicket(ticketId: string) {
    // get engineering context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical']
    })
    
    // get related tasks
    const tasks = await pipe.indexes.query('task.item', {
      tags: ['engineering']
    })
    
    // get team activity patterns
    const teamActivity = await pipe.indexes.query('activity.summary', {
      timeRange: lastDay,
      tags: ['engineering']
    })
    
    return generateTicketContext(ticketId, {
      techContext,
      relatedTasks: tasks,
      teamActivity
    })
  }
}
```

5. **meeting summarizer pipe**
```typescript
const meetingPipe = {
  async generateSummary(meetingId: string) {
    // get participant context
    const participants = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['profile', 'background']
    })
    
    // get project context
    const projectContext = await pipe.indexes.query('knowledge.chunk', {
      tags: ['project', 'objectives']
    })
    
    // get action items
    const tasks = await pipe.indexes.query('task.item', {
      status: 'todo'
    })
    
    return generateMeetingSummary(meetingId, {
      participants,
      projectContext,
      pendingTasks: tasks
    })
  }
}
```
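
the examples above pass `lastHour`, `lastDay`, `lastWeek` and `lastMonth` without defining them. here is one possible sketch, assuming a `TimeRange` is a closed interval of epoch milliseconds (to match the `timestamp: number` fields); that shape, and the helper names `lastMs` and `inRange`, are assumptions rather than part of the proposal.

```typescript
// assumption: a TimeRange is a closed interval of epoch milliseconds
type TimeRange = { start: number; end: number }

const HOUR_MS = 60 * 60 * 1000
const DAY_MS = 24 * HOUR_MS

// build a range covering the `durationMs` leading up to `now`
function lastMs(durationMs: number, now: number = Date.now()): TimeRange {
  return { start: now - durationMs, end: now }
}

// functions rather than constants, so a long-running pipe
// never reuses a window computed at startup
const lastHour = (): TimeRange => lastMs(HOUR_MS)
const lastDay = (): TimeRange => lastMs(DAY_MS)
const lastWeek = (): TimeRange => lastMs(7 * DAY_MS)
const lastMonth = (): TimeRange => lastMs(30 * DAY_MS) // approximated as 30 days

// true if `timestamp` falls inside the range, bounds included
function inRange(timestamp: number, range: TimeRange): boolean {
  return timestamp >= range.start && timestamp <= range.end
}
```

with helpers like these, a query would read `timeRange: lastWeek()`.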

### technical considerations

- local sqlite storage with efficient indexing
- standardized index schemas per pipe type
- real-time pub/sub for index updates
- typescript-first with zod validation
- privacy-preserving (100% local)
- efficient time-based querying
- support for full-text search
- support for vector embeddings
- support for metadata filtering
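
to illustrate how vector embeddings and metadata filtering could combine, here is a brute-force sketch: filter by tags first, then rank the survivors by cosine similarity. it assumes each entry carries an embedding computed by the publishing pipe; `EmbeddedEntry`, `cosineSimilarity` and `searchByVector` are hypothetical names, and a real implementation would more likely push this work into the storage layer.

```typescript
// hypothetical shape: an index entry plus an embedding from the publishing pipe
type EmbeddedEntry = { id: string; tags: string[]; embedding: number[] }

// returns 0 when the vectors are empty, differ in length, or have zero magnitude
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// filter by tags first (cheap), then rank what is left by similarity (expensive)
function searchByVector(
  entries: EmbeddedEntry[],
  queryVector: number[],
  requiredTags: string[],
  limit: number
): EmbeddedEntry[] {
  return entries
    .filter(e => requiredTags.every(t => e.tags.indexOf(t) !== -1))
    .map(e => ({ entry: e, score: cosineSimilarity(e.embedding, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map(r => r.entry)
}
```

a linear scan like this is probably fine for a few thousand local entries; beyond that, an approximate nearest-neighbour index is the usual next step.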

### implementation details

```typescript
// core SDK interface
interface IndexesSDK {
  // publishing
  publish(indexName: keyof IndexTypes, data: IndexData | IndexData[]): Promise<void>
  
  // querying
  query(indexName: keyof IndexTypes, filters: {
    timeRange?: TimeRange
    tags?: string[]
    type?: string | string[]  // e.g. ['chat', 'email']
    status?: string           // e.g. 'in_progress' for 'task.item'
    fullText?: string
    vector?: number[]
    metadata?: Record<string, any>
  }): Promise<IndexData[]>
  
  // subscriptions
  subscribe(indexName: keyof IndexTypes, callback: (data: IndexData) => void): () => void
  
  // schema validation
  validateSchema(indexName: keyof IndexTypes, data: any): boolean
}
```
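
to make the interface more concrete, here is a small in-memory sketch of publish / query / subscribe. it is deliberately simplified: synchronous rather than promise-based, loosely typed, kept in memory instead of sqlite, and it only supports `timeRange` and `tags` filters. `InMemoryIndexes`, `Entry`, `Filters` and `Listener` are hypothetical names.

```typescript
// illustrative only: synchronous, in-memory, and loosely typed for brevity
type Entry = { timestamp: number; data: { tags?: string[] } & Record<string, unknown> }
type Filters = { timeRange?: { start: number; end: number }; tags?: string[] }
type Listener = (entry: Entry) => void

class InMemoryIndexes {
  private entries = new Map<string, Entry[]>()
  private listeners = new Map<string, Set<Listener>>()

  publish(indexName: string, items: Entry[]): void {
    const existing = this.entries.get(indexName) ?? []
    this.entries.set(indexName, existing.concat(items))
    // notify each subscriber once per published entry
    const subs = this.listeners.get(indexName)
    if (subs) items.forEach(item => subs.forEach(cb => cb(item)))
  }

  query(indexName: string, filters: Filters = {}): Entry[] {
    const { timeRange, tags } = filters
    return (this.entries.get(indexName) ?? []).filter(e => {
      if (timeRange && (e.timestamp < timeRange.start || e.timestamp > timeRange.end)) return false
      // an entry matches only if it carries every requested tag
      if (tags && !tags.every(t => (e.data.tags ?? []).indexOf(t) !== -1)) return false
      return true
    })
  }

  // returns an unsubscribe function, like subscribe() in the interface above
  subscribe(indexName: string, callback: Listener): () => void {
    const subs = this.listeners.get(indexName) ?? new Set<Listener>()
    subs.add(callback)
    this.listeners.set(indexName, subs)
    return () => { subs.delete(callback) }
  }
}
```

a task pipe could call `publish('task.item', [...])` while a second pipe holds a `subscribe('task.item', ...)` callback; the unsubscribe function returned by `subscribe` lets the second pipe stop listening on shutdown.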

### next steps

1. [ ] iterate & finalize design
2. [ ] implement it

### questions

- how should we handle index versioning?
- what's the optimal storage strategy for different index types? should we store everything as files (e.g. obsidian's file-first approach), or can we use sqlite or similar, less lindy solutions?
- how should we handle data retention? what about memories from the last 90 days?
- should we add data transformation utilities?







