Build Unified Distilled Knowledge Graph with Human/AI Source Fusion #18

## Overview
Build a knowledge fusion system that merges human and AI insights into a single deduplicated, weighted graph and records the provenance of every concept.

## Architecture

### Storage Layers
```
Raw Layer (append-only):
- events/hot.jsonl         # Human thoughts (weight: 1.0)
- derived/approved.jsonl   # AI findings (weight: varies)
- derived/rejected.jsonl   # Rejected AI findings, kept as learning data

Fusion Layer (periodic rebuild):
- distilled/concepts.jsonl       # Unified concepts with weights
- distilled/relations.jsonl      # Connections between concepts
- distilled/provenance.jsonl     # Source tracking
```
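The append-only contract of the raw layer can be sketched as a pair of small helpers. This is a minimal illustration, not the project's actual code: the function names `append_record` and `read_records` are hypothetical, and the record shape is left open.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def append_record(path: Path, record: dict[str, Any]) -> None:
    """Append one JSON object as a single line; existing lines are never rewritten."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path: Path) -> Iterator[dict[str, Any]]:
    """Yield records in write order, skipping blank lines."""
    if not path.exists():
        return
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)
```

Because the fusion layer is rebuilt periodically, it can always be regenerated by replaying these files from the start.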

### Example Unified Concept
```json
{
  "id": "evolutionary-ethics",
  "canonical_form": "Evolutionary Ethics",
  "description": "Morality emerging from evolutionary dynamics",
  "weight": 0.85,
  "sources": {
    "human": {
      "count": 5,
      "events": ["2025-06-10T17:03:28Z", "2025-06-10T17:01:36Z"],
      "weight_contribution": 0.6
    },
    "ai": {
      "count": 2,
      "findings": ["2025-06-10T18:02:26Z-pattern-analysis"],
      "weight_contribution": 0.4
    }
  },
  "confidence": 0.92,
  "last_updated": "2025-06-10T18:30:00Z"
}
```

## Key Features

### 1. Dual-Write System
- Write a summary of each approved finding to hot.jsonl for discoverability
- Write the full finding to derived/approved.jsonl with complete metadata
- Link the two records in both directions via derived_ref
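One way the dual write could look is sketched below. Only `derived_ref` and the two file paths come from this issue; the function name `dual_write`, the back-link field `hot_ref`, the `type` marker, and the input keys `id`, `approved_at`, and `summary` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any


def _append(path: Path, record: dict[str, Any]) -> None:
    """Append a record as one JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def dual_write(root: Path, finding: dict[str, Any]) -> dict[str, Any]:
    """Store the full finding, then a summary that points back at it."""
    summary = {
        "timestamp": finding["approved_at"],
        "type": "ai-finding-summary",      # hypothetical marker for AI-originated entries
        "text": finding["summary"],
        "derived_ref": finding["id"],      # hot -> derived link (named in the issue)
    }
    # derived -> hot link; the field name hot_ref is an assumption
    full = dict(finding, hot_ref=summary["timestamp"])
    # Write the full record first so a summary never references a missing finding.
    _append(root / "derived" / "approved.jsonl", full)
    _append(root / "events" / "hot.jsonl", summary)
    return summary
```

The two appends are not atomic, so a real implementation would need to decide how to recover if the process stops between them.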

### 2. Relative Contribution Tracking
- Human contributions weighted as ground truth (1.0)
- AI contributions weighted by approval confidence and recency
- Transparent source attribution showing percentage from each source
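The weighting rules above leave the exact formula open. The sketch below shows one possible scheme, assuming an exponential recency decay with a 30-day half-life; the half-life, the function names, and the normalization are all assumptions, and the scheme is not meant to reproduce the numbers in the example concept.

```python
from datetime import datetime, timedelta, timezone

HUMAN_WEIGHT = 1.0       # human contributions count as ground truth
HALF_LIFE_DAYS = 30.0    # assumed: an AI finding loses half its weight every 30 days


def ai_weight(confidence: float, approved_at: datetime, now: datetime) -> float:
    """Approval confidence scaled by an exponential recency decay."""
    age_days = max((now - approved_at).total_seconds() / 86400.0, 0.0)
    return confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)


def contribution_shares(human_count: int, ai_weights: list[float]) -> dict[str, float]:
    """Fraction of a concept's total weight that comes from each source."""
    human_total = HUMAN_WEIGHT * human_count
    ai_total = sum(ai_weights)
    total = human_total + ai_total
    if total == 0:
        return {"human": 0.0, "ai": 0.0}
    return {"human": human_total / total, "ai": ai_total / total}


if __name__ == "__main__":
    now = datetime(2025, 6, 10, 18, 30, tzinfo=timezone.utc)
    fresh = ai_weight(0.8, now, now)
    month_old = ai_weight(0.8, now - timedelta(days=30), now)
    print(fresh, month_old, contribution_shares(3, [fresh, month_old]))
```

The two shares always sum to 1 for a non-empty concept, which is what lets them be displayed as percentages.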

### 3. Deduplication Strategy
- Concept clustering using embeddings/similarity
- Canonical form selection (prefer human-originated terms)
- Alias tracking for all variations
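The three deduplication steps can be sketched with a greedy clustering pass. As a stand-in for embedding similarity, this example uses the standard library's `difflib.SequenceMatcher`; the `Mention` and `Cluster` types, the 0.8 threshold, and the tie-breaking rule are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Mention:
    text: str
    source: str  # "human" or "ai"


@dataclass
class Cluster:
    mentions: list[Mention] = field(default_factory=list)

    def canonical_form(self) -> str:
        """Prefer human-originated terms; among those, pick the most frequent."""
        human = [m.text for m in self.mentions if m.source == "human"]
        pool = human or [m.text for m in self.mentions]
        return max(pool, key=pool.count)

    def aliases(self) -> list[str]:
        """Every surface form other than the canonical one."""
        return sorted({m.text for m in self.mentions} - {self.canonical_form()})


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (placeholder for embeddings)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_mentions(mentions: list[Mention], threshold: float = 0.8) -> list[Cluster]:
    """Greedy pass: join the first cluster whose seed mention is similar enough."""
    clusters: list[Cluster] = []
    for mention in mentions:
        for cluster in clusters:
            if similarity(mention.text, cluster.mentions[0].text) >= threshold:
                cluster.mentions.append(mention)
                break
        else:
            clusters.append(Cluster([mention]))
    return clusters
```

Greedy clustering depends on input order, so a production version would likely swap in real embeddings and a more stable clustering method.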

### 4. Rejection Learning
- Track rejected findings in derived/rejected.jsonl
- Include rejection reasons and improvement suggestions
- Use for ML training and analysis improvement
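A rejected-finding record might carry the fields listed above in a shape like the following. The field names and the `reason_counts` helper are hypothetical; the issue only specifies that reasons and improvement suggestions are stored.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from typing import Iterable


@dataclass
class RejectedFinding:
    finding_id: str
    rejected_at: str                 # ISO 8601 timestamp
    reason: str                      # why the reviewer rejected the finding
    suggestions: list[str] = field(default_factory=list)  # how to improve the analysis

    def to_jsonl(self) -> str:
        """Serialize as one line for derived/rejected.jsonl."""
        return json.dumps(asdict(self))


def reason_counts(lines: Iterable[str]) -> Counter:
    """Tally rejection reasons across stored lines to spot recurring failure modes."""
    return Counter(json.loads(line)["reason"] for line in lines if line.strip())
```

A tally like this is a simple starting point for the analysis-improvement loop before any model training is involved.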

## Implementation Steps

### Phase 1: Foundation (Immediate)
- [ ] Update review-findings to implement dual-write
- [ ] Add enriched metadata tracking (source events, analysis parameters)
- [ ] Implement rejected findings storage

### Phase 2: Distillation Process
- [ ] Build concept extraction and clustering
- [ ] Implement weight calculation algorithm
- [ ] Create periodic distillation background job

### Phase 3: Query Integration
- [ ] Update query tools to search unified graph
- [ ] Add provenance display options
- [ ] Implement confidence-based filtering
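Confidence-based filtering and provenance display could be sketched against the concept shape shown in the example above. The function names and the `min_human_share` parameter are assumptions; only the field names come from the example concept.

```python
from typing import Any, Iterable

Concept = dict[str, Any]


def filter_concepts(
    concepts: Iterable[Concept],
    min_confidence: float = 0.0,
    min_human_share: float = 0.0,
) -> list[Concept]:
    """Keep concepts that meet both the confidence and the human-share thresholds."""
    return [
        c
        for c in concepts
        if c["confidence"] >= min_confidence
        and c["sources"]["human"]["weight_contribution"] >= min_human_share
    ]


def format_provenance(concept: Concept) -> str:
    """One-line provenance summary for display next to a query result."""
    human = concept["sources"]["human"]
    ai = concept["sources"]["ai"]
    return (
        f'{concept["canonical_form"]}: '
        f'{human["weight_contribution"]:.0%} human ({human["count"]} events), '
        f'{ai["weight_contribution"]:.0%} AI ({ai["count"]} findings)'
    )
```

Applied to the example concept, `format_provenance` yields `Evolutionary Ethics: 60% human (5 events), 40% AI (2 findings)`.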

### Phase 4: Evolution Tracking
- [ ] Track concept strength over time
- [ ] Visualize knowledge graph growth
- [ ] Identify emerging themes
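Emerging themes could be identified by comparing concept weights between two distillation runs. The sketch below assumes each run produces a mapping from concept id to weight; the function name `emerging_concepts` and the `min_gain` threshold are hypothetical.

```python
def emerging_concepts(
    previous: dict[str, float],
    current: dict[str, float],
    min_gain: float = 0.1,
) -> list[str]:
    """Concept ids whose weight grew by at least min_gain, largest gain first.

    Concepts absent from the previous run are treated as starting at weight 0.
    """
    gains = {cid: weight - previous.get(cid, 0.0) for cid, weight in current.items()}
    rising = [cid for cid, gain in gains.items() if gain >= min_gain]
    return sorted(rising, key=lambda cid: gains[cid], reverse=True)
```

Storing one such snapshot per run would also provide the time series needed to track concept strength over time.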

## Benefits
1. **Trust Indicators** - Clear visibility of human vs AI contributions
2. **Feedback Loop** - Rejected findings improve future analysis
3. **Clean Queries** - Single deduplicated concept instead of fragments
4. **Knowledge Evolution** - Watch insights strengthen and connect over time
5. **Transparent Provenance** - Always know where knowledge originated

## Related Issues
- Depends on completion of #17 (review-findings fixes)
- Enhances #10 (derived knowledge pipeline)
- Supports long-term knowledge curation goals
