Task 8: Documentation #168

# Task 8: Documentation

## Overview
Create comprehensive documentation for PVC storage support, including user guides, API documentation, troubleshooting guides, and architecture diagrams.

## Scope
- User documentation for PVC storage usage
- Developer documentation for implementation details
- Troubleshooting guide for common issues
- Architecture diagrams
- Migration guide from other storage types

## Files to Create/Modify
- `docs/user-guide/storage/pvc-storage.md` - User guide
- `docs/api/storage-types.md` - API documentation update
- `docs/troubleshooting/pvc-storage.md` - Troubleshooting guide
- `docs/architecture/pvc-storage-flow.md` - Architecture documentation
- `README.md` - Update main README with PVC storage mention

## Documentation Content

### User Guide: `docs/user-guide/storage/pvc-storage.md`
```markdown
# PVC Storage for Models

OME supports using Kubernetes PersistentVolumeClaims (PVCs) as storage for machine learning models. This lets you serve models that already live in PVCs without first copying them to object storage.

## Overview

PVC storage is ideal when you:
- Have models already stored in PVCs backed by NFS, block storage, or other CSI drivers
- Want to avoid duplicating large models across multiple storage systems
- Need to use enterprise storage systems exposed through Kubernetes PVCs
- Require specific storage performance characteristics

## Prerequisites

1. A Kubernetes cluster with PVC support
2. Pre-existing PVCs containing model files
3. Models must include a `config.json` file for automatic metadata extraction

## PVC URI Format

```
pvc://{pvc-name}/{sub-path}
```

Examples:
- `pvc://model-storage/llama2-7b`
- `pvc://shared-models/foundation/nlp/bert-large`

## Creating a BaseModel with PVC Storage

### Basic Example

```yaml
apiVersion: ome.io/v1beta1
kind: BaseModel
metadata:
  name: llama2-7b-pvc
  namespace: default
spec:
  storage:
    storageUri: "pvc://llama2-model-pvc/models/llama2-7b"
  modelFormat:
    name: "safetensors"
```

The model metadata (type, architecture, parameters) will be automatically extracted from the model's `config.json` file.

### Manual Metadata Specification

If automatic extraction is not possible or desired, you can specify metadata manually:

```yaml
apiVersion: ome.io/v1beta1
kind: BaseModel
metadata:
  name: llama2-7b-pvc
  namespace: default
spec:
  storage:
    storageUri: "pvc://llama2-model-pvc/models/llama2-7b"
  modelFormat:
    name: "safetensors"
  modelType: "llama"
  modelArchitecture: "LlamaForCausalLM"
  modelParameterSize: "7B"
  modelCapabilities:
  - "text-generation"
  - "chat"
  maxTokens: 4096
```

## PVC Access Modes

The PVC's access mode determines how InferenceService pods can share the volume and where they can run:

### ReadWriteMany (RWX)
- Multiple pods can mount the PVC simultaneously
- Ideal for shared model repositories
- InferenceService pods can run on any node

### ReadWriteOnce (RWO)
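- The PVC can be mounted read-write by a single node at a time; pods on that node can share it
- Suitable for high-performance block storage
- InferenceService scheduling is left to Kubernetes; replicas on other nodes cannot attach the volume while it is in use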

### ReadOnlyMany (ROX)
- Multiple pods can mount the PVC read-only
- Good for immutable model storage
- Similar behavior to RWX for model serving
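
As a rough illustration of where the access mode is declared, the manifest below sketches a PVC that requests `ReadWriteMany`. The claim name `model-storage-pvc` matches the one used in the populate Job example in this guide; the storage class `nfs-client` and the requested size are assumptions that depend on what your cluster provides.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteMany                # swap for ReadWriteOnce or ReadOnlyMany as needed
  storageClassName: nfs-client   # assumption: an RWX-capable class exists under this name
  resources:
    requests:
      storage: 100Gi             # assumption: size this to fit your models
```

Not every storage class supports every access mode, so check the CSI driver's documentation before relying on RWX or ROX.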

## Using PVC Models in InferenceService

```yaml
apiVersion: ome.io/v1beta1
kind: InferenceService
metadata:
  name: llama2-serving
spec:
  model:
    name: llama2-7b-pvc
    runtime: vllm-runtime
  predictor:
    containers:
    - name: predictor
      args:
      - "--model"
      - "$(MODEL_PATH)"
      resources:
        limits:
          nvidia.com/gpu: "1"
```

The InferenceService will automatically mount the PVC and set the `MODEL_PATH` environment variable.
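
To make the automatic mounting more concrete, the fragment below sketches the kind of volume wiring that a `pvc://llama2-model-pvc/models/llama2-7b` URI might translate into on the predictor pod. It is an illustration under assumptions, not the controller's actual output: the volume name `model-volume` and the mount path `/mnt/models` are hypothetical.

```yaml
# Hypothetical pod spec fragment for storageUri pvc://llama2-model-pvc/models/llama2-7b
spec:
  containers:
  - name: predictor
    env:
    - name: MODEL_PATH
      value: /mnt/models            # assumed mount path
    volumeMounts:
    - name: model-volume            # hypothetical volume name
      mountPath: /mnt/models
      subPath: models/llama2-7b     # sub-path portion of the URI
      readOnly: true
  volumes:
  - name: model-volume
    persistentVolumeClaim:
      claimName: llama2-model-pvc   # PVC name portion of the URI
      readOnly: true
```

Kubernetes expands `$(MODEL_PATH)` in container `args` only when the variable is defined in the container's `env`, which is why the fragment sets it explicitly.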

## ClusterBaseModel with PVC

For cluster-wide models, the PVC must exist in the configured cluster model namespace (typically `ome-system`):

```yaml
apiVersion: ome.io/v1beta1
kind: ClusterBaseModel
metadata:
  name: shared-llama2-7b
spec:
  storage:
    storageUri: "pvc://cluster-models-pvc/foundation/llama2-7b"
  modelFormat:
    name: "safetensors"
```

## Preparing PVCs for Model Storage

### Model Directory Structure

Your PVC should contain models with this structure:
```
/models/
├── llama2-7b/
│   ├── config.json          # Required for metadata extraction
│   ├── model.safetensors    # Model weights
│   ├── tokenizer.json       # Tokenizer configuration
│   └── tokenizer_config.json
└── bert-large/
    ├── config.json
    ├── pytorch_model.bin
    └── vocab.txt
```

### Example: Populating a PVC

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: download-model-to-pvc
spec:
  template:
    spec:
      containers:
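      - name: downloader
        # Assumes this image provides git and git-lfs; substitute one that does if not
        image: huggingface/transformers-cli:latest
        command:
        - sh
        - -c
        - |
          # Download model from HuggingFace to PVC.
          # Gated repositories such as meta-llama require Hugging Face credentials.
          set -e
          cd /models
          git lfs install
          git clone https://huggingface.co/meta-llama/Llama-2-7b-hf llama2-7b
        # volumeMounts belongs to the container, not the pod spec
        volumeMounts:
        - name: model-storage
          mountPath: /models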
      volumes:
      - name: model-storage
        persistentVolumeClaim:
          claimName: model-storage-pvc
      restartPolicy: OnFailure
```

## Monitoring PVC Model Status

Check BaseModel status:
```bash
kubectl get basemodel llama2-7b-pvc -o yaml
```

Look for:
- `status.state`: Should be "Ready" when model is verified
- `status.nodesReady`: Nodes where model is available (for RWX PVCs)
- `spec.modelType`, `spec.modelArchitecture`: Auto-populated metadata

## Best Practices

1. **Use RWX PVCs for shared models** that multiple services will use
2. **Pre-validate model files**: ensure `config.json` exists before creating the BaseModel
3. **Set resource limits** on metadata extraction jobs if using large PVCs
4. **Use subpaths** to organize multiple models in a single PVC
5. **Monitor PVC usage** to ensure sufficient storage capacity

## Limitations

1. PVCs must be in the same namespace as BaseModel (or cluster namespace for ClusterBaseModel)
2. Cross-namespace PVC access is not supported
3. Model files must be readable by the OME service accounts
4. Metadata extraction requires `config.json` in standard format

## Migration from Other Storage Types

To migrate existing models to PVC storage:

1. Copy the model files to the PVC, maintaining the directory structure
2. Ensure `config.json` is present in each model directory
3. Update the BaseModel `storageUri` to use the `pvc://` format
4. If the model previously used host storage, delete the old model files from the nodes
```

### Troubleshooting Guide: `docs/troubleshooting/pvc-storage.md`

The troubleshooting guide should include:

**Common Issues**:

1. **Model Status Stuck in "MetadataPending"**
   - Symptoms and possible causes
   - Diagnostic commands
   - Solutions for each cause

2. **InferenceService Pod Fails to Start**
   - PVC mounting issues
   - Access mode conflicts
   - Node affinity problems

3. **"PVC Not Found" Errors**
   - Namespace mismatches
   - RBAC permission issues
   - URI typos

4. **Metadata Extraction Job Failures**
   - Missing config.json
   - JSON parsing errors
   - File permission issues

5. **Performance Issues**
   - Storage latency
   - I/O bottlenecks
   - Optimization strategies

**Debugging Tools**:
- kubectl commands for diagnostics
- Log analysis techniques
- Test pod creation for verification
- Job monitoring commands

**Error Reference Table**:
- Common error messages
- Root causes
- Quick solutions

**Advanced Debugging**:
- Creating debug pods
- Accessing PVC contents
- Verifying file permissions
- RBAC troubleshooting
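
One way the debug-pod items above could look in practice is sketched below: a short-lived pod that mounts the PVC read-only so its contents and file permissions can be inspected. The pod name `pvc-debug`, the `busybox` image, and the claim name `model-storage-pvc` are placeholders rather than names defined by OME.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-debug
  namespace: default            # must match the namespace of the PVC
spec:
  restartPolicy: Never
  containers:
  - name: debug
    image: busybox:1.36
    # List the PVC contents with permissions, then stay up for interactive checks
    command: ["sh", "-c", "ls -lR /models && sleep 3600"]
    volumeMounts:
    - name: model-storage
      mountPath: /models
      readOnly: true
  volumes:
  - name: model-storage
    persistentVolumeClaim:
      claimName: model-storage-pvc
      readOnly: true
```

Once the pod is running, `kubectl logs pvc-debug` shows the listing, and `kubectl exec -it pvc-debug -- sh` opens a shell for checking that `config.json` is present and readable. If the pod stays in `Pending` or `ContainerCreating`, `kubectl describe pod pvc-debug` usually shows whether the PVC is missing or already attached elsewhere.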

### Architecture Documentation: `docs/architecture/pvc-storage-flow.md`

The architecture documentation should include:

**Overview**:
- Purpose of PVC storage support
- High-level architecture description
- Benefits over existing storage methods

**Architecture Diagram**:
- Sequence diagram showing the complete flow
- Component interactions
- Data flow between components

**Component Details**:

1. **Model Agent**:
   - Role in PVC verification
   - Why it doesn't mount PVCs
   - Status updates it performs

2. **BaseModel Controller**:
   - PVC-specific reconciliation logic
   - Job creation and monitoring
   - Status management

3. **Metadata Extraction Job**:
   - Purpose and lifecycle
   - Security context
   - Resource limits and timeouts

4. **InferenceService Controller**:
   - PVC volume handling
   - Scheduling considerations
   - Subpath support
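
To ground the metadata extraction Job items above (lifecycle, security context, resource limits and timeouts), here is a minimal sketch of what such a Job could look like. Everything specific in it is an assumption made for illustration: the Job name, the image, the `--model-path` flag, the UID, and the numeric limits are hypothetical and not taken from the implementation.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: metadata-extract-llama2-7b-pvc    # hypothetical naming scheme
  namespace: default
spec:
  backoffLimit: 2                 # retry a failed extraction at most twice
  activeDeadlineSeconds: 600      # timeout: fail the Job after 10 minutes
  ttlSecondsAfterFinished: 300    # clean up finished Jobs automatically
  template:
    spec:
      restartPolicy: Never
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000           # assumption: a UID that can read the model files
      containers:
      - name: extractor
        image: registry.example.com/metadata-extractor:latest   # placeholder image
        args: ["--model-path", "/model"]                        # hypothetical flag
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
        resources:
          requests: {cpu: 100m, memory: 128Mi}
          limits: {cpu: 500m, memory: 512Mi}
        volumeMounts:
        - name: model
          mountPath: /model
          subPath: models/llama2-7b
          readOnly: true
      volumes:
      - name: model
        persistentVolumeClaim:
          claimName: llama2-model-pvc
          readOnly: true
```

Mounting the PVC read-only and bounding the Job with `activeDeadlineSeconds` keeps a slow or broken volume from blocking reconciliation indefinitely.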

**Design Decisions**:
- Why jobs for metadata extraction
- Why no PVC mounting in model agent
- Node selector implications
- Security boundaries

**Storage Type Comparison**:
- Table comparing all storage types
- Trade-offs and use cases
- Performance characteristics

**Security Model**:
- RBAC permissions required
- PVC access patterns
- Isolation boundaries
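
As a starting point for the RBAC discussion, the Role below sketches the kind of namespaced permissions a controller would plausibly need to look up PVCs and manage metadata extraction Jobs. The Role name and the exact verb lists are assumptions; the documented rule set should be derived from the actual implementation.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ome-pvc-model-reader      # hypothetical name
  namespace: default
rules:
# Read-only access to PVCs, to verify that the referenced claim exists
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch"]
# Manage the metadata extraction Jobs
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "create", "delete"]
```

For ClusterBaseModel, an equivalent Role would apply in the cluster model namespace instead of `default`.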

**Performance Analysis**:
- One-time vs recurring costs
- Caching behavior
- Scaling considerations

**Future Enhancements**:
- Cross-namespace support
- Metadata caching
- Dynamic provisioning
- Advanced scheduling

### README Update

Update the main README.md to include:

**In Supported Storage Types section**:
- Add PVC to the list of supported storage backends
- Brief description of PVC storage benefits
- Link to detailed documentation

**PVC Storage Summary**:
- Key features and benefits
- Simple example showing BaseModel with PVC URI
- Reference to the user guide
- Version information (when feature was added)

## Documentation Standards

1. **Clear Examples**: Every concept should have a working example
2. **Troubleshooting**: Common errors and solutions
3. **Diagrams**: Visual representation of complex flows
4. **Cross-references**: Link between related topics
5. **Version Notes**: Indicate when features were added

## Acceptance Criteria
- [ ] User guide covers all PVC storage scenarios
- [ ] Troubleshooting guide addresses common issues
- [ ] Architecture documentation explains design decisions
- [ ] API documentation is updated
- [ ] Examples are tested and working
- [ ] Documentation is reviewed by team

## Dependencies
- Implementation tasks completed
- E2E tests for validating examples

## Estimated Effort
4-5 hours
