[FEAT] ZFS Pool Monitoring Support #842

@Starosdev

Description

Summary

Add ZFS pool monitoring to Scrutiny as a dedicated feature with its own collector, API endpoints, and UI section.

Why

ZFS is widely used for storage, especially in home servers and NAS systems. While Scrutiny monitors individual drive health via SMART, ZFS pools add a layer of storage health information that SMART alone does not capture, and users need visibility into it:

  • Pool health status (ONLINE, DEGRADED, FAULTED)
  • Vdev structure and individual device status
  • Capacity and fragmentation metrics
  • Scrub status and error counts
  • Historical trends

This has been a long-requested feature (see #171).

Proposed Implementation

Architecture

| Component | Description |
| --- | --- |
| Collector | New `collector-zfs` binary with its own cron schedule |
| Backend | New API endpoints under `/api/zfs/*` |
| Frontend | Separate "ZFS Pools" section (not mixed with SMART devices) |
| Platforms | Linux + FreeBSD |

Scope

In scope (Pools only):

  • Pool health and status monitoring
  • Vdev tree structure with device status
  • Capacity metrics (size, allocated, free, fragmentation)
  • Scrub status (state, progress, errors)
  • Historical capacity/health trends
  • Archive/mute/label operations

Out of scope (for now):

  • Dataset monitoring
  • Drive-to-vdev mapping with existing SMART data
  • ZFS properties management

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/zfs/pools/register | POST | Register detected pools |
| /api/zfs/summary | GET | Dashboard summary |
| /api/zfs/pool/:guid/metrics | POST | Upload pool metrics |
| /api/zfs/pool/:guid/details | GET | Pool details with vdev tree |
| /api/zfs/pool/:guid/archive | POST | Archive pool |
| /api/zfs/pool/:guid/mute | POST | Mute notifications |
| /api/zfs/pool/:guid/label | POST | Update label |
| /api/zfs/pool/:guid | DELETE | Delete pool |

Data Models

ZFS Pool:

  • GUID (primary key), Name, HostID
  • Status (ONLINE/DEGRADED/FAULTED/OFFLINE/UNAVAIL)
  • Size, Allocated, Free, Fragmentation, CapacityPercent
  • Scrub state, progress, errors

ZFS Vdev:

  • Pool GUID (foreign key), Parent ID (for hierarchy)
  • Name, Type (disk/mirror/raidz1/raidz2/raidz3/spare/log/cache)
  • Status, Path, Read/Write/Checksum errors
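The two models above might map to Go structs along these lines (field names, types, and GORM tags are illustrative, not a final schema):

```go
package main

import "fmt"

// ZfsPool is a hypothetical model for the pool-level fields listed above.
type ZfsPool struct {
	GUID            string `gorm:"primaryKey"` // stable across pool renames
	Name            string
	HostID          string
	Status          string // ONLINE/DEGRADED/FAULTED/OFFLINE/UNAVAIL
	Size            uint64 // bytes
	Allocated       uint64
	Free            uint64
	Fragmentation   uint8 // percent
	CapacityPercent uint8
	ScrubState      string
	ScrubProgress   float64
	ScrubErrors     uint64
}

// ZfsVdev is one node in the vdev tree; ParentID builds the hierarchy
// (nil for top-level vdevs).
type ZfsVdev struct {
	ID             uint   `gorm:"primaryKey"`
	PoolGUID       string `gorm:"index"` // foreign key to ZfsPool.GUID
	ParentID       *uint
	Name           string
	Type           string // disk/mirror/raidz1/raidz2/raidz3/spare/log/cache
	Status         string
	Path           string
	ReadErrors     uint64
	WriteErrors    uint64
	ChecksumErrors uint64
}

func main() {
	pool := ZfsPool{GUID: "1234", Name: "tank", Status: "ONLINE"}
	fmt.Println(pool.Name, pool.Status)
}
```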

ZFS Commands Used

```shell
# List pools (machine-readable)
zpool list -H -p -o name,guid,size,alloc,free,frag,cap,health

# Detailed status with vdev tree
zpool status -p <poolname>
```
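The collector could parse the `zpool list -H -p` output along these lines (`-H` drops headers and tab-separates fields, `-p` prints exact numbers; the struct and field order mirror the `-o` list above, and this sketch assumes frag/cap are plain integers):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PoolListEntry holds one line of
// `zpool list -H -p -o name,guid,size,alloc,free,frag,cap,health`.
type PoolListEntry struct {
	Name   string
	GUID   string
	Size   uint64
	Alloc  uint64
	Free   uint64
	Frag   uint64 // percent, printed without "%" under -p
	Cap    uint64 // percent
	Health string
}

// parseZpoolList splits the tab-separated, headerless output into entries.
func parseZpoolList(out string) ([]PoolListEntry, error) {
	var entries []PoolListEntry
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		if line == "" {
			continue
		}
		f := strings.Split(line, "\t")
		if len(f) != 8 {
			return nil, fmt.Errorf("expected 8 fields, got %d: %q", len(f), line)
		}
		// Fields 2..6 (size, alloc, free, frag, cap) are numeric under -p.
		nums := make([]uint64, 5)
		for i, s := range f[2:7] {
			n, err := strconv.ParseUint(s, 10, 64)
			if err != nil {
				return nil, fmt.Errorf("field %q: %w", s, err)
			}
			nums[i] = n
		}
		entries = append(entries, PoolListEntry{
			Name: f[0], GUID: f[1],
			Size: nums[0], Alloc: nums[1], Free: nums[2],
			Frag: nums[3], Cap: nums[4],
			Health: f[7],
		})
	}
	return entries, nil
}

func main() {
	sample := "tank\t1234567890\t1000000000\t400000000\t600000000\t5\t40\tONLINE\n"
	pools, err := parseZpoolList(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(pools[0].Name, pools[0].Health) // tank ONLINE
}
```

Parsing `zpool status` for the vdev tree is messier (indentation encodes hierarchy) and is left out of this sketch.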

Acceptance Criteria

  • ZFS collector binary builds for Linux and FreeBSD
  • Collector detects pools and uploads metrics to API
  • Backend stores pool data in SQLite and metrics in InfluxDB
  • Frontend displays ZFS pools dashboard with status cards
  • Frontend displays pool detail view with vdev tree
  • Capacity history chart shows trends
  • Archive/mute/label operations work
  • Notifications trigger for degraded/faulted pools

Technical Notes

  • Pool GUID used as primary identifier (stable across renames)
  • Vdev hierarchy stored with parent references
  • Separate InfluxDB measurement for ZFS metrics (zfs_pool_metrics)
  • Follow existing patterns from SMART device monitoring
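As a sketch of the separate measurement, one point in InfluxDB line protocol might look like the following (measurement name from the notes above; tag and field names are assumptions, not a final schema):

```
zfs_pool_metrics,guid=1234567890,host_id=host-a size=1000000000i,allocated=400000000i,free=600000000i,fragmentation=5i,capacity_percent=40i,scrub_errors=0i 1700000000000000000
```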
