📝 Description
As `supervision` matures into an industry-standard utility package for Computer Vision, the reliability and stability of the codebase become paramount. Currently, our total test coverage sits at approximately 50%, with critical modules like `metrics` and `key_points` falling well below 40%.
To ensure this package remains the go-to solution for production pipelines and research alike, I propose an initiative to systematically increase our unit test coverage to a realistic target of ~90%.
🚀 Motivation
Why is this high priority?
- Industry Standard Responsibility: `supervision` is infrastructure. Users rely on it for critical CV workflows. A subtle bug in a metric calculation (like mAP or confusion matrix) can silently invalidate research experiments.
- Refactoring Confidence: As we optimize performance, a robust test suite acts as a safety net.
- Handling Edge Cases: Computer Vision data is notoriously messy. High coverage ensures we handle empty detections, `None` values, and malformed shapes gracefully.
📊 Current State
According to the latest Codecov report, we have significant gaps in core modules.
| Module | Coverage | Status | Notes |
|---|---|---|---|
| `metrics` | 38.36% | 🔴 Critical | Over 1,100 lines untested. |
| `key_points` | 36.68% | 🔴 Critical | Lowest percentage in the library. |
| `utils` | 55.06% | 🟡 Low | |
| `dataset` | 57.32% | 🟡 Low | Formats logic needs hardening. |
| `detection` | 68.49% | 🟡 Moderate | High volume of missed lines (639) due to file size. |
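For contributors who want to reproduce these numbers locally, an invocation along these lines should work (assuming `pytest-cov` is installed; the exact flags may differ from the project's CI configuration):

```shell
# Run the test suite and print per-file coverage with untested line ranges.
pytest --cov=supervision --cov-report=term-missing
```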
🛠 Proposed Strategy
We should tackle this modularly. The immediate priority is to lift the "Red" modules out of the danger zone.
Priority 1: The Criticals
- `metrics/`: This is the highest-impact area. We need to verify that mathematical calculations for accuracy, recall, and precision are correct across various edge cases.
- `key_points/`: Needs basic functional tests for skeleton handling and visualization.
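As a concrete illustration of the kind of edge-case coverage `metrics/` needs, here is a sketch against a tiny reference implementation. Note that `precision_recall` is a hypothetical helper written for this example, not `supervision`'s actual API, and the zero-denominator convention (return `0.0`) is an assumption that a real test would pin to the library's documented behavior:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Illustrative reference implementation of precision and recall.

    Convention (an assumption, not supervision's documented behavior):
    a zero denominator yields 0.0 rather than NaN or an exception.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Edge cases a metrics suite must pin down explicitly:
assert precision_recall(0, 0, 0) == (0.0, 0.0)   # empty detections, empty targets
assert precision_recall(5, 0, 0) == (1.0, 1.0)   # perfect predictions
assert precision_recall(3, 1, 2) == (0.75, 0.6)  # mixed outcome
```

The value of tests like these is that they freeze the empty-input convention in place, so a later refactor cannot silently change it.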
Priority 2: Stability Layers
- `dataset/`: Ensure saving/loading functions (YOLO, COCO, Pascal VOC) are robust against file system errors and malformed inputs.
- `utils/`: General utility functions that are used everywhere need to be bulletproof.
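To make the file-system-error point concrete, here is a minimal sketch of testing a writer against simulated disk failure. The `save_labels` function below is a hypothetical stand-in for a dataset export helper (e.g. a YOLO annotation writer), not `supervision`'s real API; the pattern is what matters:

```python
from unittest import mock

def save_labels(path: str, lines: list[str]) -> None:
    # Hypothetical stand-in for a dataset export helper.
    with open(path, "w") as f:
        f.write("\n".join(lines))

# Patch builtins.open so no real file is touched and a disk error can be simulated.
with mock.patch("builtins.open", side_effect=OSError("disk full")):
    try:
        save_labels("labels/0001.txt", ["0 0.5 0.5 0.1 0.1"])
        raised = False
    except OSError:
        raised = True

assert raised  # the I/O failure surfaces instead of being silently swallowed
```

Mocking here keeps the suite fast (no temp files) and lets us exercise failure paths that are hard to trigger with a real file system.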
✅ Guidelines for New Tests
For contributors looking to help:
- Test Behavior, Not Implementation: Focus on inputs and expected outputs.
- Edge Cases: Explicitly test for empty lists, `None` values, and mismatched array shapes.
- Mocking: Use mocking for external I/O (downloading files, displaying windows) to keep the suite fast.
- Parametrization: Use `@pytest.mark.parametrize` to cover multiple scenarios efficiently.
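Putting the parametrization and edge-case guidelines together, a test might look like the sketch below. The `iou` helper is a minimal illustrative implementation written for this example, not `supervision`'s own:

```python
import pytest

def iou(box_a, box_b):
    """Minimal IoU for axis-aligned [x1, y1, x2, y2] boxes (illustrative only)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Degenerate (zero-area) inputs yield 0.0 rather than dividing by zero.
    return inter / union if union > 0 else 0.0

@pytest.mark.parametrize(
    "box_a, box_b, expected",
    [
        ([0, 0, 2, 2], [0, 0, 2, 2], 1.0),    # identical boxes
        ([0, 0, 1, 1], [2, 2, 3, 3], 0.0),    # disjoint boxes
        ([0, 0, 2, 2], [1, 1, 3, 3], 1 / 7),  # partial overlap
        ([0, 0, 0, 0], [0, 0, 0, 0], 0.0),    # degenerate zero-area boxes
    ],
)
def test_iou(box_a, box_b, expected):
    assert iou(box_a, box_b) == pytest.approx(expected)
```

One decorator covers the happy path, the disjoint case, and the degenerate case in a single readable block, which is exactly the density we want from new tests.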
📋 Tracking
Please comment below if you would like to pick up a specific module so we don't duplicate work.
- `supervision/metrics` (High Priority)
- `supervision/key_points` (High Priority)
- `supervision/dataset`
- `supervision/utils`
- `supervision/detection`
The actual coverage report can be found at https://app.codecov.io/gh/roboflow/supervision/blob/develop