diff --git a/neural-networks/generic-example/AGENTS.md b/neural-networks/generic-example/AGENTS.md
index af1af7792..dfac644eb 100644
--- a/neural-networks/generic-example/AGENTS.md
+++ b/neural-networks/generic-example/AGENTS.md
@@ -25,7 +25,8 @@ Reusable single-model inference scaffold. It runs one Model Zoo model with one i
 - `Default RVC2 model:` [depthai_models/yolov6_nano_r2_coco.RVC2.yaml](depthai_models/yolov6_nano_r2_coco.RVC2.yaml)
 - `Default RVC4 model:` [depthai_models/yolov6_nano_r2_coco.RVC4.yaml](depthai_models/yolov6_nano_r2_coco.RVC4.yaml)
 - `Standalone config:` [oakapp.toml](oakapp.toml)
-- `Default model slug:` `luxonis/yolov6-nano:r2-coco-512x288`
+- `Default model identifier:` `luxonis/yolov6-nano:r2-coco-512x288`
+- `Accepted model inputs:` Model Zoo/HubAI model identifier, local model descriptor/YAML, or local `.tar.xz` `NNArchive`
 - `Input:` camera by default, or media file via `--media_path`
 - `Output:` `Video` passthrough plus parsed output on `Detections`

@@ -33,13 +34,14 @@ Reusable single-model inference scaffold. It runs one Model Zoo model with one i

 - [main.py](main.py): model selection, pipeline construction, parser node, and Visualizer topics
 - [utils/input.py](utils/input.py): camera versus `ReplayVideo` input selection and platform-specific frame type
-- [utils/arguments.py](utils/arguments.py): model slug, FPS limit, media path, API key, and overlay options
+- [utils/arguments.py](utils/arguments.py): model source, FPS limit, media path, API key, and overlay options
 - [oakapp.toml](oakapp.toml): standalone entrypoint and packaged defaults

 ## Architecture

 - [main.py](main.py) connects to a device, reads the platform string, and chooses `yolov6_nano_r2_coco..yaml` as the default descriptor.
-- If `--model` differs from the YAML-backed default, [main.py](main.py) creates `dai.NNModelDescription(args.model, platform=platform)` instead.
+- If `--model` ends with `.tar.xz`, [main.py](main.py) loads it directly as an `NNArchive`.
+- Otherwise, if `--model` differs from the YAML-backed default model identifier, [main.py](main.py) creates `dai.NNModelDescription(args.model, platform=platform)` and resolves it through the Model Zoo.
 - [utils/input.py](utils/input.py) returns either a camera node or a `ReplayVideo` node.
 - `ParsingNeuralNetwork` runs the model and emits parsed output.
 - `Video` shows `ParsingNeuralNetwork.passthrough` unless `--overlay_mode` is enabled.
@@ -55,13 +57,13 @@ Reusable single-model inference scaffold. It runs one Model Zoo model with one i

 ## Modification Guide

-- `Safe to change:` model slug, FPS limit, media path, private API key handling, Visualizer topic names
+- `Safe to change:` model source, FPS limit, media path, private API key handling, Visualizer topic names
 - `Requires care:` model input/output compatibility, overlay assumptions, platform-specific media frame types, standalone defaults in [oakapp.toml](oakapp.toml)
 - `Likely to break if changed blindly:` using multi-input or multi-head models, assuming every parsed output is a detection, enabling overlay for non-image-like outputs

 ## Common Adaptations

-- `Swap the model:` pass `--model` first; edit YAMLs only if changing the default packaged baseline.
+- `Swap the model:` pass `--model` first; this can be a Model Zoo/HubAI model identifier, a local YAML/descriptor, or a local `.tar.xz` archive. Edit YAMLs only if changing the default packaged baseline.
 - `Run on media:` pass `--media_path`; [utils/input.py](utils/input.py) switches from `Camera` to `ReplayVideo`.
 - `Use a private model:` set `--api_key` or `DEPTHAI_HUB_API_KEY`.
 - `Create a task-specific example:` keep [main.py](main.py), [utils/input.py](utils/input.py), and [utils/arguments.py](utils/arguments.py), then replace parser/output handling as needed.
@@ -85,5 +87,6 @@ Reusable single-model inference scaffold. It runs one Model Zoo model with one i

 - `Run:` `python3 main.py`
 - `Alternative run:` `python3 main.py --model luxonis/mediapipe-selfie-segmentation:256x144 --overlay_mode`
+- `Archive run:` `python3 main.py --model /path/to/custom-model.tar.xz`
 - `Success looks like:` Visualizer exposes `Video` and `Detections`, and the pipeline runs until `q` is pressed
-- `Common failure meaning:` model slug unavailable for platform, private model auth missing, or selected model violates the single-input/single-output assumptions
+- `Common failure meaning:` model identifier unavailable for platform, private model auth missing, or selected model violates the single-input/single-output assumptions
diff --git a/neural-networks/generic-example/README.md b/neural-networks/generic-example/README.md
index 2e3e761a8..1d64a3fd4 100644
--- a/neural-networks/generic-example/README.md
+++ b/neural-networks/generic-example/README.md
@@ -1,7 +1,7 @@
 # Generic Example

 We provide here an example for running inference with a **single model** on a **single-image input** with a **single-head output**.
-The example is generic and can be used for various single-image input models from the [Model ZOO](https://models.luxonis.com).
+The example is generic and can be used for various single-image input models from the [Model ZOO](https://models.luxonis.com) or HubAI, a model descriptor/YAML, or a local `.tar.xz` `NNArchive`.

 ## Usage

@@ -13,7 +13,7 @@ Here is a list of all available parameters:

 ```
 -m MODEL, --model MODEL
-                    HubAI model reference. (default: luxonis/yolov6-nano:r2-coco-512x288)
+                    Model Zoo/HubAI model identifier, model YAML/descriptor, or local .tar.xz NN archive. (default: luxonis/yolov6-nano:r2-coco-512x288)
 -d DEVICE, --device DEVICE
                     Optional name, DeviceID or IP of the camera to connect to. (default: None)
 -fps FPS_LIMIT, --fps_limit FPS_LIMIT
@@ -21,7 +21,7 @@ Here is a list of all available parameters:
 -media MEDIA_PATH, --media_path MEDIA_PATH
                     Path to the media file you aim to run the model on. If not set, the model will run on the camera input. (default: None)
 -api API_KEY, --api_key API_KEY
-                    HubAI API key to access private model. Can also use 'DEPTHAI_HUB_API_KEY' environment variable instead. (default: )
+                    HubAI API key for private HubAI access. Can also use 'DEPTHAI_HUB_API_KEY' environment variable instead. (default: )
 -overlay OVERLAY_MODE, --overlay_mode
                     If passed, overlays model output on the input image when the output is an array (e.g., depth maps, segmentation maps). Otherwise, displays outputs separately.
 ```
@@ -67,6 +67,13 @@ python3 main.py \

 And this will run an instance segmentation model.

+```bash
+python3 main.py \
+    --model /path/to/custom-model.tar.xz
+```
+
+This will run a local `NNArchive` directly instead of fetching a model from the Model Zoo.
+
 ## Standalone Mode (RVC4 only)

 Running the example in the standalone mode, app runs entirely on the device.
diff --git a/neural-networks/generic-example/main.py b/neural-networks/generic-example/main.py
index 36ea4e3df..733bc0ecb 100644
--- a/neural-networks/generic-example/main.py
+++ b/neural-networks/generic-example/main.py
@@ -23,10 +23,17 @@
     print("Creating pipeline...")

     # model
-    model_description = dai.NNModelDescription(f"yolov6_nano_r2_coco.{platform}.yaml")
-    if model_description.model != args.model:
-        model_description = dai.NNModelDescription(args.model, platform=platform)
-    nn_archive = dai.NNArchive(dai.getModelFromZoo(model_description))
+    default_description = dai.NNModelDescription(f"yolov6_nano_r2_coco.{platform}.yaml")
+
+    if args.model.endswith(".tar.xz"):
+        nn_archive = dai.NNArchive(args.model)
+    else:
+        model_description = (
+            default_description
+            if default_description.model == args.model
+            else dai.NNModelDescription(args.model, platform=platform)
+        )
+        nn_archive = dai.NNArchive(dai.getModelFromZoo(model_description))

     # media/camera input
     input_node = create_input_node(
diff --git a/neural-networks/generic-example/utils/arguments.py b/neural-networks/generic-example/utils/arguments.py
index 5a5764e89..cd412a9a9 100644
--- a/neural-networks/generic-example/utils/arguments.py
+++ b/neural-networks/generic-example/utils/arguments.py
@@ -7,16 +7,16 @@ def initialize_argparser():
         formatter_class=argparse.ArgumentDefaultsHelpFormatter
     )
     parser.description = (
-        "General example script to run any model available in HubAI on DepthAI device. \
-        All you need is a model slug of the model and the script will download the model from HubAI and create \
-        the whole pipeline with visualizations. You also need a DepthAI device connected to your computer. \
+        "General example script to run a single-model DepthAI pipeline from a Model Zoo/HubAI model identifier, \
+        a model descriptor/YAML, or a local .tar.xz NN archive. The script creates the pipeline and visualizations \
+        for a connected OAK device. \
         If using OAK-D Lite, please set the FPS limit to 28."
     )

     parser.add_argument(
         "-m",
         "--model",
-        help="HubAI model reference.",
+        help="Model Zoo/HubAI model identifier, model YAML/descriptor, or local .tar.xz NN archive.",
         default="luxonis/yolov6-nano:r2-coco-512x288",
         type=str,
     )
@@ -51,7 +51,7 @@ def initialize_argparser():
     parser.add_argument(
         "-api",
         "--api_key",
-        help="HubAI API key to access private model. Can also use 'DEPTHAI_HUB_API_KEY' environment variable instead.",
+        help="HubAI API key for private Model Zoo access. Can also use 'DEPTHAI_HUB_API_KEY' environment variable instead.",
         required=False,
         default="",
         type=str,
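
The branch order this patch adds to `main.py` can be sketched without a device attached. The snippet below is a minimal illustration only: `classify_model_source` and its return labels are hypothetical names that do not appear in the patch, and the default identifier is passed in as a plain string rather than read from the packaged YAML descriptor. It follows the same order of checks as the diff: the `.tar.xz` suffix first, then equality with the default identifier, then everything else.

```python
def classify_model_source(model: str, default_identifier: str) -> str:
    """Hypothetical helper (not part of the patch) mirroring main.py's branch order."""
    # 1. A local NNArchive is recognised purely by its file suffix.
    if model.endswith(".tar.xz"):
        return "local_archive"
    # 2. The default identifier keeps using the packaged YAML descriptor.
    if model == default_identifier:
        return "packaged_default"
    # 3. Any other string is handed to NNModelDescription(model, platform=...) in the patch.
    return "model_zoo"


if __name__ == "__main__":
    default = "luxonis/yolov6-nano:r2-coco-512x288"
    for candidate in (
        "/path/to/custom-model.tar.xz",
        default,
        "luxonis/mediapipe-selfie-segmentation:256x144",
    ):
        print(candidate, "->", classify_model_source(candidate, default))
```

Under this reading the suffix check is case-sensitive, so a value such as `model.TAR.XZ`, or a path ending in `.yaml`, would fall through to the third branch.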