v0.5.14

@JadenFiotto-Kaufman released this 08 Jan 04:58
7ab8470

NNsight v0.5.14 Release Notes

Release Date: January 2026

This release focuses on improving the remote execution experience, vLLM compatibility, developer documentation, and overall code quality. It includes 59 commits across 37 files, with significant enhancements to the job status display system, vLLM input handling, and comprehensive new documentation.


✨ New Features

Enhanced Remote Job Status Display

The remote execution logging system has been completely redesigned with a new JobStatusDisplay class that provides:

  • Real-time visual feedback with Unicode spinners and status icons
  • ANSI color support with automatic detection for terminals and notebooks
  • In-place status updates that don't flood the console with repeated messages
  • Elapsed time tracking per status phase
  • Seamless Jupyter notebook integration with flicker-free HTML rendering using DisplayHandle

# New visual status display when running remote traces
with model.trace("Hello", remote=True):
    output = model.lm_head.output.save()

# Output now shows:
# ⠋ [job-id] QUEUED     (2.3s)
# ● [job-id] RUNNING    (0.5s) 
# ✓ [job-id] COMPLETED  (1.2s)
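The in-place update mechanism can be sketched in plain Python. This is an illustrative sketch only, not JobStatusDisplay's actual internals: a carriage return moves the cursor back to column 0 so each write overwrites the previous status line instead of appending a new one.

```python
import sys
import time

# Illustrative sketch only -- not NNsight's actual JobStatusDisplay.
# Braille spinner frames, as used by many CLI status displays.
SPINNER = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"

def format_status(job_id: str, status: str, elapsed: float, frame: int) -> str:
    """Render one status line: icon, job id, status, elapsed seconds."""
    icon = SPINNER[frame % len(SPINNER)] if status == "QUEUED" else "✓"
    return f"{icon} [{job_id}] {status:<10} ({elapsed:.1f}s)"

def render_in_place(line: str) -> None:
    # "\r" returns the cursor to column 0, so the next write
    # overwrites this line rather than flooding the console.
    sys.stdout.write("\r" + line)
    sys.stdout.flush()

start = time.monotonic()
for frame in range(3):
    render_in_place(format_status("job-id", "QUEUED", time.monotonic() - start, frame))
```

In a notebook, where carriage returns don't render the same way, the release instead uses IPython's DisplayHandle to update the output cell without flicker.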

vLLM Token Input Compatibility

vLLM now accepts a broader range of input formats, matching the flexibility of LanguageModel:

  • Token ID lists: model.trace([1, 2, 3, 4])
  • HuggingFace tokenizer outputs: model.trace(tokenizer("Hello", return_tensors="pt"))
  • Dictionary with input_ids: model.trace({"input_ids": tensor, "attention_mask": mask})

from nnsight.modeling.vllm import VLLM

model = VLLM("gpt2", dispatch=True)

# Now works with pre-tokenized inputs
tokens = tokenizer("Hello world", return_tensors="pt")
with model.trace(tokens, temperature=0.0):
    logits = model.logits.output.save()

vLLM Auto-Dispatch

vLLM models now automatically dispatch when entering a trace context without dispatch=True, matching the behavior of LanguageModel:

model = VLLM("gpt2")  # No dispatch=True needed

# Automatically dispatches on first trace
with model.trace("Hello"):
    output = model.logits.output.save()

Envoy.devices Property

New property to retrieve all devices a model is distributed across:

model = LanguageModel("meta-llama/Llama-3.1-70B", device_map="auto")
print(model.devices)  # {device(type='cuda', index=0), device(type='cuda', index=1), ...}

Auto-Detect API Key

The API key is now automatically detected from multiple sources in order:

  1. NDIF_API_KEY environment variable
  2. Google Colab userdata (userdata.get("NDIF_API_KEY"))
  3. Saved configuration

# No need to manually set if NDIF_API_KEY is in your environment
import os
os.environ["NDIF_API_KEY"] = "your-key"

# Works automatically
with model.trace("Hello", remote=True):
    output = model.output.save()

vLLM Optional Dependency

vLLM is now available as an optional dependency with a pinned Triton version for stability:

pip install nnsight[vllm]
# Installs: vllm>=0.12, triton==3.5.0

🐛 Bug Fixes

vLLM Auto-Dispatch Fix

Fixed an issue where vLLM would fail when tracing without explicitly setting dispatch=True. The model now auto-dispatches when needed.

NDIF Status Null Revision Handling

Fixed a bug where ndif_status() and is_model_running() would fail when a model's revision was null in the API response. Now properly defaults to "main".
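The fix reduces to a null-coalescing default. A minimal sketch, with a hypothetical helper name (the actual code lives inside ndif_status() and is_model_running()):

```python
# Hypothetical helper illustrating the fix: a null (None) or missing
# "revision" field in the API response falls back to "main".
def resolve_revision(model_info: dict) -> str:
    # `or "main"` covers both an absent key and an explicit null/None.
    return model_info.get("revision") or "main"
```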

Type Annotations for _prepare_input

Corrected type annotations in LanguageModel._prepare_input() to properly reflect the accepted input types.

Attention Mask Handling

Fixed a bug where attention masks were incorrectly overwritten during batching. The attention mask is now only applied when explicitly provided.
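The corrected behavior can be sketched as follows (the function name is an assumption, not NNsight's actual internals): the mask key enters the batch only when the caller supplied one.

```python
# Sketch of the corrected batching behavior (not NNsight's actual code):
# an attention mask is included only when the caller explicitly passed
# one, so batching no longer overwrites a user-provided mask.
def prepare_batch(input_ids, attention_mask=None):
    batch = {"input_ids": input_ids}
    if attention_mask is not None:
        batch["attention_mask"] = attention_mask
    return batch
```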

HTTPS Configuration

Simplified API configuration by using full URLs (https://api.ndif.us) instead of separate HOST and SSL fields, reducing potential misconfiguration.

Performance: Removed Automatic Attention Mask Creation

Removed automatic attention mask creation for language models, improving performance by avoiding unnecessary tensor operations when the mask isn't needed.


📚 Documentation

New Comprehensive Documentation Files

Two major documentation files have been added:

  • CLAUDE.md (~1,800 lines): AI agent-focused guide covering all NNsight features with practical examples, common patterns, and gotchas
  • NNsight.md (~3,800 lines): Deep technical documentation covering NNsight's internal architecture including tracing, interleaving, Envoy system, vLLM integration, and remote execution

README Improvements

  • Redesigned header inspired by vLLM style
  • Added out-of-order access warning with troubleshooting table
  • Added tracer.iter[:] footgun warning
  • Fixed documentation examples

Walkthrough Updates

The NNsight_Walkthrough.ipynb notebook has been streamlined and restructured for clarity, with a sharper practical focus.


🔧 Internal Changes

Logging System Refactor

  • Removed src/nnsight/log.py (old logging module)
  • Consolidated job status display into RemoteBackend with the new JobStatusDisplay class
  • Better separation of concerns between logging and remote execution

Configuration Refactor

  • ConfigModel.load() now handles environment variable overrides internally
  • Removed deprecated SSL configuration field
  • Host configuration now uses full URL format
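A minimal sketch of the load-with-overrides pattern, assuming illustrative class and field names (only NDIF_API_KEY and the host URL come from the release notes):

```python
import os
from dataclasses import dataclass

# Sketch of env-override-on-load; the class and field names here are
# illustrative, not NNsight's actual ConfigModel.
@dataclass
class APIConfig:
    host: str = "https://api.ndif.us"  # full URL, protocol included
    api_key: str = ""

def load_config() -> APIConfig:
    cfg = APIConfig()  # stands in for the saved configuration
    # Environment variables override saved values inside load() itself,
    # so callers no longer apply overrides externally.
    cfg.api_key = os.environ.get("NDIF_API_KEY", cfg.api_key)
    return cfg
```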

NDIF Status Improvements

  • Added proper enums for Status, ModelStatus, and DeploymentType
  • Better error handling with graceful fallbacks
  • Improved docstrings with usage examples
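As a rough sketch of the enum approach (the member values below are hypothetical placeholders; only the enum names come from the release notes):

```python
from enum import Enum

# Enum names are from the release notes; member values are
# hypothetical placeholders for illustration.
class Status(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"

class DeploymentType(str, Enum):
    DEDICATED = "DEDICATED"
    SHARED = "SHARED"
```

The str mixin lets members compare equal to the raw strings deserialized from API responses, which avoids scattering string literals through the status-handling code.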

vLLM Updates

  • Updated to use cached_tokenizer_from_config (replacing deprecated init_tokenizer_from_configs)
  • Uses TokensPrompt for token-based inputs
  • Proper pad token handling

Test Suite Improvements

  • New test files:

    • test_debug.py: Comprehensive debugging and exception tests
    • test_remote.py: Remote execution tests for NDIF
    • test_vllm_dispatch_bug.py: Regression test for vLLM auto-dispatch
    • conftest.py: Shared pytest fixtures
    • debug_demo.py, explore_remote.py, explore_remote_advanced.py: Development utilities
  • Test reorganization: Tests now follow pytest best practices, with shared fixtures factored into conftest.py

  • CI update: Limited pytest to test_lm.py and test_tiny.py for faster CI runs


⚠️ Breaking Changes

Configuration

  • CONFIG.API.SSL has been removed. CONFIG.API.HOST now includes the full URL with protocol (e.g., https://api.ndif.us)

Remote Logging

  • CONFIG.APP.REMOTE_LOGGING no longer triggers a callback when changed. The new JobStatusDisplay class handles all logging internally based on this setting.

📦 Dependencies

  • New optional dependency: vllm>=0.12 (installed via nnsight[vllm])
  • New optional dependency: triton==3.5.0 (pinned for vLLM stability)

🙏 Contributors

  • @Butanium - vLLM auto-dispatch fix, input compatibility, type annotations
  • NDIF Team - Remote logging refactor, documentation, NDIF status improvements

Upgrade Guide

pip install --upgrade nnsight

# For vLLM support
pip install --upgrade nnsight[vllm]

If you were using CONFIG.API.SSL:

# Before (v0.5.13)
CONFIG.API.HOST = "api.ndif.us"
CONFIG.API.SSL = True

# After (v0.5.14)
CONFIG.API.HOST = "https://api.ndif.us"

Full Changelog: v0.5.13...v0.5.14