Merge branch 'main' of github.com:ndif-team/nnsight into main

JadenFiotto-Kaufman · JadenFiotto-Kaufman · commit 4f2bdbe8aacb · 2026-01-14T13:56:01.000-05:00
diff --git a/README.md b/README.md
@@ -30,12 +30,70 @@ Originally developed in the [NDIF team](https://ndif.us/) at Northeastern Univer
 
 > 📖 For a deeper technical understanding of nnsight's internals (tracing, interleaving, the Envoy system, etc.), see **[NNsight.md](./NNsight.md)**.
 
+---
+
 ## Installation
 
 ```bash
 pip install nnsight
 ```
 
+---
+
+## Agents
+
+Inform LLM agents how to use nnsight using one of these methods:
+
+### Skills Repository
+
+**Claude Code**
+
+```bash
+# Open Claude Code terminal
+claude
+
+# Add the marketplace (one time)
+/plugin marketplace add https://github.com/ndif-team/skills.git
+
+# Install all skills
+/plugin install nnsight@skills
+```
+
+**OpenAI Codex**
+
+```bash
+# Open OpenAI Codex terminal
+codex
+
+# Install skills
+skill-installer install https://github.com/ndif-team/skills.git
+```
+
+### Context7 MCP
+
+Alternatively, use [Context7](https://github.com/upstash/context7) to provide up-to-date nnsight documentation directly to your LLM. Add `use context7` to your prompts or configure it in your MCP client:
+
+```json
+{
+  "mcpServers": {
+    "context7": {
+      "url": "https://mcp.context7.com/mcp"
+    }
+  }
+}
+```
+
+See the [Context7 README](https://github.com/upstash/context7/blob/master/README.md) for full installation instructions across different IDEs.
+
+### Documentation Files
+
+You can also add our documentation files directly to your agent's context:
+
+- **[llms.md](./llms.md)** — Comprehensive guide for AI agents working with nnsight
+- **[NNsight.md](./NNsight.md)** — Deep technical documentation on nnsight's internals
+
+---
+
 ## Quick Start
 
 ```python
@@ -58,8 +116,6 @@ print(model.tokenizer.decode(output.logits.argmax(dim=-1)[0]))
 
 > **💡 Tip:** Always call `.save()` on values you want to access after the trace exits. Without `.save()`, values are garbage collected. You can also use `nnsight.save(value)` as an alternative.
 
----
-
 ## Accessing Activations
 
 ```python
@@ -79,8 +135,6 @@ with model.trace("The Eiffel Tower is in the city of"):
 
 **Note:** GPT-2 transformer layers return tuples where index 0 contains the hidden states.
 
----
-
 ## Modifying Activations
 
 ### In-Place Modification
@@ -106,8 +160,6 @@ with model.trace("Hello"):
     result = model.transformer.h[-1].mlp.output.save()
 ```
 
----
-
 ## Batching with Invokers
 
 Process multiple inputs in one forward pass. Each invoke runs its code in a **separate worker thread**:
@@ -148,7 +200,6 @@ with model.trace() as tracer:
         out_all = model.lm_head.output[:, -1].save()  # Shape: [3, vocab]
 ```
 
----
 
 ## Multi-Token Generation
 
@@ -198,7 +249,6 @@ with model.generate("Hello", max_new_tokens=5) as tracer:
 >         final = model.output.save()  # Now works!
 > ```
 
----
 
 ## Gradients
 
@@ -218,7 +268,6 @@ with model.trace("Hello"):
 print(grad.shape)
 ```
 
----
 
 ## Model Editing
 
@@ -241,7 +290,6 @@ assert not torch.all(out1 == 0)
 assert torch.all(out2 == 0)
 ```
 
----
 
 ## Scanning (Shape Inference)
 
@@ -254,7 +302,6 @@ with model.scan("Hello"):
 print(dim)  # 768
 ```
 
----
 
 ## Caching Activations
 
@@ -269,7 +316,6 @@ layer0_out = cache['model.transformer.h.0'].output
 print(cache.model.transformer.h[0].output[0].shape)
 ```
 
----
 
 ## Sessions
 
@@ -285,7 +331,6 @@ with model.session() as session:
         hs2 = model.transformer.h[0].output[0].save()
 ```
 
----
 
 ## Remote Execution (NDIF)
 
@@ -303,7 +348,6 @@ with model.trace("Hello", remote=True):
 
 Check available models at [nnsight.net/status](https://nnsight.net/status/)
 
----
 
 ## vLLM Integration
 
@@ -321,7 +365,6 @@ with model.trace("Hello", temperature=0.0, max_tokens=5) as tracer:
         logits.append(model.logits.output)
 ```
 
----
 
 ## NNsight for Any PyTorch Model
 
@@ -343,8 +386,6 @@ with model.trace(torch.rand(1, 5)):
     output = model.output.save()
 ```
 
----
-
 ## Source Tracing
 
 Access intermediate operations inside a module's forward pass. `.source` rewrites the forward method to hook into all operations:
@@ -361,8 +402,6 @@ with model.trace("Hello"):
     attn_out = model.transformer.h[0].attn.source.attention_interface_0.output.save()
 ```
 
----
-
 ## Ad-hoc Module Application
 
 Apply modules out of their normal execution order:
@@ -459,7 +498,7 @@ For more debugging tips, see the [documentation](https://www.nnsight.net).
 
 - **[Documentation](https://www.nnsight.net)** — Tutorials, guides, and API reference
 - **[NNsight.md](./NNsight.md)** — Deep technical documentation on nnsight
-- **[CLAUDE.md](./CLAUDE.md)** — Comprehensive guide for AI agents working with nnsight
+- **[llms.md](./llms.md)** — Comprehensive guide for AI agents working with nnsight
 
 ---
 
diff --git a/llms.md b/llms.md
@@ -1,4 +1,4 @@
-# CLAUDE.md - NNsight AI Agent Guide
+# llms.md - NNsight AI Agent Guide
 
 This document provides comprehensive guidance for AI agents working with the `nnsight` library. NNsight enables interpreting and manipulating the internals states of deep learning models through a deferred execution tracing system.
 
diff --git a/src/nnsight/intervention/backends/remote.py b/src/nnsight/intervention/backends/remote.py
@@ -71,6 +71,7 @@ class Icons:
     def __init__(self, enabled: bool = True, verbose: bool = False):
         self.enabled = enabled
         self.verbose = verbose
+        self.job_start_time: Optional[float] = None
         self.status_start_time: Optional[float] = None
         self.spinner_idx = 0
         self.last_response: Optional[Tuple[str, str, str]] = (
@@ -79,11 +80,11 @@ def __init__(self, enabled: bool = True, verbose: bool = False):
         self._line_written = False
         self._display_handle = None
 
-    def _format_elapsed(self) -> str:
-        """Format elapsed time in current status."""
-        if self.status_start_time is None:
+    def _format_time(self, start_time: Optional[float]) -> str:
+        """Format elapsed time from a given start time."""
+        if start_time is None:
             return "0.0s"
-        elapsed = time.time() - self.status_start_time
+        elapsed = time.time() - start_time
         if elapsed < 60:
             return f"{elapsed:.1f}s"
         elif elapsed < 3600:
@@ -95,6 +96,14 @@ def _format_elapsed(self) -> str:
             mins = int((elapsed % 3600) // 60)
             return f"{hours}h {mins}m"
 
+    def _format_elapsed(self) -> str:
+        """Format elapsed time in current status."""
+        return self._format_time(self.status_start_time)
+
+    def _format_total(self) -> str:
+        """Format total elapsed time since job started."""
+        return self._format_time(self.job_start_time)
+
     def _get_status_style(self, status_name: str) -> tuple:
         """Get icon and color for a status."""
         status_map = {
@@ -128,55 +137,77 @@ def update(self, job_id: str = "", status_name: str = "", description: str = "")
         if not job_id:
             return
 
+        is_log = status_name == "LOG"
+
         last_status = self.last_response[1] if self.last_response else None
-        status_changed = status_name != last_status
+        # LOG status should not be considered a status change for timer purposes
+        status_changed = status_name != last_status and not is_log
+
+        # Track job start time (first status received)
+        if last_status is None:
+            self.job_start_time = time.time()
 
-        # Reset timer when status changes
+        # Reset status timer when status changes (but not for LOG)
         if status_changed:
             self.status_start_time = time.time()
 
-        # Store the response
-        self.last_response = (job_id, status_name, description)
+        # Store the response (but not for LOG - so we go back to previous status on refresh)
+        if not is_log:
+            self.last_response = (job_id, status_name, description)
 
         icon, color = self._get_status_style(status_name)
-        elapsed = self._format_elapsed()
 
         # Build the status line
         # Format: ● STATUS (elapsed) [job_id] description
 
         is_terminal = status_name in ("COMPLETED", "ERROR", "NNSIGHT_ERROR")
         is_active = status_name in ("QUEUED", "RUNNING", "DISPATCHED")
 
+        # For terminal states, show total time; for others, show status elapsed time
+        elapsed = self._format_total() if is_terminal else self._format_elapsed()
+
         # For active states, show spinner
         if is_active:
             prefix = f"{self.Colors.DIM}{self._get_spinner()}{self.Colors.RESET}"
         else:
             prefix = f"{color}{icon}{self.Colors.RESET}"
 
         # Build status text - full job ID shown so users can reference it
-        status_text = (
-            f"{prefix} "
-            f"{self.Colors.DIM}[{job_id}]{self.Colors.RESET} "
-            f"{color}{self.Colors.BOLD}{status_name.ljust(10)}{self.Colors.RESET} "
-            f"{self.Colors.DIM}({elapsed}){self.Colors.RESET}"
-        )
+        # LOG status does not show elapsed time
+        if is_log:
+            status_text = (
+                f"{prefix} "
+                f"{self.Colors.DIM}[{job_id}]{self.Colors.RESET} "
+                f"{color}{self.Colors.BOLD}{status_name.ljust(10)}{self.Colors.RESET}"
+            )
+        else:
+            status_text = (
+                f"{prefix} "
+                f"{self.Colors.DIM}[{job_id}]{self.Colors.RESET} "
+                f"{color}{self.Colors.BOLD}{status_name.ljust(10)}{self.Colors.RESET} "
+                f"{self.Colors.DIM}({elapsed}){self.Colors.RESET}"
+            )
 
         if description:
             status_text += f" {self.Colors.DIM}{description}{self.Colors.RESET}"
 
         # Display the status
-        self._display(status_text, status_changed, is_terminal)
+        # LOG status should print a newline so it's not cleared
+        print_newline = is_terminal or is_log
+        self._display(status_text, status_changed, print_newline)
 
         self._line_written = True
 
-    def _display(self, text: str, status_changed: bool, is_terminal: bool):
+    def _display(self, text: str, status_changed: bool, print_newline: bool = False):
         """Display text, handling terminal vs notebook environments."""
         if __IPYTHON__:
-            self._display_notebook(text, status_changed, is_terminal)
+            self._display_notebook(text, status_changed, print_newline)
         else:
-            self._display_terminal(text, status_changed, is_terminal)
+            self._display_terminal(text, status_changed, print_newline)
 
-    def _display_terminal(self, text: str, status_changed: bool, is_terminal: bool):
+    def _display_terminal(
+        self, text: str, status_changed: bool, print_newline: bool = False
+    ):
         """Display in terminal with in-place updates."""
         # In verbose mode, print new line when status changes
         if self.verbose and status_changed and self._line_written:
@@ -187,7 +218,7 @@ def _display_terminal(self, text: str, status_changed: bool, is_terminal: bool):
 
         sys.stdout.write(text)
 
-        if is_terminal:
+        if print_newline:
             sys.stdout.write("\n")
 
         sys.stdout.flush()
@@ -245,7 +276,9 @@ def _ansi_to_html(self, text: str) -> str:
         result.append("</span>" * open_spans)
         return "".join(result)
 
-    def _display_notebook(self, text: str, status_changed: bool, is_terminal: bool):
+    def _display_notebook(
+        self, text: str, status_changed: bool, print_newline: bool = False
+    ):
         """Display in notebook using DisplayHandle for flicker-free updates."""
         from IPython.display import display, HTML
 
@@ -260,14 +293,14 @@ def _display_notebook(self, text: str, status_changed: bool, is_terminal: bool):
         elif self._display_handle is None:
             # First display
             self._display_handle = display(html_content, display_id=True)
+        elif print_newline:
+            # LOG status: create new display so it persists, then reset handle for next status
+            display(html_content)
+            self._display_handle = None
         else:
             # Update existing display in place (no flicker)
             self._display_handle.update(html_content)
 
-        if is_terminal:
-            # Reset for next job
-            self._display_handle = None
-
 
 class RemoteException(Exception):
     pass
@@ -376,9 +409,7 @@ def handle_response(
         self.job_status = response.status
 
         if response.status == ResponseModel.JobStatus.ERROR:
-            self.status_display.update(
-                response.id, response.status.name, response.description or ""
-            )
+            self.status_display.update(response.id, response.status.name, "")
             raise RemoteException(f"{response.description}\nRemote exception.")
 
         # Log response for user (skip STREAM status - it's internal)
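
Review note on the `remote.py` changes above: the sketch below reproduces the timer bookkeeping in isolation, as a reading aid rather than the library's code. `format_time` and `StatusTimers` are hypothetical standalone names, and `now` is passed in explicitly so the behavior can be checked without sleeping. The diff elides the minutes branch and the `hours` computation, so both are assumptions here.

```python
import time
from typing import Optional


def format_time(start_time: Optional[float], now: Optional[float] = None) -> str:
    """Format the time elapsed since start_time (hypothetical standalone helper)."""
    if start_time is None:
        return "0.0s"
    if now is None:
        now = time.time()
    elapsed = now - start_time
    if elapsed < 60:
        return f"{elapsed:.1f}s"
    elif elapsed < 3600:
        # Assumption: this branch is elided in the diff; "Xm Ys" is a guess.
        mins = int(elapsed // 60)
        secs = int(elapsed % 60)
        return f"{mins}m {secs}s"
    else:
        # Assumption: the hours computation is also elided in the diff.
        hours = int(elapsed // 3600)
        mins = int((elapsed % 3600) // 60)
        return f"{hours}h {mins}m"


class StatusTimers:
    """Hypothetical stand-in for the two timers kept by update()."""

    def __init__(self) -> None:
        self.job_start_time: Optional[float] = None
        self.status_start_time: Optional[float] = None
        self.last_status: Optional[str] = None

    def update(self, status_name: str, now: float) -> None:
        is_log = status_name == "LOG"
        # LOG never counts as a status change, so it cannot reset the status timer.
        status_changed = status_name != self.last_status and not is_log
        # The job timer starts while no non-LOG status has been stored yet.
        if self.last_status is None:
            self.job_start_time = now
        if status_changed:
            self.status_start_time = now
        # LOG is not stored, so the next refresh compares against the previous status.
        if not is_log:
            self.last_status = status_name


timers = StatusTimers()
timers.update("QUEUED", now=0.0)
timers.update("RUNNING", now=10.0)
timers.update("LOG", now=15.0)      # does not reset the RUNNING timer
timers.update("RUNNING", now=20.0)  # same status, timer keeps counting
print(format_time(timers.status_start_time, now=20.0))  # time in RUNNING
print(format_time(timers.job_start_time, now=20.0))     # total job time
```

With this split, a terminal status can report the total from `job_start_time` while active statuses keep reporting time in the current status, which matches the intent stated in the diff's comments.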
diff --git a/src/nnsight/intervention/envoy.py b/src/nnsight/intervention/envoy.py
@@ -673,7 +673,7 @@ def device(self) -> Optional[torch.device]:
         except:
             return None
         
-    property
+    @property
     def devices(self) -> Optional[set[torch.device]]:
         """
         Get the devices the module is on. Finds all parameters and return their devices.
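
Review note on the `envoy.py` fix above: a bare `property` line is an expression statement that Python evaluates and discards, so before this change `devices` was an ordinary method. The sketch below shows the difference with two hypothetical classes; plain strings stand in for `torch.device` so that it runs without PyTorch.

```python
class WithoutDecorator:
    property  # bare name: evaluated and discarded, decorates nothing

    def devices(self):
        return {"cpu"}


class WithDecorator:
    @property
    def devices(self):
        return {"cpu"}


plain = WithoutDecorator()
fixed = WithDecorator()

# Without the decorator, attribute access yields a bound method that must be called.
print(callable(plain.devices))  # True
print(plain.devices())

# With the decorator, attribute access yields the value itself.
print(callable(fixed.devices))  # False
print(fixed.devices)
```

Any caller that wrote `envoy.devices` without parentheses would previously have received a bound method instead of a set, assuming the rest of the method body is unchanged.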
diff --git a/src/nnsight/schema/config.py b/src/nnsight/schema/config.py
