Adapters for Model explorer to visualize egraph and rvsdg #7

Merged
seibert merged 5 commits into numba:main from sklam:enh/model_explorer on Apr 30, 2025
Conversation

@sklam
Member

@sklam sklam commented Apr 28, 2025

@sklam sklam marked this pull request as ready for review April 29, 2025 17:58
@seibert
Contributor

seibert commented Apr 30, 2025

For some reason when I try to use this, I see that the model-explorer loads 10 adapters (the default 8 + the 2 newly registered adapters), but the new adapters appear as duplicates of the MLIR adapter:

Loaded 10 extensions:
 - TFLite adapter (Flatbuffer)
 - TFLite adapter (MLIR)
 - TF adapter (MLIR)
 - TF adapter (direct)
 - GraphDef adapter
 - Pytorch adapter (exported program)
 - MLIR adapter
 - MLIR adapter
 - MLIR adapter
 - JSON adapter

I'm currently trying to trace where this is happening.

@seibert
Contributor

seibert commented Apr 30, 2025

To clarify, I see the problem when I try to load the visualizer using the Python API. If I load it from the command line, I can specify the --extensions argument and everything looks fine.

@seibert
Copy link
Copy Markdown
Contributor

seibert commented Apr 30, 2025

OK, I think I finally figured out the root cause, which I'll brain dump here for posterity.

The extension mechanism of Model Explorer ("adapters" are a type of extension) relies on a temporary metaclass side effect to discover extension classes. To work correctly, the system requires that:

  1. Each extension class is a direct subclass of Extension or Adapter (Adapter is itself a subclass of Extension).
  2. Only one extension class is defined per Python module.
  3. The module that defines the extension class has not been imported into the interpreter before load_extensions() is called on the ExtensionManager singleton.

The third condition is what was failing for me when testing this PR. I haven't chased down the sequence that causes the rvsdg and egraph adapter Python modules to be imported early, but that's what is happening. The result is that those adapters fail to be registered properly (in fact, the code erroneously re-registers the last successful adapter 2 more times instead).
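A quick way to check the third condition is to look in sys.modules before the server starts. This is only a diagnostic sketch: preimported() is a hypothetical helper, not part of Model Explorer, and the module names below are placeholders.

```python
import sys

def preimported(module_names):
    """Return the names from module_names that are already in sys.modules."""
    return [name for name in module_names if name in sys.modules]

# "json" stands in for an extension module that something imported too early;
# the second name is a placeholder for a module that has not been imported.
import json

already_loaded = preimported(["json", "hypothetical_pkg_not_imported_xyz"])
print(already_loaded)
```

Any name this reports would be skipped by a later import, so its class body would not run again and the metaclass hook would not fire for it.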

Now you might wonder why this happens. The reason is that the Python extension discovery mechanism in Model Explorer is very "unusual":

  • A list of additional extensions (beyond the built-in ones) can be passed in when calling the model_explorer.visualize() function as a list of strings, each a fully-qualified module name.
  • This list of extension module paths is passed to the Model Explorer server instance, and ultimately to the ExtensionManager singleton class created when the server starts.
  • When the ExtensionManager is asked to load all the extensions, it steps through a list of all the built-in extension modules, followed by the user-provided additional extensions.
  • For each one, it tries to import the module with importlib and checks for an exception.
    • As the module is imported, the creation of the extension class in the interpreter triggers code in the ExtensionClassProcessor metaclass attached to the base Extension class.
    • This code captures the derived Extension class object (and some metadata) and saves it into a class variable on the metaclass itself.
    • These captured values persist in the metaclass until they are overwritten by the next imported module containing an extension.
  • Back in the ExtensionManager code, if no import exception occurs, the extension loader checks this class variable in the ExtensionClassProcessor metaclass to see what extension class was defined in the module and saves it into a cache that will be used later.
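The steps above can be reproduced with a toy model that uses only the standard library. This is a sketch under assumptions, not Model Explorer's actual code: the class names ExtensionClassProcessor and Extension mirror the ones mentioned in this thread, while last_class, load_extensions() as a free function, and the toy_* modules are invented for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Toy base module: a metaclass that remembers only the most recently
# defined subclass, overwriting the value each time.
BASE_SRC = """
class ExtensionClassProcessor(type):
    last_class = None

    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        if bases:  # skip the Extension base class itself
            ExtensionClassProcessor.last_class = cls
        return cls

class Extension(metaclass=ExtensionClassProcessor):
    pass
"""

def ext_src(class_name):
    """Source for a module that defines exactly one extension class."""
    return f"from toy_base import Extension\nclass {class_name}(Extension):\n    pass\n"

def load_extensions(module_names):
    """Import each module, then record whatever class the metaclass saw last."""
    from toy_base import ExtensionClassProcessor
    loaded = []
    for name in module_names:
        importlib.import_module(name)  # a no-op if the module is already cached
        loaded.append(ExtensionClassProcessor.last_class.__name__)
    return loaded

# Write the toy modules to a temporary directory and make them importable.
tmp = tempfile.mkdtemp()
Path(tmp, "toy_base.py").write_text(BASE_SRC)
for mod, cls in [("toy_mlir", "MlirAdapter"),
                 ("toy_rvsdg", "RvsdgAdapter"),
                 ("toy_egraph", "EgraphAdapter")]:
    Path(tmp, f"{mod}.py").write_text(ext_src(cls))
sys.path.insert(0, tmp)
importlib.invalidate_caches()

# Simulate the early import: two extension modules are loaded before discovery.
importlib.import_module("toy_rvsdg")
importlib.import_module("toy_egraph")

result = load_extensions(["toy_mlir", "toy_rvsdg", "toy_egraph"])
print(result)  # ['MlirAdapter', 'MlirAdapter', 'MlirAdapter']
```

Because the two early-imported modules are already in sys.modules, importing them again does not re-run their class definitions, so the loader reads the stale value left by the last fresh import. In this toy that yields the same kind of triplicate entry as the "MLIR adapter" lines in the log above.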

As might be obvious, this approach has a lot of fundamental problems. The indirection is really confusing, and the mechanism is extremely fragile. It breaks if there are multiple classes in one module or if a module is imported early. Some of this needs to be fixed in Model Explorer itself, but I'll see if we can work around this in this PR somehow.
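For comparison, here is one possible less fragile design, sketched purely as an illustration and not as an actual or proposed Model Explorer change. It keys a registry by the defining module via `__init_subclass__`, so a lookup does not depend on import order or on which class was defined last. The registry attribute, define_in_module(), and the demo_* module names are all hypothetical.

```python
import sys
import types

class Extension:
    # Maps defining module name -> list of extension classes defined there.
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Extension.registry.setdefault(cls.__module__, []).append(cls)

def define_in_module(module_name, class_name):
    """Simulate importing a module that defines one Extension subclass."""
    mod = types.ModuleType(module_name)
    mod.Extension = Extension
    # Executing in the module's namespace sets the class's __module__ to module_name.
    exec(f"class {class_name}(Extension):\n    pass\n", mod.__dict__)
    sys.modules[module_name] = mod

# The "early" import happens first; discovery still finds the right class later.
define_in_module("demo_rvsdg", "RvsdgAdapter")
define_in_module("demo_mlir", "MlirAdapter")

rvsdg_names = [c.__name__ for c in Extension.registry["demo_rvsdg"]]
mlir_names = [c.__name__ for c in Extension.registry["demo_mlir"]]
print(rvsdg_names, mlir_names)  # ['RvsdgAdapter'] ['MlirAdapter']
```

Since `__init_subclass__` also fires for indirect subclasses and the registry holds a list per module, this shape would relax the first two conditions as well.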

@seibert
Contributor

seibert commented Apr 30, 2025

OK, I believe the reason I tripped over this issue in the first place is that my test notebook imported code from the unit tests, which in turn imported the extension main modules directly. I'm going to merge this as-is, and we can try to upstream some fixes to Model Explorer.

@seibert seibert merged commit 9887efd into numba:main Apr 30, 2025
1 check passed
@sklam sklam deleted the enh/model_explorer branch June 13, 2025 19:25

2 participants