
Analysis and documentation of tool name normalization - no collisions found#431

Closed
Copilot wants to merge 4 commits into main from copilot/fix-name-collision-normalization

Conversation

Contributor

Copilot AI commented Oct 22, 2025

Problem

Issue #57 raised concerns that tool name normalization in component2json could cause different WebAssembly component function identifiers to collide when converting to MCP tool names, specifically that example:foo and example/foo might both normalize to example_foo.

Analysis

After thorough investigation using valid WIT (WebAssembly Interface Type) files and the wasm-tools component command, I determined that the collision concern is theoretical and does not occur in practice with valid WIT component names.

Key Findings

Per the WIT specification:

  1. Valid WIT identifiers are strictly constrained:

    • Package names: namespace:package format (e.g., wasi:http)
    • Interface names: kebab-case only (e.g., types, my-interface)
    • Fully qualified: namespace:package/interface (e.g., wasi:http/types)
    • Labels can only contain: [a-z0-9-]
    • No underscores or dots in valid identifiers
  2. Current normalization preserves hyphens:

    • The normalization converts : and / to _
    • Hyphens (-) are preserved, and they are the main differentiator between valid WIT names
    • Different valid WIT structures produce different normalized names
  3. Tested collision scenarios:

    foo:bar/baz     → foo_bar_baz
    foo-bar:baz     → foo-bar_baz     ✅ Different!
    wasi:io/streams → wasi_io_streams
    wasi-io:streams → wasi-io_streams ✅ Different!
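
The mapping in the scenarios above can be sketched in a few lines of Rust. This is a minimal illustration, not the actual component2json code: the function name `normalize_tool_name` is hypothetical, and it assumes only `:` and `/` are rewritten, as described in item 2.

```rust
/// Hypothetical stand-in for the component2json normalization step.
/// Assumption: only `:` and `/` are rewritten to `_`; every other
/// character, including `-`, passes through unchanged.
pub fn normalize_tool_name(name: &str) -> String {
    name.chars()
        .map(|c| if c == ':' || c == '/' { '_' } else { c })
        .collect()
}
```

Because the hyphen is left alone, `foo:bar/baz` and `foo-bar:baz` keep different shapes after normalization, which is the property the scenarios rely on.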
    

Why No Collisions Occur

  • Wasmtime only provides valid Component Model names that follow the WIT spec
  • Invalid names (with underscores in wrong places) are rejected during component creation
  • The preserved hyphens ensure different valid structures don't collide
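
To make the "invalid names are rejected" point concrete, here is a rough sketch of a label check based on the character set quoted above. It is a simplification under stated assumptions: `is_simple_kebab_label` is a hypothetical helper, it is narrower than the real WIT grammar, and it is not how Wasmtime or wasm-tools actually validate names.

```rust
/// Simplified check for the label shape described above: lowercase ASCII
/// letters, digits, and single interior hyphens.
/// Assumption: this is narrower than the real WIT grammar and is only
/// meant to show why a string such as `wasi_http` is not a valid label.
pub fn is_simple_kebab_label(label: &str) -> bool {
    // The label must be non-empty and start with a lowercase letter.
    let starts_with_letter = label
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_lowercase());
    // Only [a-z0-9-] may appear anywhere in the label.
    let allowed_chars = label
        .chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-');
    // No trailing hyphen and no empty word between two hyphens.
    let well_formed_hyphens = !label.ends_with('-') && !label.contains("--");
    starts_with_letter && allowed_chars && well_formed_hyphens
}
```

Under this check `my-interface` passes, while `wasi_http` and `wasi.http` fail because `_` and `.` are outside the allowed set.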

Changes Made

  • Added test_no_collision_with_valid_wit_names test demonstrating no collisions occur with valid WIT component names
  • Created comprehensive documentation in docs/design/tool-name-normalization.md explaining:
    • WIT specification constraints
    • Collision analysis with test cases
    • Why the theoretical concern doesn't apply to real components
  • Removed incorrect CHANGELOG entry
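
For readers who want a feel for what a collision test of this kind checks, the following is a hedged sketch and not the test added in this PR. Both `normalize_tool_name` and `find_collisions` are hypothetical names, and the normalization is assumed to rewrite only `:` and `/`.

```rust
use std::collections::HashMap;

/// Hypothetical normalization: `:` and `/` become `_`, hyphens are kept.
fn normalize_tool_name(name: &str) -> String {
    name.chars()
        .map(|c| if c == ':' || c == '/' { '_' } else { c })
        .collect()
}

/// Groups the inputs by normalized name and returns only the groups that
/// contain more than one original name, sorted for stable output.
/// Assumption: the caller passes distinct input names.
pub fn find_collisions(names: &[&str]) -> Vec<(String, Vec<String>)> {
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    for name in names {
        groups
            .entry(normalize_tool_name(name))
            .or_default()
            .push(name.to_string());
    }
    let mut collisions: Vec<(String, Vec<String>)> = groups
        .into_iter()
        .filter(|(_, originals)| originals.len() > 1)
        .collect();
    collisions.sort();
    collisions
}
```

Feeding it the four valid names from the analysis yields an empty list, while a mix of `wasi:http` and the invalid string `wasi_http` yields one colliding group.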

Testing

  • ✅ New test validates that different valid WIT structures produce different normalized names
  • ✅ 41 of 43 existing component2json tests pass; the 2 failures are pre-existing and unrelated to this change
  • ✅ TypeScript compilation and linting pass

Conclusion

The current normalization strategy is correct and sufficient for all valid WIT component names. Issue #57 describes a theoretical concern that doesn't apply to real-world usage, as collisions only occur when comparing valid names to invalid strings that wasmtime would never provide.

Addresses #57

Original prompt

This section details the original issue you should resolve

<issue_title>Tool name normalization could lead to potential name collisions</issue_title>
<issue_description>#57 introduces normalization of Component function names to Tool names. It basically converts {package-name}/{interface-name}.{func-name} from component to MCP Tool name with a schema of ^[a-zA-Z0-9-_]. This normalization strategy can cause different original function names to map to the same normalized name. For example, example:foo and example/foo both normalize to example_foo.</issue_description>

<agent_instructions>You should first think about test cases where you can trigger the name collisions in the component2json name normalization method, and think of ways to address it.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes #61



Copilot AI and others added 2 commits October 22, 2025 04:55
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix potential name collisions in tool name normalization Fix tool name normalization to prevent collisions from different delimiters Oct 22, 2025
Copilot AI requested a review from Mossaka October 22, 2025 05:03
@Mossaka
Collaborator

Mossaka commented Oct 22, 2025

@copilot Do your test cases really make sense as valid Wasm Component and WIT IDL formats? Read @WebAssembly/component-model/files/design/mvp/WIT.md and @WebAssembly/component-model/files/design/mvp/CanonicalABI.md

Contributor Author

Copilot AI commented Oct 22, 2025

@copilot Do your test cases really make sense as valid Wasm Component and WIT IDL formats? Read @WebAssembly/component-model/files/design/mvp/WIT.md and @WebAssembly/component-model/files/design/mvp...

You're absolutely right - my test cases don't reflect valid WIT IDL formats. After reviewing the WIT specification, I understand that:

  1. Package names follow namespace:package format (e.g., wasi:http)
  2. Interface names are kebab-case identifiers (e.g., types, filesystem)
  3. Fully qualified names use / as the separator: namespace:package/interface

The test cases I created with "example/foo" and "example.foo" as package names, and "interface:name" as interface names are invalid according to the WIT spec.

Looking at the actual component exports in existing tests (e.g., "foo:foo/foo"), the only special characters that appear in valid WIT are:

  • : separates namespace from package
  • / separates package from interface
  • - for kebab-case within identifiers
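
The roles of the two separators can be illustrated with a small parser. This is only a sketch: `split_qualified_name` is a hypothetical helper, and it assumes the name carries no version suffix (such as `@0.2.0`) and no function name.

```rust
/// Splits `namespace:package/interface` into its three parts.
/// Returns `None` if either separator is missing or any part is empty.
/// Assumption: no version suffix (such as `@0.2.0`) is present.
pub fn split_qualified_name(name: &str) -> Option<(&str, &str, &str)> {
    // `:` separates the namespace from the package.
    let (namespace, rest) = name.split_once(':')?;
    // `/` separates the package from the interface.
    let (package, interface) = rest.split_once('/')?;
    if namespace.is_empty() || package.is_empty() || interface.is_empty() {
        return None;
    }
    Some((namespace, package, interface))
}
```

With this shape, `wasi:http/types` splits cleanly, whereas `example/foo` is rejected because it has no `:` before the `/`.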

The collision scenario described in issue #57 appears to be theoretical rather than based on real WIT structures. Should I revert these changes and close the issue as "won't fix", or would you like me to create a more targeted fix that only handles the actual valid WIT format?

@Mossaka
Collaborator

Mossaka commented Oct 22, 2025

@copilot Yes, I would like you to create more targeted test cases that try to find tool name collisions by constructing actual valid WIT files (but more complicated ones). The wasm-tools component command is your friend. Report any issues you find with WITs and any docs that need to be updated for other agents' knowledge.

After thorough analysis with valid WIT files, determined that:
- Valid WIT names cannot collide with current normalization
- Hyphens are preserved, preventing collisions
- Issue #57 was theoretical, not practical
- Added test demonstrating no collisions with valid names
- Added comprehensive documentation in docs/design/

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Contributor Author

Copilot AI commented Oct 22, 2025

@copilot Yes, I would like you to create more targeted test cases that try to find tool name collisions by constructing actual valid WIT files (but more complicated ones). The wasm-tools component comm...

I've completed a thorough analysis using valid WIT files and the wasm-tools component command. Here are the findings:

Result: No collisions occur with valid WIT names.

The key insight is that valid WIT identifiers can only contain [a-z0-9-], and our normalization preserves hyphens. This means different valid WIT structures produce different normalized names:

  • foo:bar/baz → foo_bar_baz
  • foo-bar:baz → foo-bar_baz ✅ Different
  • wasi:io/streams → wasi_io_streams
  • wasi-io:streams → wasi-io_streams ✅ Different

Collisions only occur when comparing a valid name (e.g., wasi:http) to an invalid string (e.g., wasi_http), but wasmtime only provides valid Component Model names.

Changes in commit 4ba2130:

  • Reverted the unnecessary -c-, -s-, -d- encoding
  • Added test_no_collision_with_valid_wit_names with documented test cases
  • Created docs/design/tool-name-normalization.md explaining the analysis for future reference

Issue #57 appears to be based on a theoretical concern that doesn't apply to real WIT components.

Copilot AI changed the title Fix tool name normalization to prevent collisions from different delimiters Analysis and documentation of tool name normalization - no collisions found Oct 22, 2025
@Mossaka Mossaka closed this Oct 23, 2025

Development

Successfully merging this pull request may close these issues.

Tool name normalization could lead to potential name collisions

2 participants