Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Strings] Make string a subtype of ext, not any #7373

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

kripken
Copy link
Member

@kripken kripken commented Mar 13, 2025

StringLowering converts strings to externs, which makes sense as we lower
stringrefs to imported JS strings. For the reverse transform it is convenient
to just have strings be subtypes of ext, see #7370 - that makes it simple to
switch stringref to externref and vice versa.

@kripken kripken requested a review from tlively March 13, 2025 23:51
Comment on lines 449 to 451
// Bottom types already handled (including string as the switch value,
// which implies the other value is bottom, which again would have been
// handled).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the addition to this comment. string isn't a bottom type, so why is it included in the bottom types already having been handled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 399-400 swap to get the lower type as a. If a is string, then b is higher than a, which means (since string is the last of the non-bottom types in the list) that b is a bottom type. But we already handled bottom types before on 392-396.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I don't get that from reading the comment. Is there something special about string here that doesn't also apply to other types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just that strings are at the end (right before all the bottom types). I can add that to the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment improved.

@@ -949,8 +956,9 @@ std::optional<HeapType> HeapType::getSuperType() const {
case none:
case exn:
case noexn:
case string:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! This doesn't look like it was correct before, since it should have returned any. Do you know how this wasn't caught by tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test was just wrong, but it agreed with the code 😄

See line 1683 in the gtest modified in this PR.

Comment on lines -491 to -493
(any.convert_extern
(extern.convert_any
(string.const "string")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might still be useful to test any.convert_extern of a string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow - convert it from what to what?

In the new model a string is an externref like a JS object is an externref: it doesn't have a wasm representation under any.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any.convert_extern and extern.convert_any are infallible; all any and extern values, including strings, can be converted to the other representation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what is the "other representation" of string? I.e. what are you saying that any.convert_extern of a string should return - what actual type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an anyref. There's no more specific type and trying to cast it to anything else will fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope :) A good way to think about it is that the converted value has a hidden subtype of anyref. any is the most specific heap type we can name to describe the value, but it still isn't exactly any.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you are saying that Binaryen should create a value with that type. That means that Binaryen can see a value with that type. And that means that if Binaryen sees a ref.cast to (exact anyref) it should succeed on that value...?

Or are you saying Binaryen should internally create a subtype of anyref to avoid that issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the easiest fix in the interpreter will be to update the interpretation of casts to have them always fail if the cast target type is an exact reference to an abstract heap type. That seems simpler than actually realizing a hidden subtype of any in our type system.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might work in the interpreter but I think it is dangerous in the optimizer as well. Seems safer to just add an anystring type (or keep the existing one as that).

How about leaving this for later? We don't urgently need this functionality (no known codebase converts string to any, in practice).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I guess we can leave it for later. I need to update the interpretation of casts to handle exact types at some point anyway.

Comment on lines +239 to +240
(ref.null noextern) ;; The change from stringref to externref does not
;; cause problems here, nothing needs to change.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this unlock any potential code simplifications in StringLowering.cpp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered this too, but the answer is actually no. While we are doing something much simpler, logically - swapping a type for a supertype, rather than an entirely unrelated type - the actual mechanical change is identical. We just swap all appearances of type X for Y, and don't need to handle any subtyping.

@kripken
Copy link
Member Author

kripken commented Mar 18, 2025

We can no longer fuzz strings in v8, as we represent them in a nonstandard way, so I disabled that.

@kripken
Copy link
Member Author

kripken commented Mar 18, 2025

I added a readme mention with references to the relevant proposals in the last commit.

@kripken
Copy link
Member Author

kripken commented Mar 18, 2025

@tlively I am going to fix this fuzztest error:

https://github.com/WebAssembly/binaryen/actions/runs/13933139917/job/38994921125?pr=7373

but only because luckily it seems obvious what the issue is. In general, how are people intended to debug errors like that?

@tlively
Copy link
Member

tlively commented Mar 19, 2025

FUZZTEST_PRNG_SEED=(whatever was printed in the failing log) ./bin/binaryen-unittest --gtest_filter=TypeFuzzTest.TestHeapTypeRelationsFuzz should deterministically reproduce the failure.

@tlively
Copy link
Member

tlively commented Mar 19, 2025

The fact that this shows up in CI at all is a nice property of FuzzTest over our other fuzzer patterns. (Although of course we could run the other fuzzers for limited time on CI as well.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants