@Tacodiva
Member

Adds the ability for the compiler to track whether a string contains uppercase or lowercase characters, and uses that to reduce conversions between uppercase and lowercase strings.

I haven't found any real-world benchmark speedups from this yet, but it does make the compiler produce cleaner code.

@nimeratus

There are some characters where converting to upper case before comparison gives a different result from converting to lower case. For example, the German letter ß (\u00df) doesn't convert back into itself if you uppercase and then lowercase it.
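A minimal sketch of the ß case, assuming the compiled code ends up relying on JavaScript's built-in `toUpperCase()` and `toLowerCase()`. The helpers `equalViaUpper` and `equalViaLower` are hypothetical names used only for illustration; they are not taken from this PR.

```javascript
// Hypothetical helpers: two ways to compare strings case-insensitively.
const equalViaUpper = (a, b) => a.toUpperCase() === b.toUpperCase();
const equalViaLower = (a, b) => a.toLowerCase() === b.toLowerCase();

const sharpS = "\u00df"; // ß

// Uppercasing expands ß to two characters, so the round trip loses it.
const upper = sharpS.toUpperCase();    // "SS"
const roundTrip = upper.toLowerCase(); // "ss", not "ß"

// The two comparison strategies disagree for this input.
const viaUpper = equalViaUpper(sharpS, "ss"); // true
const viaLower = equalViaLower(sharpS, "ss"); // false

console.log({ upper, roundTrip, viaUpper, viaLower });
```

So a rewrite that swaps an uppercase-based comparison for a lowercase-based one (or the reverse) would probably need to exclude strings that may contain characters like this.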

Also, some characters that are encoded as a surrogate pair change when lowercased, which means that joining two strings that are each lowercasing-invariant can produce a string that changes when lowercased. Example: 𐐀 (\u{10400} = \uD801\uDC00)
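A short sketch of the surrogate pair case, again assuming JavaScript's built-in `toLowerCase()`. `isLowerInvariant` is a hypothetical helper standing in for whatever property the compiler would track; it is not a function from this PR.

```javascript
// Hypothetical check: is the string unchanged by lowercasing?
const isLowerInvariant = (s) => s.toLowerCase() === s;

const high = "\uD801"; // lone high surrogate
const low = "\uDC00";  // lone low surrogate

// Each half on its own has no case mapping.
const highInvariant = isLowerInvariant(high);
const lowInvariant = isLowerInvariant(low);

// Joined, the halves form U+10400, which has a lowercase form (U+10428).
const joined = high + low;
const joinedInvariant = isLowerInvariant(joined);
const lowered = joined.toLowerCase();

console.log({ highInvariant, lowInvariant, joinedInvariant });
```

This suggests that "no uppercase characters" cannot simply be propagated through a join by and-ing the property of both operands, at least not without handling strings that start or end with a lone surrogate.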
