Suggestion: include standard Unicode code point names in sym.txt #6
Description
Many symbols in `sym.txt` are specified by their Unicode code point in the form `U+XXXX` rather than as a plain character, because the character itself would be hard to parse or notice when reading the file later. I believe using the Unicode-assigned name of such characters would be more useful and self-documenting than simply entering the code point.
Ideally, these names would be machine-checked in `build.rs` rather than just acting as informative comments, so that reviewers do not have to verify by hand that the right name is given for each character. The names could then also be used to look up the intended Unicode code point, entirely replacing the `U+` scalar reference.
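To make the idea concrete, here is a minimal sketch of what such a check might look like. Everything in it is an assumption for illustration: the helper names (`parse_codepoint`, `check_entry`, `lookup_by_name`, `sample_names`) are hypothetical, the two-entry table stands in for real name data (for example, the name field of the Unicode Character Database's `UnicodeData.txt`), and it does not assume anything about the actual `sym.txt` syntax or the existing `build.rs`.

```rust
use std::collections::HashMap;

/// Parse a `U+XXXX` reference into a `char`.
/// Returns `None` for malformed input, surrogates, and out-of-range values.
fn parse_codepoint(s: &str) -> Option<char> {
    let hex = s.strip_prefix("U+")?;
    let value = u32::from_str_radix(hex, 16).ok()?;
    char::from_u32(value)
}

/// Check that `claimed_name` matches the name recorded for `codepoint`.
/// On success, returns the character so the caller can keep using it.
fn check_entry(
    names: &HashMap<char, &str>,
    codepoint: &str,
    claimed_name: &str,
) -> Result<char, String> {
    let c = parse_codepoint(codepoint)
        .ok_or_else(|| format!("invalid code point: {codepoint}"))?;
    match names.get(&c) {
        Some(actual) if *actual == claimed_name => Ok(c),
        Some(actual) => Err(format!(
            "{codepoint} is named {actual}, not {claimed_name}"
        )),
        None => Err(format!("no name known for {codepoint}")),
    }
}

/// Reverse lookup: find the character for a given Unicode name.
/// This is what would allow the name to replace the `U+` reference.
fn lookup_by_name(names: &HashMap<char, &str>, name: &str) -> Option<char> {
    names.iter().find(|(_, n)| **n == name).map(|(c, _)| *c)
}

/// Hypothetical stand-in for real Unicode name data.
fn sample_names() -> HashMap<char, &'static str> {
    HashMap::from([
        ('\u{2192}', "RIGHTWARDS ARROW"),
        ('\u{00A0}', "NO-BREAK SPACE"),
    ])
}
```

With something along these lines, a build script could call `check_entry` for every entry that carries both a code point and a name and fail the build on the first `Err`, or call `lookup_by_name` alone if the name becomes the only reference.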
Either way, we could opt to include the names even for characters that are directly embedded in the txt files, just to have more context available when editing them (though this is more of a bonus/personal-preference change and should be discussed separately).