nbsp char in xml name allowed #578

@wangyoutian

1. Description

If a no-break space (NBSP, 0xA0) is appended to an element name, the input is parsed normally and no exception is thrown.

For example:

<constituent></constituent  >note in the end tag before this sentence, char 0xA0, not 0x20, is appended<constituent></constituent>

This is parsed as one "constituent" element, not two.
The problem is silently suppressed, which is not good: it is hard to debug because 0xA0 is visually indistinguishable from 0x20.
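For comparison, here is a minimal sketch of how a strict parser treats the same kind of end tag. It uses Python's standard-library `xml.etree.ElementTree`, not the parser this issue is about. The `<root>` wrapper and the helper name `is_rejected` are assumptions added only so the snippet is a single well-formed document when the plain space is used.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # no-break space, looks like U+0020 in most editors


def is_rejected(xml_text: str) -> bool:
    """Return True if the standard-library parser refuses the input."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return True
    return False


# Hypothetical <root> wrapper so each sample has exactly one root element.
with_space = "<root><constituent></constituent ></root>"
with_nbsp = f"<root><constituent></constituent{NBSP}></root>"

print(is_rejected(with_space))  # plain space before '>' is legal XML
print(is_rejected(with_nbsp))   # NBSP is neither whitespace nor a name character
```

The point of the comparison is only that an error is raised at the offending tag, so the stray 0xA0 is found immediately instead of silently changing the parsed structure.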

2. Expectation

Neither HTML nor XML allows such characters in element names. For HTML, see:

https://dev.w3.org/html5/spec-LC/syntax.html#:~:text=HTML%20elements%20all%20have%20names,005A%20LATIN%20CAPITAL%20LETTER%20Z.

For XML, see the NameStartChar production:

http://w3.org/TR/REC-xml/#NT-NameStartChar

If the parser silently accepts such characters, the issue is hard to pin down.
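As a rough illustration of the check being asked for, the sketch below validates a name against the NameStartChar and NameChar productions of XML 1.0 (Fifth Edition). The function name `is_valid_xml_name` is hypothetical, and this is not how the parser in this issue is implemented; it only shows that 0xA0 falls outside the allowed ranges.

```python
import re

# Ranges transcribed from the XML 1.0 (Fifth Edition) NameStartChar production.
_NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF"
    "\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar = NameStartChar plus "-", ".", digits, U+00B7 and two combining ranges.
_NAME_EXTRA = "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

_XML_NAME = re.compile(f"[{_NAME_START}][{_NAME_START}{_NAME_EXTRA}]*")


def is_valid_xml_name(name: str) -> bool:
    """Return True if the whole string matches the XML Name production."""
    return _XML_NAME.fullmatch(name) is not None


print(is_valid_xml_name("constituent"))
print(is_valid_xml_name("constituent\u00a0"))  # trailing NBSP
```

A parser that ran a check like this on every tag name could raise an exception that names the offending code point, which would make the invisible 0xA0 easy to locate.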

Solution?

Should the documentation explicitly state that such characters are allowed, or should the parser throw an exception?
