
Conversation

@mscuthbert
Contributor

@mscuthbert mscuthbert commented May 23, 2025

This updates the regular expression for the `<ending number="...">` attribute to allow multiple spaces after the comma: `<ending number="2,  3">`

The schema change is finished -- I want to integrate the doctools better into a testing ecosystem.

See #580

@mdgood

mdgood commented May 23, 2025

This is wrong. Tokens "have no internal sequences of two or more spaces." See the definition at https://www.w3.org/TR/xmlschema-2/#token.
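For readers following along, the whitespace processing that `xs:token` implies can be sketched in Python. This is an illustration only, not a schema validator: `normalize_token` is a hypothetical helper name, and the sketch assumes the `collapse` whitespace behavior that XML Schema Part 2 specifies for `xs:token`.

```python
import re


def normalize_token(value: str) -> str:
    """Approximate the whiteSpace="collapse" processing applied to xs:token.

    Hypothetical helper for illustration: tabs, line feeds, and carriage
    returns become spaces, runs of spaces collapse to a single space, and
    leading and trailing spaces are removed.
    """
    replaced = re.sub(r"[\t\n\r]", " ", value)
    collapsed = re.sub(r" +", " ", replaced)
    return collapsed.strip(" ")


# The value proposed in this PR collapses to the form the pattern already accepts.
print(repr(normalize_token("2,  3")))  # '2, 3'
# A value made only of spaces collapses to the empty string.
print(repr(normalize_token("   ")))    # ''
```

Under that assumption, a conforming parser never presents two consecutive spaces to the pattern, which is why a pattern that allows them cannot match anything new.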


@mdgood mdgood left a comment

Please close or delete this PR as it violates the definition of an XML Schema token.

@mscuthbert
Contributor Author

mscuthbert commented May 23, 2025

Hi Michael -- I'll do that, but can you explain why the first half of the regex makes sense then:

([ ]*)|([1-9][0-9]*(, ?[1-9][0-9]*)*)

As I now understand it, if "1,  2" is not a valid token, then neither is " ". As it stood, the first half validates against the pre-normalized value and the second half validates against the normalized value? That does not make sense to me.

If these regexes are applied after normalization, should it not be

()|([1-9][0-9]*(, ?[1-9][0-9]*)*)
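One way to compare the two patterns is to run both against raw and normalized values. The sketch below uses Python's `re.fullmatch` as a stand-in for XML Schema's implicitly anchored pattern matching, an assumption that is reasonable for this simple pattern but does not hold for XSD regular expressions in general. `collapse`, `matches`, and `patterns_agree_after_collapse` are hypothetical helper names.

```python
import re

OLD_PATTERN = r"([ ]*)|([1-9][0-9]*(, ?[1-9][0-9]*)*)"
NEW_PATTERN = r"()|([1-9][0-9]*(, ?[1-9][0-9]*)*)"

XML_WHITESPACE = " \t\n\r"


def collapse(value: str) -> str:
    """Approximate xs:token normalization (hypothetical helper)."""
    return " ".join(re.split(r"[ \t\n\r]+", value.strip(XML_WHITESPACE)))


def matches(pattern: str, value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


def patterns_agree_after_collapse(samples) -> bool:
    """Check that both patterns give the same verdict on normalized values."""
    return all(
        matches(OLD_PATTERN, collapse(s)) == matches(NEW_PATTERN, collapse(s))
        for s in samples
    )


samples = ["", " ", "   ", "1", "1,2", "1, 2", "1,  2", "0", "1, "]
for s in samples:
    raw = (matches(OLD_PATTERN, s), matches(NEW_PATTERN, s))
    norm = (matches(OLD_PATTERN, collapse(s)), matches(NEW_PATTERN, collapse(s)))
    print(f"{s!r:10} raw(old, new)={raw} normalized(old, new)={norm}")

print(patterns_agree_after_collapse(samples))
```

On raw input the two patterns disagree only for values made entirely of spaces. Once values are collapsed first, they give the same verdict on every sample here.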

@mdgood

mdgood commented May 23, 2025

That case is to cover the empty string. I don't remember why it was specified that way instead of what you propose - if that wasn't a valid schema regex, if there was a problem with parser software handling it, or I just used the first thing I thought of. It may not be ideal, but at least let's not make it any worse. 😀

@mscuthbert
Contributor Author

> That case is to cover the empty string. I don't remember why it was specified that way instead of what you propose - if that wasn't a valid schema regex, if there was a problem with parser software handling it, or I just used the first thing I thought of. It may not be ideal, but at least let's not make it any worse. 😀

Agreed -- at least you can see that I wanted to make both sides consistent. We'll go the opposite direction.

But this ends up not being a change requiring an XST, since the parser will trim it in 4.0 and in 4.1.

@mscuthbert
Contributor Author

The docs changes got caught up in the move from Windows line endings to Unix. Here are the relevant added lines in the docs:

<h3 id="values">Changed Attributes/Values</h3>
<ul>
<li>The <a href="../../musicxmlreference/data-types/ending-number/">ending-number</a> type used
    in the <code>number</code> attribute of the
    <a href="../../musicxml-reference/elements/ending/">&lt;ending&gt;</a>, previously had a regex
    that implied that empty spaces were different from the empty string.  This has been clarified.
</li>
</ul>


<h2 id="documentation">Documentation Changes</h2>
<li>The <a href="../../musicxmlreference/data-types/ending-number/">ending-number</a> type used
    in the <code>number</code> attribute of the
    <a href="../../musicxml-reference/elements/ending/">&lt;ending&gt;</a>, previously had documentation
    implying that empty spaces were different from the empty string.  This has been clarified.
</li>

</xs:annotation>
<xs:restriction base="xs:token">
-<xs:pattern value="([ ]*)|([1-9][0-9]*(, ?[1-9][0-9]*)*)"/>
+<xs:pattern value="()|([1-9][0-9]*(, ?[1-9][0-9]*)*)"/>
Contributor

I still think it makes sense to change ' ?' to '[ ]?' to increase readability.
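As a sanity check on the readability suggestion, the sketch below compares the two spellings over every short string built from a small alphabet. It again assumes Python's `re.fullmatch` behaves like the schema pattern for this expression; `same_language` and the constant names are hypothetical.

```python
import re
from itertools import product

WITH_BARE_SPACE = r"[1-9][0-9]*(, ?[1-9][0-9]*)*"
WITH_CHAR_CLASS = r"[1-9][0-9]*(,[ ]?[1-9][0-9]*)*"


def same_language(max_len: int = 5, alphabet: str = "10, ") -> bool:
    """Return True if both patterns accept exactly the same strings
    among all strings over `alphabet` up to `max_len` characters."""
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            bare = re.fullmatch(WITH_BARE_SPACE, candidate) is not None
            bracketed = re.fullmatch(WITH_CHAR_CLASS, candidate) is not None
            if bare != bracketed:
                return False
    return True


print(same_language())  # True
```

The bracketed form changes nothing about what is accepted; it only makes the optional space visible to a reader.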

<li>The <a href="../../musicxmlreference/data-types/ending-number/">ending-number</a> type used
in the <code>number</code> attribute of the
<a href="../../musicxml-reference/elements/ending/">&lt;ending&gt;</a>, previously had a regex
that implied that empty spaces were different from the empty string. This has been clarified.
Contributor

'Empty spaces' sounds like a pleonasm 🙂

I suggest replacing this with 'a sequence of consecutive space characters' or something like that.

@mdgood

mdgood commented Jun 9, 2025

I still don't think we should make this change as I don't see it adding value, only risk. It doesn't increase the power of what MusicXML can represent and this hasn't been a significant point of confusion in 17 years. Any change requires testing to make sure it works and documentation to explain the change. The cost isn't worth it.

@mscuthbert
Contributor Author

Agreeing with Michael -- not sufficient improvement to make a change that could break some (non-conforming) code out there. Closing.

@mscuthbert mscuthbert closed this Aug 4, 2025

4 participants