Undocumented breaking change in TextElementEnumerator.ElementIndex in .NET 5.0 #111510

Open
@danpere

Description

System.Globalization.TextElementEnumerator.ElementIndex behaves differently in .NET 5.0+ than in all other versions of .NET when the enumerator is obtained through StringInfo.GetTextElementEnumerator(string str, int index). In .NET 5.0+, ElementIndex starts counting from 0; in other versions, it starts counting from index (i.e., the indexes match those you'd get by enumerating the whole string). I think the latter behavior makes more sense, but since I can't find any indication that anyone else has noticed, restoring it probably isn't worth a second breaking change. The documentation should at least mention the change, though.

Reproduction Steps

In Visual Studio's C# interactive window:

> #reset 64
Resetting execution engine.
Loading context from 'CSharpInteractive.rsp'.
> var xs = System.Globalization.StringInfo.GetTextElementEnumerator("abc", 1); xs.MoveNext(); xs.ElementIndex
1
> #reset core
Resetting execution engine.
Loading context from 'CSharpInteractive.rsp'.
> var xs = System.Globalization.StringInfo.GetTextElementEnumerator("abc", 1); xs.MoveNext(); xs.ElementIndex
0

Expected behavior

The output of that code shouldn't depend on the version of .NET used. (Note that the behavior for certain strings is expected to differ due to different levels of Unicode support. But the example string is plain ASCII so that is not relevant here.)

Actual behavior

Output differs as described.

Regression?

This behavior appears to have changed in .NET 5.0.

Known Workarounds

My workaround, since I am calling this code from a .NET Standard library that could be run on different .NET versions, is to record the first .ElementIndex reported and use that to adjust the output.
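
For anyone hitting the same thing, here is a minimal sketch of that workaround. The class and method names (TextElementIndexes, EnumerateFrom) are hypothetical and not part of the BCL, and the sketch assumes the first text element returned always begins exactly at the requested start index, which is what both behaviors described above imply.

```csharp
using System.Collections.Generic;
using System.Globalization;

static class TextElementIndexes
{
    // Hypothetical helper: yields each text element together with its index
    // relative to the start of the whole string, on any .NET version.
    public static IEnumerable<(int Index, string Element)> EnumerateFrom(string str, int startIndex)
    {
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(str, startIndex);
        bool first = true;
        int offset = 0;
        while (e.MoveNext())
        {
            if (first)
            {
                // The first ElementIndex is startIndex on older runtimes and 0 on
                // .NET 5.0+, so the offset comes out as 0 or startIndex respectively.
                offset = startIndex - e.ElementIndex;
                first = false;
            }
            yield return (e.ElementIndex + offset, e.GetTextElement());
        }
    }
}
```

With this, TextElementIndexes.EnumerateFrom("abc", 1) should yield (1, "b") and then (2, "c") on either runtime, since the offset cancels out whichever starting convention is in effect.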

Configuration

$ dotnet --info
.NET SDK:
 Version:           9.0.102
 Commit:            cb83cd4923
 Workload version:  9.0.100-manifests.4a54b1a6
 MSBuild version:   17.12.18+ed8c6aec5

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.26100
 OS Platform: Windows
 RID:         win-x64


Host:
  Version:      9.0.1
  Architecture: x64
  Commit:       c8acea2262

Other information

The change appears to have been introduced in #328. That PR does not mention the change, which makes me think it was unintentional.
