Reading to multiple string types in ygm::io::line_parser #369
This PR adds a template parameter for the string type to use when reading individual lines within the `ygm::io::line_parser` and its derivatives.

This change allows UTF-8 files with non-ASCII characters to be read into a type such as `std::u32string` that can appropriately handle an extended character set. This has the advantage of correctly reporting the size of strings and of `operator[]` returning the correct characters. It has the disadvantages of requiring 4 bytes of storage for every character (including ASCII characters that could be stored in 1 byte) and of requiring the user to convert the strings before printing or writing to file (using `std::codecvt` or whatever replaces it since its deprecation).

Without this change, the `ygm::io::line_parser` is able to read all legal UTF-8 characters (from my testing using files here). However, lines with non-ASCII characters read into a `std::string` are treated as a collection of single-byte characters, giving incorrect sizes and incorrect character access.
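
To make the size and indexing difference concrete, here is a minimal standalone sketch that does not use YGM at all. The helper `utf8_to_utf32` is hypothetical (it is not part of `ygm::io::line_parser` or of this PR), and it assumes well-formed UTF-8 input: it does not validate continuation bytes or reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not part of YGM.
// Decodes UTF-8 bytes into UTF-32 code points, assuming well-formed input.
// An invalid lead byte or a truncated sequence becomes U+FFFD.
std::u32string utf8_to_utf32(const std::string& bytes) {
  std::u32string out;
  std::size_t i = 0;
  while (i < bytes.size()) {
    const unsigned char lead = static_cast<unsigned char>(bytes[i]);
    char32_t cp = 0xFFFD;  // replacement character by default
    std::size_t len = 1;   // number of bytes in this sequence

    if (lead < 0x80) {                 // 0xxxxxxx: ASCII
      cp = lead;
    } else if ((lead & 0xE0) == 0xC0) {  // 110xxxxx: 2-byte sequence
      cp = lead & 0x1F;
      len = 2;
    } else if ((lead & 0xF0) == 0xE0) {  // 1110xxxx: 3-byte sequence
      cp = lead & 0x0F;
      len = 3;
    } else if ((lead & 0xF8) == 0xF0) {  // 11110xxx: 4-byte sequence
      cp = lead & 0x07;
      len = 4;
    }

    if (i + len > bytes.size()) {  // sequence runs past the end of the input
      out.push_back(char32_t{0xFFFD});
      break;
    }

    // Fold in the low 6 bits of each continuation byte.
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
      cp = (cp << 6) | (cont & 0x3F);
    }

    out.push_back(cp);
    i += len;
  }
  return out;
}
```

For the UTF-8 bytes of "naïve" (`"na\xC3\xAFve"`), the `std::string` reports a size of 6 and index 2 holds the byte `0xC3`, while the decoded `std::u32string` reports a size of 5 and index 2 holds U+00EF. The `std::u32string` spends 4 bytes on each of those 5 characters, which is the storage cost noted above.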