Description
This is a follow-up from a comment I made in #823; I created this separate issue to better discuss the plans for such an interface and determine what would be the best course of action.
I am proposing a localisation interface as the most natural way to implement custom localisation and internationalisation behaviour.
I've created several sections for potential functionality.
Translating strings.
Right now this is handled by the system interface, via TranslateString. This would be moved to the localisation interface. Obviously, this would be a breaking change, so I assume we'd have to wait for the next major release if this is the direction we go with.
Case-transforming strings.
These methods would be called to transform text with the text-transform property. I suggest the following signature:
virtual void TransformString(String& string, Style::TextTransform transform, const String& language);
where string is the string to be transformed in-place, transform is the value of the text-transform property, and language is the element's language (set by the lang attribute). I am suggesting in-place modification mainly for efficiency.
The current implementation transforms strings in-place, which works fine for ASCII. However, some languages have special case rules depending on a letter's position in a string and what letters it is adjacent to (examples include the final sigma in Greek and IJ at the start of words in Dutch), so passing the whole string will be necessary in such cases.
Alternatively, we could make the signature similar to TranslateString's for consistency, where it is given both an input string and an output string (at the cost of some performance).
Also, after looking through the code, it seems that the capitalize value isn't actually implemented, and is synonymous with none; I'll get around to fixing that later.
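To make the proposed signature concrete, here is a minimal ASCII-only sketch of what a default TransformString might look like, including a naive capitalize. It is only an illustration under stated assumptions: the TextTransform enum below is a hypothetical stand-in for Style::TextTransform, std::string stands in for String, and the language argument is ignored, which is exactly where a Unicode-aware backend would differ.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for Style::TextTransform (assumption, not the real enum).
enum class TextTransform { None, Capitalize, Uppercase, Lowercase };

// ASCII-only, in-place transform. Bytes >= 128 (e.g. UTF-8 multi-byte sequences)
// are left untouched. `language` is accepted but ignored in this sketch; a
// Unicode-aware backend would use it for locale-specific casing rules.
void TransformString(std::string& string, TextTransform transform, const std::string& /*language*/)
{
    bool at_word_start = true;
    for (char& c : string)
    {
        const unsigned char uc = static_cast<unsigned char>(c);
        const bool is_ascii = uc < 128;
        switch (transform)
        {
        case TextTransform::Uppercase:
            if (is_ascii) c = static_cast<char>(std::toupper(uc));
            break;
        case TextTransform::Lowercase:
            if (is_ascii) c = static_cast<char>(std::tolower(uc));
            break;
        case TextTransform::Capitalize:
            // Naive rule: uppercase the first character after ASCII whitespace.
            if (is_ascii && at_word_start) c = static_cast<char>(std::toupper(uc));
            break;
        case TextTransform::None:
            break;
        }
        at_word_start = is_ascii && std::isspace(uc) != 0;
    }
}
```

Because the function sees the whole string, a backend overriding it could apply position-dependent rules such as the Greek final sigma without any change to the signature.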
Bidirectional text segmentation.
This has been partially discussed before; the method would be given a string and segment it based on text-flow direction. Perhaps it could be given an iterator that represents the beginning and return an iterator corresponding to the end of the segment (along with other information about the segment's properties)?
Another thing to handle with direction segmenting is mirrored characters (for example, swapping ( and ) in certain places).
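The iterator-based idea could be sketched roughly as follows. Everything here is hypothetical naming (Direction, DirectionRun, NextDirectionRun), and the classification is a toy: only the Hebrew and Arabic blocks count as right-to-left, and neutral characters such as spaces are treated as left-to-right. A real implementation would follow the Unicode Bidirectional Algorithm and would also handle mirroring, which this sketch does not attempt.

```cpp
#include <string>

enum class Direction { LeftToRight, RightToLeft };

// Toy classification (assumption): U+0590..U+06FF (Hebrew and Arabic blocks)
// is right-to-left; everything else, including neutrals, is left-to-right.
inline Direction DirectionOf(char32_t c)
{
    return (c >= 0x0590 && c <= 0x06FF) ? Direction::RightToLeft : Direction::LeftToRight;
}

// Result of one segmentation step: where the run ends and which way it flows.
struct DirectionRun {
    std::u32string::const_iterator end;
    Direction direction;
};

// Returns the run of uniform direction starting at `begin`.
// For an empty range, returns `end` with a left-to-right direction.
DirectionRun NextDirectionRun(std::u32string::const_iterator begin, std::u32string::const_iterator end)
{
    if (begin == end)
        return {end, Direction::LeftToRight};

    const Direction direction = DirectionOf(*begin);
    auto it = begin + 1;
    while (it != end && DirectionOf(*it) == direction)
        ++it;
    return {it, direction};
}
```

A caller would loop, feeding the returned end iterator back in as the next begin until it reaches the end of the string.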
Word breaking.
Pretty self-explanatory. Basically splitting words better than only with whitespace (for example, by punctuation). Useful for text layout and determining where text can be wrapped.
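As a rough illustration of "better than whitespace only", the sketch below splits on ASCII whitespace and ASCII punctuation. SplitWords is a hypothetical helper name, not an existing function, and the rule is deliberately crude: it would split "don't" at the apostrophe, which is the kind of case a proper Unicode word-breaking implementation handles.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits `text` into words, treating ASCII whitespace and ASCII punctuation
// as separators. Bytes >= 128 are always kept as part of the current word.
std::vector<std::string> SplitWords(const std::string& text)
{
    std::vector<std::string> words;
    std::string current;
    for (char c : text)
    {
        const unsigned char uc = static_cast<unsigned char>(c);
        const bool is_separator = uc < 128 && (std::isspace(uc) != 0 || std::ispunct(uc) != 0);
        if (is_separator)
        {
            // Close the word in progress, if any; consecutive separators collapse.
            if (!current.empty())
            {
                words.push_back(current);
                current.clear();
            }
        }
        else
        {
            current += c;
        }
    }
    if (!current.empty())
        words.push_back(current);
    return words;
}
```

For text layout, the interface would more likely report break positions than copy out substrings, but the separator rule would be the same.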
Grapheme breaking.
Similar to word breaking, but for detecting which codepoint clusters correspond to a single visible glyph. Would be useful for proper text highlighting.
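A toy version of grapheme breaking might look like the following. It assumes the text is already decoded to code points and only recognises the Combining Diacritical Marks block (U+0300 to U+036F) as cluster extenders; GraphemeStarts and IsCombiningMark are hypothetical names. The full rules in Unicode's text segmentation annex cover far more (emoji sequences, Hangul, and so on), so treat this purely as a sketch of the shape of the problem.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy rule (assumption): only U+0300..U+036F attaches to the preceding code point.
inline bool IsCombiningMark(char32_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

// Returns the index of the first code point of each cluster in `text`.
std::vector<std::size_t> GraphemeStarts(const std::u32string& text)
{
    std::vector<std::size_t> starts;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        // A new cluster begins at the start of the text and at every
        // code point that is not a combining mark.
        if (i == 0 || !IsCombiningMark(text[i]))
            starts.push_back(i);
    }
    return starts;
}
```

With boundaries like these, text highlighting could snap a selection to whole clusters instead of splitting a base letter from its accent.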
Codepoint properties.
Some extra methods for determining the property of a codepoint. For example, if a character is whitespace, uppercase, number, et cetera.
Example method:
virtual bool IsWhitespace(Character character);
To begin, we can create an implementation with basic functionality, and implement the other, more-advanced features at a later date. I've had a look through the code, and it shouldn't be too difficult to create an implementation that handles some of the above features. Here's a sample of what could be implemented to start:
class RMLUICORE_API LocalizationInterface : public NonCopyMoveable {
public:
    LocalizationInterface();
    virtual ~LocalizationInterface();
    virtual int TranslateString(String& translated, const String& input);
    virtual void TransformString(String& string, Style::TextTransform transform, const String& language);
    virtual bool IsWhitespace(Character character);
};
The default interface would assume that every string is ASCII-encoded. A possible stretch goal would be to eventually include a localisation backend that uses ICU or a similar Unicode algorithms library.
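For the ASCII-only default, IsWhitespace could be as small as the sketch below. The Character enum here is a stand-in declared only so the snippet compiles on its own; the set of characters treated as whitespace is my assumption (the six ASCII whitespace characters), not something decided in this issue.

```cpp
// Stand-in for the library's Character type (assumption for this sketch).
enum class Character : char32_t { Null = 0 };

// ASCII-only default: true for space, tab, newline, vertical tab,
// form feed, and carriage return; false for everything else,
// including Unicode spaces such as U+00A0.
bool IsWhitespace(Character character)
{
    switch (static_cast<char32_t>(character))
    {
    case U' ':
    case U'\t':
    case U'\n':
    case U'\v':
    case U'\f':
    case U'\r':
        return true;
    default:
        return false;
    }
}
```

An ICU-backed implementation would override this to consult the Unicode White_Space property instead.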
Anyway, these were just my thoughts for improving localisation/internationalisation in this project. I created this issue so that we can discuss improvements, suggestions, and nuances of this potential new interface API.