In the JSON Schema specification, the `maxLength` keyword specifies the number of characters in a field, not bytes. However, in Redshift (as in most databases, I believe) the length of a `VARCHAR` is specified in bytes, which may introduce a mismatch in text fields that are usually written by humans rather than computers.
This is generally not a problem, as analytical data typically contains ASCII text (written by computers), where the number of bytes exactly matches the number of characters.
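To make the difference concrete, here is a minimal Python sketch. The sample strings and the `utf8_len` helper are illustrative assumptions, not part of any schema or table in this project. Python's `len` counts Unicode code points, which is how `maxLength` measures a string, while the encoded length is what a byte-based `VARCHAR` measures.

```python
# Character count (what JSON Schema maxLength measures) versus
# UTF-8 byte count (what a byte-based VARCHAR measures).
samples = ["London", "Москва", "東京"]  # illustrative values only


def utf8_len(value: str) -> int:
    """Return the number of bytes the string occupies in UTF-8."""
    return len(value.encode("utf-8"))


for s in samples:
    print(f"{s}: {len(s)} characters, {utf8_len(s)} bytes")
```

Only the ASCII sample has equal character and byte counts; the Cyrillic one takes two bytes per character and the Japanese one three.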
But at the same time, I can imagine an issue:
- A user is supposed to enter their city name in their native language (likely with non-ASCII characters)
- A web developer constrains the input field to 32 characters
- An analyst wrongly assumes that `maxLength: 32` is the correct constraint
- Redshift truncates all non-ASCII city names to 16 characters (assuming 2 bytes per character in UTF-8, as with Cyrillic)
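The scenario above can be sketched in Python. This only emulates a byte-limited column and is not Redshift's actual loading logic; `truncate_utf8` and `safe_varchar_bytes` are hypothetical helpers, and the factor of 4 is the UTF-8 worst case per character, used here as an assumption for sizing.

```python
def truncate_utf8(value: str, max_bytes: int) -> str:
    """Cut value to at most max_bytes UTF-8 bytes without splitting a character."""
    encoded = value.encode("utf-8")[:max_bytes]
    # errors="ignore" drops a partial multi-byte sequence left at the end.
    return encoded.decode("utf-8", errors="ignore")


def safe_varchar_bytes(max_length: int, bytes_per_char: int = 4) -> int:
    """Worst-case byte size for a string of max_length characters (assumed factor)."""
    return max_length * bytes_per_char


city = "Б" * 32                    # hypothetical 32-character Cyrillic input
stored = truncate_utf8(city, 32)   # emulates a 32-byte column

print(len(city), len(stored))
print(safe_varchar_bytes(32))
```

Sizing the column from the worst case avoids truncation at the cost of a wider declared length; a schema-level hint about the expected content could allow a tighter bound.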
This could be addressed as part of #170 (`format: "unicode"`, which specifies that the string has absolutely no structure) or a similar custom JSON Schema extension.