Thunder uses multiple relational databases as storage backends. To handle international characters correctly, most string columns should use a UTF-8 character set and collation.
However, not every column needs UTF-8. Columns that store identifiers, numeric strings, flags, or other ASCII-only values can be stored and compared correctly without it. Applying UTF-8 where it is not needed can increase index size and storage footprint, and may slow queries on indexed columns.
We need to define which columns must enforce UTF-8 collation and which can keep a simpler ASCII or database-default collation, balancing internationalization support against storage and performance costs.
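To make the storage trade-off concrete, here is a minimal Python sketch comparing the character count of a value, its actual UTF-8 byte length, and a worst-case byte budget. The helper `utf8_sizes` and the 4-bytes-per-character budget are assumptions for illustration: they model engines that size index keys by the maximum bytes per character, which may or may not match the databases Thunder targets.

```python
def utf8_sizes(value: str, max_bytes_per_char: int = 4) -> dict:
    """Return character count, actual UTF-8 bytes, and a worst-case byte budget.

    max_bytes_per_char is an assumed sizing rule, not a property of any
    specific Thunder backend.
    """
    return {
        "chars": len(value),
        "utf8_bytes": len(value.encode("utf-8")),
        "worst_case_bytes": len(value) * max_bytes_per_char,
    }


# Hypothetical sample values: an identifier, a flag, and user-facing text.
samples = ["user_01HZX3", "true", "Jos\u00e9", "\u6771\u4eac"]
for sample in samples:
    print(repr(sample), utf8_sizes(sample))
```

ASCII-only values take one byte per character in UTF-8, so the stored data itself does not grow. Under these assumptions the extra cost comes from the worst-case sizing and from Unicode-aware comparison rules, which is worth confirming per backend before deciding.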
Questions for Discussion
Which types of columns must have UTF-8 collation (e.g., user-facing text, names, email addresses)?
Are there columns where UTF-8 is unnecessary (e.g., internal IDs, boolean flags, numeric strings)?
What is the expected performance and storage impact of UTF-8 on indexes and large tables?
Should Thunder enforce a database-wide standard, or allow selective collation per column?
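One way to ground the first two questions in data is to sample existing values and flag the columns that have only ever held ASCII. The sketch below is a rough starting point, not a proposed implementation: `classify_columns`, the column names, and the sample values are all hypothetical, and a real audit would read from each backend.

```python
from typing import Dict, Iterable


def classify_columns(samples: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Label each column by whether all of its sampled values are ASCII-only."""
    result = {}
    for column, values in samples.items():
        # str.isascii() is True for empty strings and for pure-ASCII text.
        all_ascii = all(value.isascii() for value in values)
        result[column] = "ascii-candidate" if all_ascii else "needs-utf8"
    return result


# Hypothetical column samples; real data would come from the database.
sampled = {
    "user_id": ["a1b2c3", "d4e5f6"],
    "is_active": ["true", "false"],
    "display_name": ["Alice", "Zo\u00eb", "\u6771\u4eac"],
}
print(classify_columns(sampled))
```

A column that is ASCII-only today is only a candidate. Fields such as names or email addresses can legitimately receive non-ASCII input later, so the final decision should follow what the column is meant to hold rather than what it currently contains.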