feat(funbox): added support for cyrillic and arabic charset (@m4dd0c) #6488
Arabic funbox config added
Continuous integration check(s) failed. Please review the failing check's logs and make the necessary changes. |
Also, about the names: "arabic" here feels too generic and misses the random-word vibe, unlike Before all of that, also wait for @Miodec to approve everything. |
Another idea: since these are nearly gibberish anyway, instead of adding new funboxes, shouldn't we make the gibberish mode support different languages? For example, selecting the gibberish funbox with English would generate Latin letters, with Arabic it would generate and show Arabic letters, and so on. |
I was thinking the same |
Great Idea actually. I'd love to work on it.
|
Hi @m4dd0c Very nice work! |
I checked, and they connect as they should. However, I noticed another thing: the Arabic range includes the extended Arabic letters, and those aren't typeable in the Arabic languages (arabic, Arabic_Egypt), which use standard characters only. The extended letters may be used in other languages (e.g., Persian, Kurdish, Urdu), so I think we should split the range into at least two, one standard range for the language and one general range, e.g.
standard_arabic: [ // this one without the extended letters
{ start: 1569, end: 1594 }, // U+0621–U+063A (ء to غ)
{ start: 1601, end: 1608 }, // U+0641–U+0648 (ف to و)
{ start: 1610, end: 1610 } // U+064A (ي)
],
arabic: { // this one with them
start: 1569, // ء (U+0621)
end: 1610 // ي (U+064A)
},
I also think some other languages may have the same problem, so taking the same approach (one range without the extended letters and one with them), or defining a range for each language separately, would be better, I guess. We can leave the rest and wait for native users to correct them. Or we can leave them general like that and wait for a native user of each language to notice and, if they feel the need to change or fix the range, do it themselves or open an issue, idk |
I intentionally kept the charset range minimal, since there are plenty of bloated letters that may cause issues while typing. Furthermore, if I go with the approach you mentioned, how would someone select between the standard and non-standard versions of different languages? What I have in mind is that we can have 2 funboxes,
Thoughts? |
I think Mio's idea was to not overcomplicate this feature. #6488 (comment) With Latin we use the most basic alphabet, a-z. For Italian, for example, we would have unused letters (like j and k) and would be missing letters (like è). Can we do the same for the other languages and find a minimal, common set that should be typeable? |
This PR currently supports only the minimal, common set range, as @Miodec suggested. |
No one will select anything, and we don't have to make two different modes. The languages will just use different charset names in the language files, e.g. the Arabic languages with standard letters (arabic, Arabic_Egypt) will put in the charset,
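To make the "charset name in the language file" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `LanguageFile` shape, the optional `charset` field, and the `charsetFor` helper are hypothetical and are not taken from this PR or from the project's actual language files.

```typescript
// Hypothetical charset names; the thread proposes many more.
type CharsetName = "latin" | "arabic" | "cyrillic";

// Hypothetical, trimmed-down shape of a language file.
interface LanguageFile {
  name: string;
  charset?: CharsetName; // assumed optional field, not confirmed by the PR
}

// Fall back to the basic a-z set when a language does not name a charset.
function charsetFor(language: LanguageFile): CharsetName {
  return language.charset !== undefined ? language.charset : "latin";
}

const arabicEgypt: LanguageFile = { name: "arabic_egypt", charset: "arabic" };
const english: LanguageFile = { name: "english" };

console.log(charsetFor(arabicEgypt)); // arabic
console.log(charsetFor(english)); // latin
```

With a lookup like this the user never picks a charset directly; the gibberish generator would simply follow whatever the selected language declares.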
Okay, that sounds like the best solution to avoid overcomplicating things any further. I did some searching, and these may be the best ranges we can go with for now, all without the extended letters, rare characters, or problematic symbols:
arabic: [
{ start: 1569, end: 1594 }, // U+0621–U+063A (ء to غ)
{ start: 1601, end: 1608 }, // U+0641–U+0648 (ف to و)
{ start: 1610, end: 1610 } // U+064A (ي)
],
latin: {
start: 97, // a (U+0061)
end: 122 // z (U+007A)
},
cyrillic: {
start: 1072, // а (U+0430)
end: 1103 // я (U+044F)
},
devanagari: [
{ start: 2309, end: 2361 }, // U+0905–U+0939 (अ to ह)
{ start: 2366, end: 2376 } // U+093E–U+0948 (vowel signs आ to ऐ)
],
geez: [
{ start: 4608, end: 4959 } // U+1200–U+135F (ሀ to ፟)
],
tamil: [
{ start: 2949, end: 3020 }, // U+0B85–U+0BBC (அ to ஔ)
{ start: 3006, end: 3028 } // U+0BBE–U+0BCC (vowel signs ா to ௌ)
],
telugu: [
{ start: 3077, end: 3148 }, // U+0C05–U+0C4C (అ to ౌ)
{ start: 3158, end: 3160 } // U+0C56–U+0C58 (additional vowels ౖ to ౘ)
],
bengali: [
{ start: 2437, end: 2489 }, // U+0985–U+09B9 (অ to হ)
{ start: 2494, end: 2508 } // U+09BE–U+09CC (vowel signs া to ৌ)
],
malayalam: [
{ start: 3333, end: 3396 }, // U+0D05–U+0D3C (അ to ഹ)
{ start: 3398, end: 3404 } // U+0D3E–U+0D44 (vowel signs ാ to ൄ)
],
kannada: [
{ start: 3205, end: 3268 }, // U+0C85–U+0CBC (ಅ to ಹ)
{ start: 3270, end: 3276 } // U+0CBE–U+0CC4 (vowel signs ಾ to ೄ)
],
burmese: [
{ start: 4096, end: 4138 } // U+1000–U+102A (က to ဪ)
],
tibetan: [
{ start: 3904, end: 3911 } // U+0F40–U+0F47 (ཀ to ཧ)
],
sinhala: [
{ start: 3461, end: 3516 }, // U+0D85–U+0DBC (අ to හ)
{ start: 3535, end: 3551 } // U+0DCF–U+0DDF (vowel signs to ෟ)
],
hebrew: {
start: 1488, // א (U+05D0)
end: 1514 // ת (U+05EA)
},
thai: [
{ start: 3585, end: 3631 } // U+0E01–U+0E2F (ก to ๏)
],
greek: {
start: 945, // α (U+03B1)
end: 969 // ω (U+03C9)
},
han: [
{ start: 19968, end: 27903 } // U+4E00–U+6CAF (common CJK ideographs)
],
hangul: {
start: 44032, // 가 (U+AC00)
end: 55203 // 힣 (U+D7A3)
},
khmer: [
{ start: 6016, end: 6067 } // U+1780–U+17B3 (ក to ឳ)
],
ol_chiki: [
{ start: 7258, end: 7293 } // U+1C5A–U+1C7D (ᱚ to ᱽ)
],
hiragana: {
start: 12353, // ぁ (U+3041)
end: 12438 // ゖ (U+3096)
},
katakana: {
start: 12449, // ァ (U+30A1)
end: 12538 // ヺ (U+30FA)
}
};
|
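Since a table like this pairs decimal bounds with code points written in comments, it may be worth cross-checking the two before merging. The sketch below is only an illustration: `DocumentedBound`, `parseNotation`, `toNotation`, and `findMismatches` are hypothetical helpers, and the sample entries are copied by hand from the ranges proposed above.

```typescript
// One documented bound: the decimal value used in code and the
// "U+XXXX" notation written in the comment next to it.
interface DocumentedBound {
  label: string;
  decimal: number;
  notation: string;
}

// Turn "U+0621" into 1569.
function parseNotation(notation: string): number {
  if (notation.slice(0, 2) !== "U+") {
    throw new Error("expected U+XXXX, got " + notation);
  }
  const value = parseInt(notation.slice(2), 16);
  if (isNaN(value)) {
    throw new Error("expected hex digits in " + notation);
  }
  return value;
}

// Turn 1569 into "U+0621" (at least four hex digits).
function toNotation(codePoint: number): string {
  let hex = codePoint.toString(16).toUpperCase();
  while (hex.length < 4) hex = "0" + hex;
  return "U+" + hex;
}

// Report every bound whose decimal value disagrees with its comment.
function findMismatches(bounds: DocumentedBound[]): string[] {
  const problems: string[] = [];
  for (const bound of bounds) {
    if (parseNotation(bound.notation) !== bound.decimal) {
      problems.push(
        bound.label + ": " + bound.decimal + " is " +
          toNotation(bound.decimal) + ", comment says " + bound.notation
      );
    }
  }
  return problems;
}

const sample: DocumentedBound[] = [
  { label: "arabic[0].start", decimal: 1569, notation: "U+0621" },
  { label: "hebrew.end", decimal: 1514, notation: "U+05EA" },
  { label: "han[0].end", decimal: 27903, notation: "U+6CAF" },
];

for (const problem of findMismatches(sample)) {
  console.log(problem);
}
// han[0].end: 27903 is U+6CFF, comment says U+6CAF
```

A mismatch does not say which side is wrong, only that the value and its comment disagree; someone familiar with the script still has to decide which bound was intended.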
Well, I like the idea. I'll make the proposed changes. I will update the |
Refactor charsetRanges to support multiple ranges per charset. Updated getGibberish to utilize the new structure for generating random gibberish strings. This improves flexibility and allows handling of complex charsets with multiple ranges. BREAKING CHANGE: charsetRanges structure has been modified to an array of ranges instead of a single range object. Update any dependent code accordingly.
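For readers following along, here is one way drawing a random character from a charset made of several ranges could look. This is a sketch under assumptions, not the code from this commit: `CodePointRange`, `totalSize`, `codePointAt`, `randomChar`, and `gibberishWord` are hypothetical names, and only the Arabic ranges are copied from the discussion above. It picks an index over the combined size of all ranges, so a one-character range is not over-represented compared with picking a range first and a character second.

```typescript
// Assumed shape: a charset is a list of inclusive code point ranges.
type CodePointRange = { start: number; end: number };

const arabic: CodePointRange[] = [
  { start: 1569, end: 1594 }, // U+0621–U+063A
  { start: 1601, end: 1608 }, // U+0641–U+0648
  { start: 1610, end: 1610 }, // U+064A
];

// Number of code points covered by all ranges together.
function totalSize(ranges: CodePointRange[]): number {
  return ranges.reduce((sum, r) => sum + (r.end - r.start + 1), 0);
}

// Map an index in [0, totalSize) to a code point by walking the ranges
// in order, so every code point is equally likely to be chosen.
function codePointAt(ranges: CodePointRange[], index: number): number {
  let remaining = index;
  for (const r of ranges) {
    const size = r.end - r.start + 1;
    if (remaining < size) return r.start + remaining;
    remaining -= size;
  }
  throw new RangeError("index " + index + " is outside the charset");
}

// rng must return a number in [0, 1), like Math.random.
function randomChar(ranges: CodePointRange[], rng: () => number = Math.random): string {
  const index = Math.floor(rng() * totalSize(ranges));
  // Every range proposed in this thread is below U+10000, so fromCharCode
  // is enough here; String.fromCodePoint would be needed beyond that.
  return String.fromCharCode(codePointAt(ranges, index));
}

function gibberishWord(
  ranges: CodePointRange[],
  length: number,
  rng: () => number = Math.random
): string {
  let word = "";
  for (let i = 0; i < length; i++) word += randomChar(ranges, rng);
  return word;
}

console.log(totalSize(arabic)); // 35
console.log(gibberishWord(arabic, 5));
```

Passing the random source as a parameter is only there to keep the sketch testable with a fixed sequence.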
Continuous integration check(s) failed. Please review the failing check's logs and make the necessary changes. |
Description
Added Arabic & Russian (Cyrillic) Funboxes
Added two new funboxes, Arabic & Russian, with gibberish word generators. Also added logic to automatically force the Arabic language if the Arabic funbox is active, to prevent config issues.
Changes
getArabic() and getRussian() utility functions.
funbox-functions.ts and list.ts as Metadata.
setLanguage() when needed.
FunboxName types.
Closes #6181
Let me know if any changes are required.
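The forced-language behaviour mentioned in the description could be sketched roughly as follows. This is an illustration only: `TypingConfig`, `forcedLanguage`, and `applyForcedLanguage` are hypothetical names, the config shape is assumed, and the project's real `setLanguage()` is not called here. Only the Arabic funbox is mapped, since that is the only forced language the description states.

```typescript
// Assumed, simplified config shape for illustration.
interface TypingConfig {
  language: string;
  funbox: string[];
}

// Hypothetical table of funboxes that require a specific language.
const forcedLanguage: { [funbox: string]: string | undefined } = {
  arabic: "arabic",
};

// Return a config whose language matches the first active funbox that
// forces one; return the input unchanged when nothing needs forcing.
function applyForcedLanguage(config: TypingConfig): TypingConfig {
  for (const name of config.funbox) {
    const language = forcedLanguage[name];
    if (language !== undefined && config.language !== language) {
      return { language: language, funbox: config.funbox };
    }
  }
  return config;
}

const before: TypingConfig = { language: "english", funbox: ["arabic"] };
console.log(applyForcedLanguage(before).language); // arabic
```

Returning a new object instead of mutating the input is a choice made for the sketch; the actual change may update the config in place.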