[Project Fluent] Basic internationalization support (would close PR #2553) #2674
Open: philtweir wants to merge 16 commits into atuinsh:main from philtweir:feature/i18n-support-fluent-machinery
Commits
28623dd feat(i18n): support for project fluent (philtweir)
463e7a2 fix: set GB to ensure the canary message gets some more visibility on… (philtweir)
7263a80 fix: ignoring i18n-embed issue, as does not affect usage here - resto… (philtweir)
6bed86d chore: update cargo (philtweir)
74ede39 chore: resolve clippy complaints (philtweir)
423fff8 chore: resolve clippy and fmt complaints (philtweir)
52f1f4d fix: remove unnecessary components from i18n-embed (philtweir)
29abd4d feat(i18n): add an irish translation to canary as a locale that does … (philtweir)
18f6283 docs: add explanation of macro (philtweir)
ed39268 tests(i18n): add tests for translation strings (philtweir)
7d7525e chore: resolve clippy and fmt complaints (philtweir)
501a914 tests(i18n): try fluent selectors also (philtweir)
6bc57ad tests(i18n): improve naming (philtweir)
f164ba5 chore: resolve clippy and fmt complaints (philtweir)
26e64b9 tests(i18n): extend fluent selectors (philtweir)
7829f1e chore: resolve clippy and fmt complaints (philtweir)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# (Required) The language identifier of the language used in the | ||
# source code for gettext system, and the primary fallback language | ||
# (for which all strings must be present) when using the fluent | ||
# system. | ||
fallback_language = "en-GB" | ||
|
||
# Use the fluent localization system. | ||
[fluent] | ||
domain = "atuin" | ||
# (Required) The path to the assets directory. | ||
# The paths inside the assets directory should be structured like so: | ||
# `assets_dir/{language}/{domain}.ftl` | ||
assets_dir = "../../i18n" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
use i18n_embed::{ | ||
DesktopLanguageRequester, | ||
fluent::{FluentLanguageLoader, fluent_language_loader}, | ||
}; | ||
pub use i18n_embed_fl::fl; | ||
use rust_embed::RustEmbed; | ||
|
||
#[derive(RustEmbed)] | ||
#[folder = "../../i18n"] // path to the compiled localization resources | ||
struct Localizations; | ||
|
||
pub use atuin_macro::tl; | ||
use lazy_static::lazy_static; | ||
|
||
lazy_static! { | ||
// We assume that one LOADER is sufficient. Fluent provides more | ||
// flexibility, but for now, this simplifies integration. | ||
pub static ref LOADER: FluentLanguageLoader = { | ||
// Load languages from central internationalization folder. | ||
let language_loader: FluentLanguageLoader = fluent_language_loader!(); | ||
let requested_languages = DesktopLanguageRequester::requested_languages(); | ||
|
||
let _result = i18n_embed::select( | ||
&language_loader, &Localizations, &requested_languages); | ||
language_loader | ||
}; | ||
} | ||
|
||
#[macro_export] | ||
macro_rules! t { | ||
// Case that t!("foo bar") is called with no runtime parameters to interpolate. | ||
($message_id:literal) => { | ||
$crate::i18n::tl!($crate::i18n::fl, $crate::i18n::LOADER, $message_id) | ||
}; | ||
|
||
// Case that t!("foo %{bar}", bar=baz.to_string()) is called with runtime parameters to interpolate. | ||
($message_id:literal, $($args:expr),*) => {{ | ||
$crate::i18n::tl!($crate::i18n::fl, $crate::i18n::LOADER, $message_id, $($args), *) | ||
}}; | ||
} | ||
|
||
pub use t; |
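The two arms of `t!` above (a bare message, and a message followed by `name = value` arguments) can be illustrated with a std-only sketch. Everything here is hypothetical and for illustration only: `Catalog` stands in for the Fluent loader, `t_sketch!` is not the PR's macro, and the message IDs are passed as ready-made slugs rather than being slugified at compile time.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Fluent loader: maps message IDs to templates.
pub struct Catalog {
    messages: HashMap<&'static str, &'static str>,
}

impl Catalog {
    pub fn new(entries: &[(&'static str, &'static str)]) -> Self {
        Catalog {
            messages: entries.iter().copied().collect(),
        }
    }

    /// Look up a message and substitute `{ $name }` placeables.
    /// Falls back to the ID itself when the message is missing
    /// (an assumption of this sketch, not Fluent's behaviour).
    pub fn get(&self, id: &str, args: &[(&str, String)]) -> String {
        let mut out = match self.messages.get(id) {
            Some(template) => template.to_string(),
            None => id.to_string(),
        };
        for (name, value) in args {
            let placeable = format!("{{ ${} }}", name);
            out = out.replace(&placeable, value);
        }
        out
    }
}

/// Illustrative two-arm macro mirroring the shape of `t!`.
macro_rules! t_sketch {
    // No runtime parameters: a plain lookup.
    ($catalog:expr, $id:literal) => {
        $catalog.get($id, &[])
    };
    // One or more `name = value` parameters to interpolate at runtime.
    ($catalog:expr, $id:literal, $($name:ident = $value:expr),+ $(,)?) => {
        $catalog.get($id, &[$((stringify!($name), $value.to_string())),+])
    };
}
```

Unlike this sketch, the real `t!` forwards to `tl!` with the shared `LOADER`, so call sites never pass a loader explicitly.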
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -54,6 +54,7 @@ macro_rules! new_uuid { | |
} | ||
|
||
pub mod api; | ||
pub mod i18n; | ||
pub mod record; | ||
pub mod shell; | ||
pub mod utils; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
[package] | ||
name = "atuin-macro" | ||
edition = "2021" | ||
description = "macro library for atuin" | ||
|
||
rust-version = { workspace = true } | ||
version = { workspace = true } | ||
authors = { workspace = true } | ||
license = { workspace = true } | ||
homepage = { workspace = true } | ||
repository = { workspace = true } | ||
|
||
[lib] | ||
proc-macro = true | ||
|
||
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html | ||
|
||
[dependencies] | ||
time = { workspace = true } | ||
serde = { workspace = true } | ||
uuid = { workspace = true } | ||
typed-builder = { workspace = true } | ||
eyre = { workspace = true } | ||
sqlx = { workspace = true } | ||
semver = { workspace = true } | ||
thiserror = { workspace = true } | ||
directories = { workspace = true } | ||
sysinfo = "0.30.7" | ||
base64 = { workspace = true } | ||
getrandom = "0.2" | ||
sys-locale = "0.3.2" | ||
|
||
lazy_static = "1.4.0" | ||
i18n-embed = { version = "0.15.3", features = ["fluent", "fluent-system", "tr", "locale_config", "desktop-requester", "walkdir", "filesystem-assets"] } | ||
rust-embed = "8" | ||
i18n-embed-fl = "0.9.3" | ||
slugify = "0.1.0" | ||
paste = "1.0.15" | ||
syn = "2.0.96" | ||
quote = "1.0.38" | ||
proc-macro2 = "1.0.93" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# (Required) The language identifier of the language used in the | ||
# source code for gettext system, and the primary fallback language | ||
# (for which all strings must be present) when using the fluent | ||
# system. | ||
fallback_language = "en-GB" | ||
|
||
# Use the fluent localization system. | ||
[fluent] | ||
domain = "atuin" | ||
# (Required) The path to the assets directory. | ||
# The paths inside the assets directory should be structured like so: | ||
# `assets_dir/{language}/{domain}.ftl` | ||
assets_dir = "./tests/i18n" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
#![forbid(unsafe_code)] | ||
extern crate proc_macro; | ||
|
||
use proc_macro::TokenStream; | ||
use proc_macro2::TokenStream as TokenStream2; | ||
use quote::quote; | ||
use slugify::slugify; | ||
use syn::parse::Parser; | ||
|
||
fn literal_to_slug(literal: &syn::ExprLit) -> TokenStream2 { | ||
// We pull out the actual text from the literal string expression. | ||
let quoted: String = match &literal.lit { | ||
syn::Lit::Str(message_id) => message_id.value(), | ||
_ => panic!("Message ID must be a literal string"), | ||
}; | ||
// ...and pass it to slugify, | ||
let slug = slugify!(quoted.as_str()); | ||
// ...then turn it back into a literal string. | ||
quote!(#slug) | ||
} | ||
|
||
#[proc_macro] | ||
pub fn tl(tokens: TokenStream) -> TokenStream { | ||
// Begin by getting the individual arguments to tl! | ||
let args = syn::punctuated::Punctuated::<syn::Expr, syn::Token![,]>::parse_terminated | ||
.parse(tokens) | ||
.unwrap(); | ||
|
||
let mut arg_iter = args.iter(); | ||
|
||
// The first should always be the fl! macro for fluent (cf. atuin-common/src/i18n.rs) | ||
let fl = arg_iter.next().unwrap(); | ||
// atuin-common will send the universal loader as the second argument. This avoids | ||
// every translation string having to explicitly pass it. | ||
let loader = arg_iter.next().unwrap(); | ||
|
||
// The third argument should be the message ID. This logic takes the human-readable | ||
// string and slugifies it. One of the main benefits of Fluent is that English-language | ||
// ASCII is not the de facto reference (and things like gender and plurality can be | ||
// encoded even where English makes no grammatical distinction). Deriving IDs from | ||
// English strings gives up some of that benefit, but the `fl!` macro can still be | ||
// used directly, and we avoid having to switch every string in the codebase to a | ||
// slug just to make it translatable at all. | ||
|
||
// It is possible that the string literal representing the message (e.g. "Danger, Bill Bobinson") | ||
// appears wrapped in a group or not, so we handle both possibilities. | ||
// We use literal_to_slug to turn it to a slug, e.g. "danger-bill-bobinson" | ||
let message_id: proc_macro2::TokenStream = match arg_iter.next() { | ||
Some(syn::Expr::Group(arg)) => match *arg.expr.clone() { | ||
syn::Expr::Lit(arg) => literal_to_slug(&arg), | ||
arg => panic!("Message ID {:?} must be a literal", arg), | ||
}, | ||
Some(syn::Expr::Lit(arg)) => literal_to_slug(arg), | ||
arg => panic!("Message ID {:?} must be a literal", arg), | ||
}; | ||
|
||
// Reconstruct the arguments that we initially had, and pull in any extra ones | ||
// that should go right through to Fluent. For example: | ||
// t!("Danger %{name}", name="Bill Bobinson") | ||
// -> tl!(fl, LOADER, "Danger %{name}", name="Bill Bobinson") | ||
// -> fl!(LOADER, "danger-name", name="Bill Bobinson") | ||
// `danger-name` is then searched for in the i18n/ folder, and should map | ||
// to a template like `Danger, { $name }` that Fluent can insert the parameter into. | ||
let args: Vec<_> = arg_iter.collect(); | ||
|
||
// If there are no parameters, then Fluent can do this entirely statically. | ||
// Otherwise, it will require runtime interpolation. | ||
if args.is_empty() { | ||
TokenStream::from(quote!( | ||
#fl!(#loader, #message_id) | ||
)) | ||
} else { | ||
TokenStream::from(quote!( | ||
#fl!(#loader, #message_id, #(#args),*) | ||
)) | ||
} | ||
} |
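To make the message-to-slug mapping concrete, here is a minimal std-only approximation of the slug step. It is an assumption-laden sketch, not the `slugify` crate's implementation: `slug_sketch` is a hypothetical name, and it simply drops non-ASCII characters, whereas the real crate may transliterate them.

```rust
/// Approximate slug: keep ASCII letters and digits (lowercased) and
/// collapse every run of other characters into a single '-'.
/// Leading and trailing separators are dropped.
pub fn slug_sketch(message: &str) -> String {
    let mut slug = String::with_capacity(message.len());
    let mut pending_separator = false;
    for ch in message.chars() {
        if ch.is_ascii_alphanumeric() {
            // Only emit a separator between two kept characters.
            if pending_separator && !slug.is_empty() {
                slug.push('-');
            }
            pending_separator = false;
            slug.push(ch.to_ascii_lowercase());
        } else {
            pending_separator = true;
        }
    }
    slug
}
```

For the strings used in this PR's tests, the sketch yields the same keys that appear in the test `.ftl` files, for example `"Danger, Bill Bobinson"` becomes `danger-bill-bobinson`. Note that the parameter name inside `%{...}` becomes part of the slug.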
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
use i18n_embed::fluent::{fluent_language_loader, FluentLanguageLoader}; | ||
pub use i18n_embed_fl::fl; | ||
use lazy_static::lazy_static; | ||
use rust_embed::RustEmbed; | ||
|
||
#[derive(RustEmbed)] | ||
#[folder = "tests/i18n"] // path to the compiled localization resources | ||
struct Localizations; | ||
|
||
pub use atuin_macro::tl; | ||
|
||
lazy_static! { | ||
// We assume that one LOADER is sufficient. Fluent provides more | ||
// flexibility, but for now, this simplifies integration. | ||
pub static ref LOADER: FluentLanguageLoader = { | ||
// Load languages from central internationalization folder. | ||
let language_loader: FluentLanguageLoader = fluent_language_loader!(); | ||
let requested_languages = vec!["en-GB".parse().unwrap()]; | ||
|
||
let _result = i18n_embed::select( | ||
&language_loader, &Localizations, &requested_languages); | ||
language_loader | ||
}; | ||
} | ||
|
||
#[test] | ||
fn basic_tl_without_parameter() { | ||
assert_eq!( | ||
tl!(fl, LOADER, "Danger, Bill Bobinson"), | ||
"Danger, William of Bobinson" | ||
); | ||
} | ||
|
||
#[test] | ||
fn basic_tl_with_parameter() { | ||
assert_eq!( | ||
tl!( | ||
fl, | ||
LOADER, | ||
"unrecognized subcommand '%{subcommand}'", | ||
subcommand = "SUB" | ||
), | ||
"unrecognised subcommand '\u{2068}SUB\u{2069}'" | ||
); | ||
} | ||
|
||
#[test] | ||
fn tl_with_non_en_range_without_parameter() { | ||
let language_loader: FluentLanguageLoader = fluent_language_loader!(); | ||
let requested_languages = vec!["ga-IE".parse().unwrap()]; | ||
|
||
let _result = i18n_embed::select(&language_loader, &Localizations, &requested_languages); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "Danger, Bill Bobinson"), | ||
"Contúirt, a Uilliam Mac Bhoboin" | ||
); | ||
} | ||
|
||
#[test] | ||
fn tl_with_non_en_range_with_parameter() { | ||
let language_loader: FluentLanguageLoader = fluent_language_loader!(); | ||
let requested_languages = vec!["hi-IN".parse().unwrap()]; | ||
|
||
let _result = i18n_embed::select(&language_loader, &Localizations, &requested_languages); | ||
|
||
assert_eq!( | ||
tl!( | ||
fl, | ||
language_loader, | ||
"Hello, my name is %{name}", | ||
name = "रीमा" | ||
), | ||
"नमस्ते, मेरा नाम \u{2068}रीमा\u{2069} है।" | ||
); | ||
} | ||
|
||
#[test] | ||
fn tl_with_selector_parameter() { | ||
let language_loader: FluentLanguageLoader = fluent_language_loader!(); | ||
|
||
let _result = i18n_embed::select( | ||
&language_loader, | ||
&Localizations, | ||
&vec!["en-GB".parse().unwrap()], | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "f"), | ||
"the user that has files" | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "m"), | ||
"the user that has files" | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "o"), | ||
"the user that has files" | ||
); | ||
|
||
let _result = i18n_embed::select( | ||
&language_loader, | ||
&Localizations, | ||
&vec!["ga-IE".parse().unwrap()], | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "f"), | ||
"an t-úsáideoir a bhfuil comhaid aici" | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "m"), | ||
"an t-úsáideoir a bhfuil comhaid aige" | ||
); | ||
|
||
assert_eq!( | ||
tl!(fl, language_loader, "the user that has files", gender = "o"), | ||
"an t-úsáideoir a bhfuil comhaid acu" | ||
); | ||
} |
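The `\u{2068}` and `\u{2069}` characters in the expected strings above are Unicode bidirectional isolation marks (FIRST STRONG ISOLATE and POP DIRECTIONAL ISOLATE), which Fluent places around interpolated values by default. The following std-only sketch imitates that wrapping; `interpolate_isolated` is a hypothetical helper for illustration, not part of Fluent or of this PR.

```rust
/// U+2068 FIRST STRONG ISOLATE
const FSI: char = '\u{2068}';
/// U+2069 POP DIRECTIONAL ISOLATE
const PDI: char = '\u{2069}';

/// Replace the `{ $name }` placeable in `template` with `value`,
/// wrapped in isolation marks so that its text direction cannot
/// affect the surrounding message.
pub fn interpolate_isolated(template: &str, name: &str, value: &str) -> String {
    let placeable = format!("{{ ${} }}", name);
    let isolated = format!("{}{}{}", FSI, value, PDI);
    template.replace(&placeable, &isolated)
}
```

This is why the assertions compare against strings containing the escapes rather than the bare parameter value.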
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
unrecognized-subcommand-subcommand = | ||
unrecognised subcommand '{ $subcommand }' | ||
danger-bill-bobinson = | ||
Danger, William of Bobinson | ||
hello-my-name-is-name = | ||
Hello, my name is { $name } | ||
the-user-that-has-files = | ||
{ $gender -> | ||
*[other] the user that has files | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
danger-bill-bobinson = | ||
Contúirt, a Uilliam Mac Bhoboin | ||
the-user-that-has-files = | ||
{ $gender -> | ||
[f] an t-úsáideoir a bhfuil comhaid aici | ||
[m] an t-úsáideoir a bhfuil comhaid aige | ||
*[other] an t-úsáideoir a bhfuil comhaid acu | ||
} |
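The selector above picks a variant by the value of `$gender`, falling back to the variant marked `*` when nothing matches. A rough std-only model of that lookup, with `select_variant` as a hypothetical helper rather than anything from Fluent, might look like this:

```rust
/// Return the variant whose key equals `selector`, or `default`
/// (the `*[other]` variant) when no key matches.
pub fn select_variant<'a>(
    selector: &str,
    variants: &[(&str, &'a str)],
    default: &'a str,
) -> &'a str {
    variants
        .iter()
        .find(|(key, _)| *key == selector)
        .map(|(_, text)| *text)
        .unwrap_or(default)
}
```

With the Irish variants, `"f"` and `"m"` select their own forms and any other value such as `"o"` falls through to the default. The English file defines only the default, so every value yields the same string, which matches the test expectations.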
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
hello-my-name-is-name = | ||
नमस्ते, मेरा नाम { $name } है। |
I've generally been pretty against macros in our codebase but in this case I think it makes sense
obv this is just a draft but for the final version I'd really appreciate it if you could thoroughly document what's going on with the code here, just to make it as approachable as possible for future contributors! 🙏
Of course - will do! This was the main point against fluent-rs, alongside activity, as their macro (for conceptual reasons) requires slug-strings in the codebase rather than human readable ones, so using it means one of:
Option 2 has the downside of making things like UI bugs harder to see in reviews and error messages harder to grep for, as well as requiring more visual change to existing code when making strings translatable, so I thought that was probably too much friction. On the other hand, as mentioned, option 1 adds custom magical machinery and a new macro module, and somewhat undermines fluent's own differentiating motivation from gettext (although, of course, it is still possible to use it in the intended way on a case-by-case basis).
However, one other benefit of fluent is that it is a standard and seems to be supported by (for example) Weblate - it might even be possible to get a free instance on their libre-project plan, if desired.