Show percent translation complete for language options #6370

Open
sam52796 wants to merge 1 commit into
pioneerspacesim:master from
sam52796:Translation_percent

@sam52796 sam52796 commented Jun 22, 2026

Contributor

Addresses #6366, mostly.

When showing the language options, we now scan all the language files to see what percentage of the strings are the same as the English version, i.e. those strings are not translated yet. The percentage of translated strings is then shown in the menu next to each language - e.g. "Francais (94%)" except for English itself which has "(default)" instead.

The default English is always put at the top of the list.

Any language with 0% translated strings is not shown in the list.

Yes, this is a bit of a blunt tool, because any string that is deliberately the same as the English version, like "OK", will count as not translated, while any string that was originally translated but is now out of date will count as translated. But it is probably accurate enough overall to be useful.
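
For illustration, the comparison described above could be sketched roughly as follows. This is a minimal sketch, not the code from this PR: `StringTable` and `TranslatedPercent` are hypothetical names, each language file is assumed to be already parsed into a key-to-text map, and a key missing from the other language is assumed to count as untranslated.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a parsed language file: string key -> text.
using StringTable = std::map<std::string, std::string>;

// Returns the percentage (0-100, rounded down) of English keys whose text in
// `other` is present and differs from the English text.
// Assumption: a key missing from `other` counts as untranslated.
int TranslatedPercent(const StringTable &english, const StringTable &other)
{
	if (english.empty())
		return 0; // nothing to compare against

	std::size_t translated = 0;
	for (const auto &entry : english) {
		const auto it = other.find(entry.first);
		if (it != other.end() && it->second != entry.second)
			++translated;
	}
	return static_cast<int>(translated * 100 / english.size());
}
```

With this approach a deliberately identical string such as "OK" lowers the score, which is the bluntness mentioned above.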

I couldn't do anything programmatically about the fact that some translators have literally translated the name "English" into the word for English in their language, instead of the name of their language in their language. I think that may have to be done on Transifex.

@impaktor

Member

> Fixes #6366

This will close the issue upon merge of this PR; I don't know if that is what you intended.

Am I correct in assuming that non-English languages will never be shown as 100% (due to "OK" and other common strings)? Perhaps show a less fine-grained score, like 1-10, or similar?

@impaktor

Member

Also: filtering out low translation languages, perhaps there should be an option to turn off the filter, useful for translators who want to see their work in the game? (Assuming they pull from master regularly). Just a suggestion I'm throwing out there.

@sam52796

Contributor Author

I didn't actually check to see how many OK equivalents there are, I was just noting that if there are identical strings, and it's deliberate, then they would decrease the percentage.

We could do something like - don't show the percentage if it's above 90%. This means the vast majority of strings are translated so the user should have no issue picking that language. If at least one person agrees then I can make that change and re-commit.

And yeah, I didn't mean that this PR should close the issue, I just wanted to link them. I've changed the "Fixes" keyword to "Addresses", not sure if that will be enough to prevent it.

@sam52796

Contributor Author

> Also: filtering out low translation languages, perhaps there should be an option to turn off the filter, useful for translators who want to see their work in the game?

With the current change, only languages with literally zero translated strings get filtered out. I think that's a low enough bar. If zero strings are translated then they haven't even translated the name "English" into the name of the other language, so having it in the list will definitely be confusing.
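
As an illustration of the list rules described at this point in the thread (English first, entries with zero translated strings dropped, the rest alphabetical), a sketch could look like this. `LanguageEntry` and `BuildLanguageMenu` are hypothetical names, the language code `"en"` for English is an assumption, and the sort is a plain byte-wise comparison rather than a locale-aware one.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical menu entry; the field names are assumptions for illustration.
struct LanguageEntry {
	std::string code; // e.g. "en", "fr"
	std::string name; // display name shown in the menu
	int percent;      // estimated translated percentage, 0-100
};

std::vector<LanguageEntry> BuildLanguageMenu(std::vector<LanguageEntry> entries)
{
	// Drop non-English entries with nothing translated. English is exempt
	// because comparing it with itself would also yield 0%.
	entries.erase(std::remove_if(entries.begin(), entries.end(),
		[](const LanguageEntry &e) { return e.code != "en" && e.percent == 0; }),
		entries.end());

	// English first, everything else ordered by display name (byte-wise).
	std::sort(entries.begin(), entries.end(),
		[](const LanguageEntry &a, const LanguageEntry &b) {
			const bool aEnglish = (a.code == "en");
			const bool bEnglish = (b.code == "en");
			if (aEnglish != bEnglish)
				return aEnglish;
			return a.name < b.name;
		});
	return entries;
}
```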

@sam52796 sam52796 force-pushed the Translation_percent branch from 2b96445 to e20fcfe on June 22, 2026 10:26

@zonkmachine

Member

When it comes to a percentage of translated strings, consider that many languages by default don't translate all strings. East Asian languages usually don't translate single keywords but do translate descriptions. All Chinese and Japanese goods/cargo names are untranslated by intent, but their descriptions are translated. Mission titles are also left untranslated where the mission dialog is translated. Maybe something like 90% would work, or we could get the actual translated string percentage via Transifex.

@sam52796

Contributor Author

Currently it looks like this:

[image]

@zonkmachine

zonkmachine commented Jun 22, 2026

Member

If you look at the language summary page on Pioneer's Transifex page you will notice that many languages show up around ~80%. Something happened about two years ago with a lot of untranslated strings, where the English original was accepted and marked as done and suddenly all languages turned up as 100%. This was never solved, and then new strings were added, so the ones around 80% have in many cases been untouched since that time.
https://app.transifex.com/pioneer/pioneer/languages/
Because of this your percentages will look drastically lower in some cases and be more in line with the actual completeness of the language.

I think these changes are good for now but maybe the translators should be bumped about it.

@RacerBG

RacerBG commented Jun 22, 2026

My only concern is the "90-99%" values, which may make it sound as if no language except English is ever finished. Basically, I agree with what @zonkmachine wrote earlier in the thread.

Also, adding "(default)" to English is not my cup of tea. Just having English at the top sounds good to me.

@impaktor

impaktor commented Jun 22, 2026

Member

> If you look at the language summary page on Pioneer's Transifex page you will notice that many languages show up around ~80%. Something happened about two years ago

me:

[image]

also:

> Because of this your percentages will look drastically lower in some cases and be more in line with the actual completeness of the language.

@zonkmachine this solution doesn't use transifex, it just compares en.json with the others for similarity.

@zonkmachine

Member

> @zonkmachine this solution doesn't use transifex, it just compares en.json with the others for similarity.

Yes, I know. It gives this PR an advantage in this particular case. I think it can be applied as is if we just write a clear note to the translators where the numbers come from. It's maybe a bit hard to fix the issue with approved English lines in other languages this long after it happened. The translators should probably be bumped about this too, so they know what they're up against. The only real case I think this could be a problem is if someone has the habit of selecting to only show untranslated strings AND at the same time not actually testing the game to see what the translations look like in the final product.

@sam52796 sam52796 force-pushed the Translation_percent branch from e20fcfe to a193987 on June 23, 2026 08:25

@sam52796

Contributor Author

There seems to be reasonable agreement on the 90% threshold idea, so I've tweaked the commit so that if a language scores more than 90% then the percentage isn't shown at all.

I also removed "(default)" from English, since I agree it was unnecessary now that the other entries called "English" are being filtered out for having 0% translated strings.
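
The labelling rule after this change could be sketched as below. This is only an illustration under stated assumptions: `LanguageLabel` and `kHideAbove` are hypothetical names, and "more than 90%" is read as a strict comparison, so exactly 90% still shows its percentage.

```cpp
#include <string>

// Hypothetical helper: builds the menu label for one language.
// Assumes `percent` is the estimated translated percentage (0-100) and that
// English never carries a suffix.
std::string LanguageLabel(const std::string &name, int percent, bool isEnglish)
{
	const int kHideAbove = 90; // threshold discussed in this thread
	if (isEnglish || percent > kHideAbove)
		return name;
	return name + " (" + std::to_string(percent) + "%)";
}
```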

As far as I'm concerned this is ready for merge.

@zonkmachine

Member

LGTM! I'm not deep into the code of it but I've tested it and it looks good. I think we can merge this.

@impaktor

impaktor commented Jun 23, 2026

Member

> It's maybe a bit hard to fix the issue with approved English lines in other languages this long after it happened

I don't think the time that has passed is what complicates it, but rather finding the how. I did try to find a way to fix it 2 years ago, but came up empty. I think removing everything and adding it back would have been one option back then.

Either way, we need to run some command that flags all strings that are the same as English as "untranslated" in Transifex; that would be a net positive.

@sam52796 Suggestion: try adding a horizontal line between "English" and the rest of the languages? (to indicate that alphabetical order starts below the line)

@zonkmachine

Member

> I don't think the time that has passed is what complicates it, but rather finding the how. I did try to find a way to fix it 2 years ago, but came up empty. I think removing everything and adding it back would have been one option back then.

It's something like: the translations of the commit 'before' the change + the strings that have been updated/added 'after' the change.

I'm not sure if it's worth looking into. It may be a small nuisance for the translators, if at all, but the users won't notice it. Except perhaps a couple of strings that may be missed in the translations.

@impaktor

impaktor commented Jun 23, 2026

Member

> It's something like: the translations of the commit 'before' the change + the strings that have been updated/added 'after' the change. ... I'm not sure if it's worth looking into.

I think there are many languages that have a very inflated completion rate on transifex, since the "issue" 2 years ago. We can see the actual error by comparing the translation rates according to this PR, to what transifex shows. (I'll leave that as an exercise to the reader).

@sturnclaw sturnclaw left a comment

Member

I've identified a few areas I'd like to see addressed in the code implementation before merge.

Separate from that, please keep in mind that there are likely to be significant false negatives in fully-translated resources. Names of SI units, Unicode symbols, etc. are very unlikely to differ in most translations, and several languages use English characters to render English loanwords, brand names, quotes, etc.

Personally I'd prefer a 1-to-4 scale via icons or a similar symbolic indicator to indicate estimated translation quality; a percentage we render will never agree with Transifex's percentage and by definition cannot be accurate. I'd go for a set of "Translation Coverage" states like so:

  1. Sparse (0-40%)
  2. Partial (40-70%)
  3. Acceptable (70-90%)
  4. Satisfactory (90-100%)

Likely the last state would not be rendered, so a three-state icon or indicator could be used to indicate "this translation has issues" and its absence indicate the translation is of satisfactory quality.
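
A sketch of how such a bucket scheme might be expressed in code follows. The enum and function names are hypothetical. Because the proposed ranges share their boundary values (40, 70 and 90 each appear in two ranges), this sketch assumes a boundary value falls into the higher bucket.

```cpp
// Hypothetical "Translation Coverage" states, mirroring the list above.
enum class TranslationCoverage { Sparse, Partial, Acceptable, Satisfactory };

// Maps an estimated translated percentage (0-100) to a coverage state.
// Assumption: boundary values (40, 70, 90) belong to the higher bucket.
TranslationCoverage ClassifyCoverage(int percent)
{
	if (percent < 40)
		return TranslationCoverage::Sparse;
	if (percent < 70)
		return TranslationCoverage::Partial;
	if (percent < 90)
		return TranslationCoverage::Acceptable;
	return TranslationCoverage::Satisfactory;
}
```

Under the suggestion above, a menu would render an indicator only for the first three states and show nothing for `Satisfactory`.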

Comment thread contrib/json/json.hpp Outdated
Comment thread src/Lang.cpp Outdated
Comment thread src/Lang.cpp
Comment on lines +246 to +247
static std::map<std::string, int> s_languageCompletion;
static bool s_languageCompletionScanned = false;

Member

Nitpick: we prefer that file-scope static variables are typically defined at the top of the file between #include statements and the first function body.

@sturnclaw

Member

Regarding hiding languages from the list: please do not do so. Present-but-low-quality is a different state from unsupported. The goal should be to present to the user that their language is supported, and their contributions are wanted. Hiding a language prevents the user from knowing that they can help to improve it.

(It also means we will eventually get confused users here, opening an issue asking for their language to be added to the game when it already is.)

@zonkmachine

Member

> Regarding hiding languages from the list: please do not do so. Present-but-low-quality is a different state from unsupported. The goal should be to present to the user that their language is supported, and their contributions are wanted. Hiding a language prevents the user from knowing that they can help to improve it.

The only translations that are omitted through this PR are the ones that are not translated at all. Such as a newly introduced language that has no translations at all, causing it to turn up as "English" in the language list. This just leads to confusion. Or we only accept a new language if they give us a language name to register it in?
So we can pre-fill LANG_NAME in /data/lang/core/XX.json ?

@sam52796 sam52796 force-pushed the Translation_percent branch from a193987 to 117e419 on June 25, 2026 03:35
@sam52796 sam52796 force-pushed the Translation_percent branch from 117e419 to af36b4e on June 25, 2026 03:37

@sturnclaw

Member

> Or we only accept a new language if they give us a language name to register it in? So we can pre-fill LANG_NAME in /data/lang/core/XX.json ?

This would be preferable. A hidden language in the list is a language that users are not aware the game can be translated into. (In more readable terms: if it's hidden, no one will translate anything into that language). Ideally the solution is that we provide a machine-translated LANG_NAME for any extant translations which are missing it, and impose the requirement that a language must have at least its own name translated to be added to Transifex.

@sam52796

Contributor Author

I've fixed the code issues, with the exception of moving the static variable declarations to the top of the file. I noticed that every other existing static variable declaration in that file lives just above the function that uses it. I'm guessing it's because the file follows more of a "collection of static helper functions" pattern than an "object with fields and methods" pattern, so it seems better not to break the pattern.

I'm also going to push back on the suggestion of having buckets instead of percentages. It seems like it's going to create more work for everyone without much benefit. Someone will have to create the icons, and then we should also show an explanation of what the icons mean - which would have to be translated into every language, with the explanation changing when the user clicks on a language. This would require special case coding because all other language changes require restarting the game to take effect. In the end... why don't we just show a percentage instead?

In case you are still worried about false negatives, I checked the Chinese language file, currently showing as 88%, and roughly half of the "still English" strings are mission dialog messages, which definitely should be translated. So someone would only have to finish some of those to bring it up to the 90% level beyond which no percentage is shown.

5 participants