Model translategemma 4b it q4_k_m

hydropix edited this page Jan 17, 2026 · 1 revision

translategemma:4b-it-q4_K_M

Ollama Model ID: translategemma:4b-it-q4_K_M
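
The model ID above is the string a local Ollama server expects in its `model` field. A minimal sketch of a translation request follows, assuming Ollama is running on its default local port and that the source text is English. The prompt wording and the `build_request` and `translate` helpers are illustrative assumptions; they are not the prompt used for this benchmark, nor a template documented for this model.

```python
import json
import urllib.request

MODEL_ID = "translategemma:4b-it-q4_K_M"
# Default address of a locally running Ollama server (assumption about the setup).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(text, target_language, source_language="English"):
    """Build the JSON payload for Ollama's /api/generate endpoint.

    The prompt wording is a hypothetical example, not the benchmark's prompt.
    """
    prompt = (
        f"Translate the following text from {source_language} "
        f"to {target_language}:\n\n{text}"
    )
    # stream=False asks the server for one complete JSON response.
    return {"model": MODEL_ID, "prompt": prompt, "stream": False}


def translate(text, target_language):
    """Send the request and return the generated text (needs a running server)."""
    payload = json.dumps(build_request(text, target_language)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running and the model pulled, `translate("Good morning.", "French")` would return the model's output as a string.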


Summary

| Metric | Value |
| --- | --- |
| Average Score | 🟠 6.0/10 |
| Accuracy | 6.5/10 |
| Fluency | 6.2/10 |
| Style | 5.2/10 |
| Languages Tested | 19 |
| Total Translations | 95 |
| Best Language | French (6.8) |
| Worst Language | Hebrew (5.0) |

Language Performance

Languages by Rank

| Rank | Language | Overall | Accuracy | Fluency | Style |
| --- | --- | --- | --- | --- | --- |
| 1 | French | 🟠 6.8 | 6.8 | 7.4 | 5.8 |
| 2 | Spanish | 🟠 6.8 | 6.8 | 7.2 | 5.8 |
| 3 | Portuguese | 🟠 6.6 | 7.0 | 7.0 | 5.6 |
| 4 | Vietnamese | 🟠 6.6 | 7.0 | 6.8 | 5.8 |
| 5 | Arabic | 🟠 6.6 | 7.0 | 6.8 | 6.0 |
| 6 | Russian | 🟠 6.4 | 6.6 | 6.6 | 5.4 |
| 7 | German | 🟠 6.2 | 6.8 | 6.2 | 5.6 |
| 8 | Italian | 🟠 6.2 | 6.4 | 6.8 | 5.2 |
| 9 | Ukrainian | 🟠 6.2 | 7.0 | 6.2 | 5.6 |
| 10 | Chinese (Simplified) | 🟠 6.0 | 6.6 | 6.0 | 5.4 |
| 11 | Chinese (Traditional) | 🟠 5.8 | 6.4 | 5.8 | 4.8 |
| 12 | Korean | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| 13 | Thai | 🟠 5.6 | 6.4 | 5.6 | 5.2 |
| 14 | Hindi | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| 15 | Bengali | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| 16 | Polish | 🟠 5.4 | 5.8 | 6.0 | 4.8 |
| 17 | Japanese | 🟠 5.2 | 5.8 | 5.6 | 4.4 |
| 18 | Tamil | 🟠 5.0 | 6.0 | 5.0 | 4.4 |
| 19 | Hebrew | 🟠 5.0 | 5.6 | 5.2 | 4.2 |

Performance by Category

European Major Languages

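| Language | Overall | Accuracy | Fluency | Style |
| --- | --- | --- | --- | --- |
| French | 🟠 6.8 | 6.8 | 7.4 | 5.8 |
| Spanish | 🟠 6.8 | 6.8 | 7.2 | 5.8 |
| Portuguese | 🟠 6.6 | 7.0 | 7.0 | 5.6 |
| German | 🟠 6.2 | 6.8 | 6.2 | 5.6 |
| Italian | 🟠 6.2 | 6.4 | 6.8 | 5.2 |
| Polish | 🟠 5.4 | 5.8 | 6.0 | 4.8 |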

Category Average: 🟠 6.3

Asian Languages

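| Language | Overall | Accuracy | Fluency | Style |
| --- | --- | --- | --- | --- |
| Vietnamese | 🟠 6.6 | 7.0 | 6.8 | 5.8 |
| Chinese (Simplified) | 🟠 6.0 | 6.6 | 6.0 | 5.4 |
| Chinese (Traditional) | 🟠 5.8 | 6.4 | 5.8 | 4.8 |
| Korean | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| Thai | 🟠 5.6 | 6.4 | 5.6 | 5.2 |
| Hindi | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| Bengali | 🟠 5.6 | 6.4 | 5.6 | 5.0 |
| Japanese | 🟠 5.2 | 5.8 | 5.6 | 4.4 |
| Tamil | 🟠 5.0 | 6.0 | 5.0 | 4.4 |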

Category Average: 🟠 5.7

Semitic Languages

| Language | Overall | Accuracy | Fluency | Style |
| --- | --- | --- | --- | --- |
| Arabic | 🟠 6.6 | 7.0 | 6.8 | 6.0 |
| Hebrew | 🟠 5.0 | 5.6 | 5.2 | 4.2 |

Category Average: 🟠 5.8

Cyrillic Languages

| Language | Overall | Accuracy | Fluency | Style |
| --- | --- | --- | --- | --- |
| Russian | 🟠 6.4 | 6.6 | 6.6 | 5.4 |
| Ukrainian | 🟠 6.2 | 7.0 | 6.2 | 5.6 |

Category Average: 🟠 6.3
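
The four category averages are consistent with an unweighted mean of each language's Overall score, rounded to one decimal place. The page does not state how they were computed, so the sketch below rests on that assumption; the `CATEGORY_SCORES` dictionary and `category_average` function are illustrative names, with the scores copied from the tables above.

```python
# Overall scores per language, copied from the category tables on this page.
CATEGORY_SCORES = {
    "European Major Languages": {
        "French": 6.8, "Spanish": 6.8, "Portuguese": 6.6,
        "German": 6.2, "Italian": 6.2, "Polish": 5.4,
    },
    "Asian Languages": {
        "Vietnamese": 6.6, "Chinese (Simplified)": 6.0,
        "Chinese (Traditional)": 5.8, "Korean": 5.6, "Thai": 5.6,
        "Hindi": 5.6, "Bengali": 5.6, "Japanese": 5.2, "Tamil": 5.0,
    },
    "Semitic Languages": {"Arabic": 6.6, "Hebrew": 5.0},
    "Cyrillic Languages": {"Russian": 6.4, "Ukrainian": 6.2},
}


def category_average(scores):
    """Unweighted mean of the overall scores, rounded to one decimal.

    Assumed method: the page reports the averages but not the formula.
    """
    return round(sum(scores.values()) / len(scores), 1)


category_averages = {
    name: category_average(languages)
    for name, languages in CATEGORY_SCORES.items()
}

for name, average in category_averages.items():
    print(f"{name}: {average}")
```

Under this assumption the computed values match the reported 6.3, 5.7, 5.8, and 6.3.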


Translation Examples

Best Translation

French - Pride and Prejudice Score: 🟡 8.0/10

Original:

Translation:

Il est un fait universellement reconnu qu'un homme célibataire, disposant d'une somme importante, doit nécessairement être à la recherche d'une épouse. Quoique les sentiments ou les opinions de tel homme puissent être méconnus lorsqu'il s'installe dans un nouveau quartier, cette vérité est si profondément ancrée dans l'esprit des familles environnantes qu'il est considéré comme appartenant à l'une d'elles.

Feedback: The translation accurately conveys the meaning and maintains Austen's ironic tone effectively. Minor stylistic differences include 'quartier' (neighborhood/district) being slightly less grand than the original's social implications, and the final phrase loses some of the original's biting wit about daughters being 'property.'

Worst Translation

Polish - The Picture of Dorian Gray Score: 🔴 4.0/10

Original:

Translation:

Studio wypełnione było intensywnym zapachem róż, a gdy ciepły letni wiatr unosił się między drzewami w ogrodzie, przez otwarte drzwi do środka docierał obfity zapach lilii lub delikatniejszy aromat róż. Z rogu sofą z perskich skóry, na której leżał, tycząc, jak to zwykle robił, niezliczoną ilość papierosów, Lord Henry Wotton mógł ledwo dostrzec połysk słodkich, miodowych kwiatów jałowca.

Feedback: The translation contains significant errors that undermine both meaning and style: 'lilii' (lilies) should be 'bzu' (lilac), 'róż' (roses) misrepresents 'pink-flowering thorn', 'sofą z perskich skóry' is grammatically awkward and unclear, 'tycząc' is incorrect (should relate to smoking), and 'jałowca' (juniper) completely misidentifies 'laburnum'. The aesthetic precision and sensory richness of Wilde's prose are substantially lost.


Score Distribution

| Score Range | Count | Percentage |
| --- | --- | --- |
| 🟢 Excellent (9-10) | 0 | 0.0% |
| 🟡 Good (7-8) | 33 | 34.7% |
| 🟠 Acceptable (5-6) | 54 | 56.8% |
| 🔴 Poor (3-4) | 8 | 8.4% |
| ⚫ Failed (1-2) | 0 | 0.0% |
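
The percentages in this table follow from dividing each count by the 95 total translations. A small sketch of that arithmetic is below, together with a band lookup that assumes individual translation scores are whole numbers from 1 to 10 (the page shows only 8.0 and 4.0 as examples, so this is an assumption). The names `COUNTS`, `percentages`, and `band` are illustrative.

```python
# Counts per score band, copied from the table above.
COUNTS = {
    "Excellent (9-10)": 0,
    "Good (7-8)": 33,
    "Acceptable (5-6)": 54,
    "Poor (3-4)": 8,
    "Failed (1-2)": 0,
}

# Lower bound of each band, highest first.
BANDS = ((9, "Excellent"), (7, "Good"), (5, "Acceptable"), (3, "Poor"), (1, "Failed"))


def percentages(counts):
    """Share of all translations in each band, as a percentage with one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def band(score):
    """Map a score to its band name (assumes whole-number scores, 1 to 10)."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise ValueError(f"score out of range: {score}")


for label, share in percentages(COUNTS).items():
    print(f"{label}: {share}%")
```

For example, `band(8)` gives "Good" and `band(4)` gives "Poor", matching the labels on the best and worst translations shown earlier.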

Performance Metrics

| Metric | Value |
| --- | --- |
| Average Translation Time | 1060.0 ms |
| Success Rate | 100.0% |
