diff --git a/README.md b/README.md
index 7e448bf7..c1db1e86 100644
--- a/README.md
+++ b/README.md
@@ -14,3 +14,24 @@ Language/region/script definitions and the `gflanguages` modules are used as a s
 This module is the main place to update these definitions, avoiding data duplication and guaranteeing uniformity across tools.
 
 To learn more about how *lang* metadata affects downstream, see [gf-guide/lang](https://googlefonts.github.io/gf-guide/lang).
+
+## Sample text rules
+
+If there is a `sample_text` field for a language, it should contain all of the following fields:
+
+* `masthead_full`: show off four glyphs
+* `masthead_partial`: show off two glyphs
+* `styles`: a phrase of 40-60 characters
+* `tester`: a phrase of 60-90 characters
+* `poster_sm`: a word or phrase of 10-17 characters
+* `poster_md`: a word or phrase of 6-12 characters
+* `poster_lg`: a word or phrase of 3-8 characters
+* `specimen_48`: a sentence of 50-80 characters
+* `specimen_36`: a paragraph of 100-120 characters
+* `specimen_32`: a paragraph of 140-180 characters
+* `specimen_21`: one or more paragraphs totalling 300-500 characters
+* `specimen_16`: one or more paragraphs totalling 550-750 characters
+
+Generally the sample text should be taken from the UN Declaration of Human Rights; if using Eric Muller's XML translations, `snippets/lang_sample_text.py` will convert the XML into textproto.
+
+If the UDHR is not available in the language, the sample text should be a "neutral" text (not political or religious) - folk tales are generally good sources. (We recognise that for some liturgical languages, religious texts may be the only extant samples.) In these cases, please add a `note:` field with the source of the sample text.
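
The length ranges listed in the added "Sample text rules" section could be checked mechanically. The following is a minimal sketch, not code from this repository: `check_sample_text`, `LENGTH_RANGES`, and `PRESENCE_ONLY` are hypothetical names, the input is assumed to be a plain mapping of field name to string, and "characters" is assumed to mean Python `str` code points (which can differ from user-perceived glyphs in scripts that use combining marks). The two masthead fields are described only as "show off N glyphs", so the sketch checks them for presence only.

```python
# Hypothetical checker for the ranges in "Sample text rules"; not part of gflanguages.
# Assumption: length is measured with len(), i.e. in Unicode code points.

LENGTH_RANGES = {
    "styles": (40, 60),
    "tester": (60, 90),
    "poster_sm": (10, 17),
    "poster_md": (6, 12),
    "poster_lg": (3, 8),
    "specimen_48": (50, 80),
    "specimen_36": (100, 120),
    "specimen_32": (140, 180),
    "specimen_21": (300, 500),
    "specimen_16": (550, 750),
}

# Fields whose rule is stated in glyphs rather than a character range.
PRESENCE_ONLY = ("masthead_full", "masthead_partial")


def check_sample_text(sample_text):
    """Return a list of human-readable problems for one sample_text mapping."""
    problems = []
    for field in PRESENCE_ONLY:
        if not sample_text.get(field):
            problems.append(f"{field}: missing")
    for field, (low, high) in LENGTH_RANGES.items():
        value = sample_text.get(field)
        if not value:
            problems.append(f"{field}: missing")
            continue
        length = len(value)
        if not low <= length <= high:
            problems.append(f"{field}: {length} characters, expected {low}-{high}")
    return problems
```

An empty list means every field is present and within its range. A stricter variant might count grapheme clusters instead of code points, which would need a third-party library and is left out here.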