Help Wanted: Curate Hebrew (עברית) Word List #110

@Hugo0

The Problem

The Hebrew word list currently contains 36,675 words, far more than the lists for other languages (Finnish has ~3,000, German ~2,200). The list includes nearly every verb conjugation and inflected form, so many daily words feel unfair or impossible to guess.

Examples of problematic words:

  • הלטתן - "you (feminine plural) whipped"
  • לפפכן - "you (feminine plural) wrapped"
  • נתפזר - "we will be scattered"

These are grammatically valid but extremely obscure forms that even native speakers would struggle to guess.

Why This Matters

Our analytics show that word quality is the #1 user complaint (50% of feedback). Languages with curated word lists (like Finnish) have significantly better retention rates.

What We Need

A curated Hebrew word list of approximately 3,000-5,000 common words including:

  • ✅ Common nouns (בית, ספר, שולחן)
  • ✅ Common adjectives (גדול, קטן, יפה)
  • ✅ Common verbs in base/infinitive form
  • ✅ Words that native speakers would reasonably guess

Not including:

  • ❌ Obscure verb conjugations with pronoun suffixes
  • ❌ Rare grammatical forms
  • ❌ Proper nouns (names, places)
  • ❌ Foreign words

How to Contribute

  1. Option A: Review the current word list at webapp/data/languages/he/he_5words.txt and create a blocklist of words to remove at webapp/data/languages/he/he_blocklist.txt

  2. Option B: Suggest a better source for Hebrew words (e.g., frequency lists from Hebrew Wikipedia, news corpora, or the Academy of the Hebrew Language)

  3. Option C: Create a new curated word list from scratch
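
For Option A, a minimal sketch of how a blocklist could be applied to the existing word list. This is an illustration only: the issue does not say how scripts/curate_words.py consumes he_blocklist.txt, and the helper names `load_words` and `apply_blocklist` are hypothetical.

```python
from pathlib import Path


def load_words(path):
    """Read a one-word-per-line UTF-8 file, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def apply_blocklist(words, blocked):
    """Return words in their original order, minus blocked entries and duplicates."""
    blocked_set = set(blocked)
    seen = set()
    kept = []
    for word in words:
        if word in blocked_set or word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return kept


if __name__ == "__main__":
    # Paths are the ones listed in this issue; run from the repository root.
    base = Path("webapp/data/languages/he")
    words = load_words(base / "he_5words.txt")
    blocked = load_words(base / "he_blocklist.txt")
    kept = apply_blocklist(words, blocked)
    print(f"{len(words)} words in, {len(kept)} words kept")
```

Keeping removals in a separate blocklist, rather than editing the word list directly, keeps each removal reviewable in a pull request.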

Technical Notes

  • Words must be exactly 5 Hebrew letters
  • Final forms (sofit) should be used correctly: ך, ם, ן, ף, ץ at word end
  • No niqqud (vowel marks), only the base letters
  • One word per line
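
The rules above can be checked mechanically before a list is submitted. The following is a rough sketch, not the project's actual validation: the function name `check_word` and the problem labels are hypothetical. It relies on the Unicode Hebrew letters occupying U+05D0 to U+05EA, with niqqud falling outside that range.

```python
# Final (sofit) forms and the regular letters they correspond to.
FINAL_TO_REGULAR = {"ך": "כ", "ם": "מ", "ן": "נ", "ף": "פ", "ץ": "צ"}
REGULAR_WITH_FINAL = set(FINAL_TO_REGULAR.values())


def is_hebrew_letter(ch):
    """True for the 27 Hebrew letters (alef..tav, including final forms)."""
    return "\u05D0" <= ch <= "\u05EA"


def check_word(word):
    """Return a list of rule violations for one candidate word (empty if none)."""
    problems = []
    if len(word) != 5:
        problems.append("length")
    if any(not is_hebrew_letter(ch) for ch in word):
        # Catches niqqud, punctuation, Latin letters, and digits.
        problems.append("non-letter")
    if any(ch in FINAL_TO_REGULAR for ch in word[:-1]):
        problems.append("final form mid-word")
    if word and word[-1] in REGULAR_WITH_FINAL:
        # Heuristic: some loanwords legitimately end in a regular פ.
        problems.append("regular form at end")
    return problems


if __name__ == "__main__":
    for candidate in ["שולחן", "שולחנ", "בית"]:
        print(candidate, check_word(candidate))
```

The "regular form at end" check is best treated as a warning for manual review rather than an automatic rejection, since some loanwords are spelled with a non-final letter in last position.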

Resources

  • Current word list: webapp/data/languages/he/he_5words.txt
  • Blocklist location: webapp/data/languages/he/he_blocklist.txt
  • Curation script: scripts/curate_words.py

Thank you for helping make Wordle Global better for Hebrew speakers! 🙏
