@MrLebjane

This is a stemmer for Sesotho (Southern Sotho), a Bantu language spoken in South Africa and Lesotho.

full_build.log Outdated
@@ -0,0 +1,138 @@
libstemmer/mkalgorithms.pl algorithms.mk libstemmer/modules.txt
Member

This log shouldn't be in git.

do remove_nominal_suffixes
do remove_verb_suffixes
do remove_noun_prefixes
) No newline at end of file
Member

Just a minor nit, but please include a newline character at the end of the final line (GitHub's red icon here means there isn't one). It's better to include one, as some text-processing tools can behave unhelpfully without it (and some editors will automatically add one, which can create noise in future PRs).
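
One way to check for this locally is sketched below. It is only an illustration, not part of the Snowball tooling, and the function names `ends_with_newline` and `ensure_final_newline` are hypothetical. It reads a file as bytes, reports whether the last byte is a newline, and appends one if it is missing.

```python
from pathlib import Path


def ends_with_newline(path):
    """Return True if the file is empty or its last byte is a newline."""
    data = Path(path).read_bytes()
    return not data or data.endswith(b"\n")


def ensure_final_newline(path):
    """Append a newline if the file lacks one; return True if it was changed."""
    if ends_with_newline(path):
        return False
    # Append in binary mode so no other bytes in the file are touched.
    with open(path, "ab") as f:
        f.write(b"\n")
    return True
```

Running `ensure_final_newline` on the stemmer source before committing should leave later diffs free of the "No newline at end of file" marker.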

tamil UTF_8 tamil,ta,tam
turkish UTF_8 turkish,tr,tur
yiddish UTF_8 yiddish,yi,yid
sesotho UTF_8 sesotho,st,sot
Member

Judging from the stemmer code, this language seems to use only ASCII characters, so we can put UTF_8,ISO_8859_1 here.
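
The ASCII-only observation can be verified with a short script before changing the encoding column. The sketch below is illustrative only: the helper names `non_ascii_chars` and `fits_latin1` are hypothetical, and the sample string contains just the suffixes quoted in this review, not the whole stemmer. In practice the full contents of the stemmer source would be passed in instead.

```python
def non_ascii_chars(text):
    """Return the sorted list of distinct characters outside the ASCII range."""
    return sorted({ch for ch in text if ord(ch) > 127})


def fits_latin1(text):
    """Return True if every character can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Sample input: only the suffix strings quoted in this review thread.
sample = "'nyana' 'ana' 'ano'"
print(non_ascii_chars(sample))
print(fits_latin1(sample))
```

An empty list from `non_ascii_chars` means the text is pure ASCII, which is a subset of ISO-8859-1, so the same bytes are valid under both encodings.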

@ojwb
Member

ojwb commented Dec 6, 2025

Thanks for submitting this. There's a bit of a queue of stemmers waiting to be reviewed currently but I'll at least do a quick initial review. The CI failures look to be due to the test data - I'll comment on the snowball-data PR about that.

[substring] among(
'nyana' /* diminutive form */
'ana' /* diminutive form */
'ano'
Member

Was this change intended? (Asking because it wasn't covered by the commit message)

Author

Yes, it was intentional.
