
feat: add words from fantasy, politics, religion, and adverbs #2643

Open
rauletaveras wants to merge 9 commits into Automattic:master from rauletaveras:second_contribution

@rauletaveras (Contributor) commented Feb 2, 2026

Description

This was effectively a sweep of my Obsidian vaults, particularly my worldbuilding vault, where I engage in some philosophical analysis. I added words I felt were common enough to merit going in the dictionary. I'm still not sure whether the linguistics terms are widely used enough. Happy to make any amendments.

How Has This Been Tested?

Run cargo test and follow instructions (so far)

@hippietrail (Collaborator) left a comment

Looks pretty good. I made a few notes for things to add or change.

I would put all the ones at the end in alphabetical order too. Makes them much easier to scan quickly.
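One way to spot entries that break alphabetical order is sketched below. It is a rough helper, not part of Harper: the function names are hypothetical, and it assumes each line has the shape `word/FLAGS # comment` and that ordering is case-insensitive on the headword, which may not match the dictionary's real convention.

```python
def headword(entry: str) -> str:
    """Return the word part of a 'word/FLAGS # comment' dictionary line."""
    return entry.split("#", 1)[0].split("/", 1)[0].strip()


def out_of_order(entries: list[str]) -> list[tuple[str, str]]:
    """Return adjacent (previous, current) headword pairs that break the order."""
    words = [headword(e) for e in entries if e.strip()]
    return [
        (prev, cur)
        for prev, cur in zip(words, words[1:])
        # Assumption: compare case-insensitively.
        if prev.casefold() > cur.casefold()
    ]


entries = [
    "polearm/N # TODO",
    "kalimba/N # TODO",
    "subvocalization/Ng # phonetics TODO",
]
print(out_of_order(entries))  # [('polearm', 'kalimba')]
```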

inheritor/NSg
inhibit/~VGSd
inhibition/~NSg
inhibition/~NSgE # Add preffix "dis-" for "disinhibition"
Collaborator

Minor typo in your comment: s/preffix/prefix/

minstrel/~NSgV
minstrelsy/Ng
mint/~NgSVd>GJZ
mint/~NgSVd>GJZU # Add prefix un- for "unmint", "unminted", etc.
Collaborator

I can't find much evidence for "unmint" but "unminted" seems legit. In such cases we just make entries for the derived form directly.

pidgin/~NgS
pie/~NwSgV
piebald/JNgS
piebaldism/Ng # TODO: Check
Collaborator

It's in the OED

viscomital/J # As above, alternative version
lingua franca/N # TODO
linguas francas/N # TODO
linguae francae/N # TODO
Collaborator

linguæ francæ is also a legit spelling

idiolect/N # TODO -- term in linguistics for the variety of language spoken by an individual as compared to their broader dialect and language
sociolect/N # TODO -- as above, but for a variety associated with a social class
luthier/N # TODO -- person who makes/repairs string instruments, mainly violins
ASMR/NOg # TODO -- "Autonomous Sensory Meridian Response"
Collaborator

I wouldn't say ASMR is a proper noun. It's not a person, company, country, etc.

Westphalian/J # political science and international relations
archipelagic/J # As found in the phrase "Archipelagic state" in the "United Nations Convention on the Law of the Sea"
indefinitely/Ry
polearm/N # TODO
Collaborator

Needs plural and possessive

indefinitely/Ry
polearm/N # TODO
kalimba/N # TODO
subvocalization/Ng # phonetics TODO
Collaborator

Needs plural

polearm/N # TODO
kalimba/N # TODO
subvocalization/Ng # phonetics TODO
synesthesia/Ng # TODO -- check also add "synaesthesia"
Collaborator

I think "synaesthesia" is legit, and "synæsthesia" too

subvocalization/Ng # phonetics TODO
synesthesia/Ng # TODO -- check also add "synaesthesia"
menarche/Ng # medicine TODO
proprioception/Ng # medicine TODO
Collaborator

Add /m for mass noun

menarche/Ng # medicine TODO
proprioception/Ng # medicine TODO

--- Split these to their own PR
Collaborator

I thought some of these were already in, but maybe they were in my rejected PR to be counted as single tokens when the constituent words are not English words in their own right.

@rauletaveras (Contributor Author) commented

> Looks pretty good. I made a few notes for things to add or change.

Thank you for your very quick response and your suggestions

@rauletaveras marked this pull request as ready for review February 5, 2026 22:39
@hippietrail (Collaborator) left a comment

A lot of interesting words to go through, including more than a few that are new to me.

epaulette/NgS!@_₹
epee/NgS
epenthesis/Nm # TODO: not count noun
epenthesis/N
Collaborator

This is mostly a mass noun and when it's countable it has an irregular plural:

epentheses/N9
epenthesis/N0w

inheritor/NSg
inhibit/~VGSd
inhibition/~NSg
inhibition/~NSgE
Collaborator

inhibition is both a mass noun and a count noun too so add /w

fae/Nmg # fantasy alternative for "fey"
hewn/JT # past participle of "hew"
Hmong-Mien/NgJ # linguistics
Iaido/NOg # martial art
Collaborator

iaido doesn't seem to be a proper noun or need to be capitalized from a quick look around?

linguæ francæ/9g
longsword/NSg
luthier/NSg
menarche/Ng # medicine
Collaborator

This seems to be a mass noun /m

Nilotic/NgJ # linguistics
offeror/NSg
orthopractic/J
orthopraxy/NSg
Collaborator

orthopraxy seems to have mass noun and countable senses so /w

polysynthetic/J # linguistics
praxis/Ng
proprioception/Ng # medicine
queenship/Ng
Collaborator

Treating queenship as both mass and count seems best /w

shortsword/NSg
Sino-Tibetan/NgJ # linguistics
sociolect/NSg
sortition/Ng # Political science: a method of appointment to office by random draw
Collaborator

Looks like a /m mass noun to me. Abstract nouns are generally mass nouns.

sociolect/NSg
sortition/Ng # Political science: a method of appointment to office by random draw
subvocalization/Nwg # phonetics
synaesthesia/Ng
Collaborator

Also a mass noun

synchronic/JQ # linguistics
syncretism/Ng
synesthesia/Ng
synæsthesia/Ng
Collaborator

These three are also mass nouns.

synesthesia/Ng
synæsthesia/Ng
trimeter/N # poetry
trimetre/N!@_₹ # poetry
Collaborator

These seem to have plurals. Nouns that don't end in s but don't have a possessive are pretty rare so I'd add /gS to them.

@rauletaveras (Contributor Author) commented

Thank you kindly. I see the convention a bit better now. I particularly hadn't put together that "mass nouns" and "non-count nouns" are equivalent.

@hippietrail (Collaborator) left a comment

Did you miss one?

Everything else looks spot on.

polysynthetic/J # linguistics
praxis/Ng
praxis/Nmg
proprioception/Ng # medicine
Collaborator

Did you skip proprioception by accident?

@hippietrail (Collaborator) commented

> Thank you kindly. I see the convention a bit better now. I particularly hadn't put together that "mass nouns" and "non-count nouns" are equivalent.

Yes, they're not really natural concepts either way unless you're into grammar or linguistics, or have learned or taught a foreign language. And terminology can vary. I try to mix up the terms I use to cover all bases.

@hippietrail (Collaborator) left a comment

I hope my explanations are not confusing. Let me know.

Indo-Aryan/N0gJ # linguistics
Indo-European/N0gJ # linguistics
Indo-Iranian/N0gJ # linguistics
Japonic/N0gJ # linguistics
Collaborator

I wouldn't put /0 for these. That means it's singular, which implies it has a countable sense. But it's easy to test that language names are mass nouns in their primary sense because they don't need an indefinite article:

  • Do you drive car? ❌
  • Do you drive a car? ✅
  • Do you eat meat? ✅
  • Do you eat a meat? ❌
  • Do you study Japonic? ✅
  • Do you study a Japonic? ❌

In fact, /0 only exists because I wanted singular countable to be the default, since the original dictionary format, designed for a spell checker, didn't properly account for them.

But defaults get overridden if other properties are set. So I made /9 for words which are specifically plural, but then for a word that is both singular and plural, like "biceps" or "sheep", setting plural would override the default singular. So I made /0 to allow overriding it back again.

So technically /0 is only really needed for words which also need /9. But when I went through marking up all the irregular plurals, I found it helpful to mark their singular forms with /0 as well, to make it clear that they go with a nearby /9 word, even though the /0 is not technically required.

Let me know if that doesn't make sense and I'll try to find another way to word it (-:
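The default-and-override behaviour described above can be sketched as a toy model. This is not Harper's code: the function name is hypothetical, the flags are treated as a plain string, and mass-noun flags such as /m and /w are deliberately left out.

```python
def grammatical_number(flags: str) -> set[str]:
    """Toy model: singular is the default, /9 sets plural, /0 restores singular."""
    number = set()
    if "9" in flags:
        number.add("plural")
    if "0" in flags:
        number.add("singular")
    # With neither flag set, fall back to the singular default.
    return number or {"singular"}


print(sorted(grammatical_number("N")))    # ['singular'] (default)
print(sorted(grammatical_number("N9")))   # ['plural'] (default overridden)
print(sorted(grammatical_number("N09")))  # ['plural', 'singular'] (e.g. "sheep")
print(sorted(grammatical_number("N0")))   # ['singular'] (redundant, kept for clarity)
```

In this model /0 alone changes nothing, which matches the point that it is only strictly needed alongside /9.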

Contributor Author

Thank you very much and sorry for rushing unnecessarily.

linguae francae/9g
linguæ francæ/9g
linguae francae/N9g
linguæ francæ/N9g
Collaborator

These are the irregular plurals that I invented /9 for, so using /0 on lingua franca above would be consistent with what I described in the other comment, unlike using it on the names of the language families.

Contributor Author

Left S on lingua franca because the plural "linguas francas" is attested. Thank you for your patience.

Collaborator

> Left S on lingua franca because the plural "linguas francas" is attested. Thank you for your patience.

/S means "generate a regular plural form for this entry" which will result in "lingua francas" instead of "linguas francas" because Harper only has a few rules for variations of plural endings. You can see them in annotations.json. Native English two-word terms still mostly just pluralize the last word. Anything more exotic needs a manually crafted plural entry.

You can test the affixing annotations on the command line like this:
just getforms lingua franca

Sorry I didn't spot this in my previous reviews.
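To illustrate why /S yields "lingua francas", here is a deliberately simplified stand-in. It is not Harper's implementation: the real ending rules live in annotations.json and are not reproduced here, and `regular_plural` is a hypothetical name that only appends a bare "s" to the whole entry.

```python
def regular_plural(entry_word: str) -> str:
    """Toy stand-in for /S: add the regular ending to the entry as a whole.

    Assumption: only the plain "s" ending is modelled; variations such as
    "-es" or "-ies" are omitted.
    """
    return entry_word + "s"


# The suffix lands on the end of the entry, so only the last word changes.
print(regular_plural("lingua franca"))  # lingua francas
print(regular_plural("longsword"))      # longswords
```

A plural that changes an earlier word, such as "linguae francae", cannot come out of a rule like this, so it needs its own hand-written entry.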

@rauletaveras (Contributor Author) commented

Whoops, failure in my explanation as Spanish thinking crept in. The attested plural is indeed "lingua francas" (see the second paragraph on the relevant Wikipedia article) as generated by /S.

@hippietrail (Collaborator) commented

> Whoops, failure in my explanation as Spanish thinking crept in. The attested plural is indeed "lingua francas" (see the second paragraph on the relevant Wikipedia article) as generated by /S.

Huh OK I know I've looked into that before but I can't remember. I think you're right. I also probably recognize "linguas francas" from Spanish. Yep "lingua francas" counts so /S is the right thing.
