feat: add words from fantasy, politics, religion, and adverbs #2643
rauletaveras wants to merge 9 commits into Automattic:master from
Conversation
hippietrail
left a comment
Looks pretty good. I made a few notes for things to add or change.
I would put all the ones at the end in alphabetical order too. Makes them much easier to scan quickly.
harper-core/dictionary.dict
Outdated
| inheritor/NSg | ||
| inhibit/~VGSd | ||
| inhibition/~NSg | ||
| inhibition/~NSgE # Add preffix "dis-" for "disinhibition" |
Minor typo in your comment: s/preffix/prefix/
harper-core/dictionary.dict
Outdated
| minstrel/~NSgV | ||
| minstrelsy/Ng | ||
| mint/~NgSVd>GJZ | ||
| mint/~NgSVd>GJZU # Add prefix un- for "unmint", "unminted", etc. |
I can't find much evidence for "unmint" but "unminted" seems legit. In such cases we just make entries for the derived form directly.
harper-core/dictionary.dict
Outdated
| pidgin/~NgS | ||
| pie/~NwSgV | ||
| piebald/JNgS | ||
| piebaldism/Ng # TODO: Check |
harper-core/dictionary.dict
Outdated
| viscomital/J # As above, alternative version | ||
| lingua franca/N # TODO | ||
| linguas francas/N # TODO | ||
| linguae francae/N # TODO |
linguæ francæ is also a legit spelling
harper-core/dictionary.dict
Outdated
| idiolect/N # TODO -- term in linguistics for the variety of language spoken by an individual as compared to their broader dialect and language | ||
| sociolect/N # TODO -- as above, but for a variety associated with a social class | ||
| luthier/N # TODO -- person who makes/repairs string instruments, mainly violins | ||
| ASMR/NOg # TODO -- "Autonomous Sensory Meridian Response" |
I wouldn't say ASMR is a proper noun. It's not a person, company, country, etc.
harper-core/dictionary.dict
Outdated
| Westphalian/J # political science and international relations | ||
| archipelagic/J # As found in the phrase "Archipelagic state" in the "United Nations Convention on the Law of the Sea" | ||
| indefinitely/Ry | ||
| polearm/N # TODO |
Needs plural and possessive
harper-core/dictionary.dict
Outdated
| indefinitely/Ry | ||
| polearm/N # TODO | ||
| kalimba/N # TODO | ||
| subvocalization/Ng # phonetics TODO |
harper-core/dictionary.dict
Outdated
| polearm/N # TODO | ||
| kalimba/N # TODO | ||
| subvocalization/Ng # phonetics TODO | ||
| synesthesia/Ng # TODO -- check also add "synaesthesia" |
I think "synaesthesia" is legit, and "synæsthesia" too
harper-core/dictionary.dict
Outdated
| subvocalization/Ng # phonetics TODO | ||
| synesthesia/Ng # TODO -- check also add "synaesthesia" | ||
| menarche/Ng # medicine TODO | ||
| proprioception/Ng # medicine TODO |
harper-core/dictionary.dict
Outdated
| menarche/Ng # medicine TODO | ||
| proprioception/Ng # medicine TODO | ||
|
|
||
| --- Split these to their own PR |
I thought some of these were already in, but maybe they were only in my rejected PR, which proposed counting such terms as single tokens when the constituent words are not English words in their own right.
Thank you for your very quick response and your suggestions.
hippietrail
left a comment
A lot of interesting words to go through, including more than a few that are new to me.
harper-core/dictionary.dict
Outdated
| epaulette/NgS!@_₹ | ||
| epee/NgS | ||
| epenthesis/Nm # TODO: not count noun | ||
| epenthesis/N |
This is mostly a mass noun and when it's countable it has an irregular plural:
epentheses/N9
epenthesis/N0w
harper-core/dictionary.dict
Outdated
| inheritor/NSg | ||
| inhibit/~VGSd | ||
| inhibition/~NSg | ||
| inhibition/~NSgE |
inhibition is both a mass noun and a count noun too so add /w
harper-core/dictionary.dict
Outdated
| fae/Nmg # fantasy alternative for "fey" | ||
| hewn/JT # past participle of "hew" | ||
| Hmong-Mien/NgJ # linguistics | ||
| Iaido/NOg # martial art |
iaido doesn't seem to be a proper noun or need to be capitalized from a quick look around?
harper-core/dictionary.dict
Outdated
| linguæ francæ/9g | ||
| longsword/NSg | ||
| luthier/NSg | ||
| menarche/Ng # medicine |
This seems to be a mass noun /m
harper-core/dictionary.dict
Outdated
| Nilotic/NgJ # linguistics | ||
| offeror/NSg | ||
| orthopractic/J | ||
| orthopraxy/NSg |
orthopraxy seems to have mass noun and countable senses so /w
harper-core/dictionary.dict
Outdated
| polysynthetic/J # linguistics | ||
| praxis/Ng | ||
| proprioception/Ng # medicine | ||
| queenship/Ng |
Treating queenship as both mass and count seems best /w
harper-core/dictionary.dict
Outdated
| shortsword/NSg | ||
| Sino-Tibetan/NgJ # linguistics | ||
| sociolect/NSg | ||
| sortition/Ng # Political science: a method of appointment to office by random draw |
Looks like a /m mass noun to me. Abstract nouns are generally mass nouns.
harper-core/dictionary.dict
Outdated
| sociolect/NSg | ||
| sortition/Ng # Political science: a method of appointment to office by random draw | ||
| subvocalization/Nwg # phonetics | ||
| synaesthesia/Ng |
harper-core/dictionary.dict
Outdated
| synchronic/JQ # linguistics | ||
| syncretism/Ng | ||
| synesthesia/Ng | ||
| synæsthesia/Ng |
These three are also mass nouns.
harper-core/dictionary.dict
Outdated
| synesthesia/Ng | ||
| synæsthesia/Ng | ||
| trimeter/N # poetry | ||
| trimetre/N!@_₹ # poetry |
These seem to have plurals. Nouns that don't end in s but don't have a possessive are pretty rare so I'd add /gS to them.
Thank you kindly. I see the convention a bit better now. I particularly hadn't put together that "mass nouns" and "non-count nouns" are equivalent.
hippietrail
left a comment
Did you miss one?
Everything else looks spot on.
harper-core/dictionary.dict
Outdated
| polysynthetic/J # linguistics | ||
| praxis/Ng | ||
| praxis/Nmg | ||
| proprioception/Ng # medicine |
Did you skip proprioception by accident?
Yes, they're not really natural concepts either way unless you're into grammar or linguistics, or have learned or taught a foreign language. And terminology can vary. I try to mix up the terms I use to cover all bases.
hippietrail
left a comment
I hope my explanations are not confusing. Let me know.
harper-core/dictionary.dict
Outdated
| Indo-Aryan/N0gJ # linguistics | ||
| Indo-European/N0gJ # linguistics | ||
| Indo-Iranian/N0gJ # linguistics | ||
| Japonic/N0gJ # linguistics |
I wouldn't put /0 for these. That means it's singular, which implies it has a countable sense. But it's easy to test that language names are mass nouns in their primary sense because they don't need an indefinite article:
- Do you drive car? ❌
- Do you drive a car? ✅
- Do you eat meat? ✅
- Do you eat a meat? ❌
- Do you study Japonic? ✅
- Do you study a Japonic? ❌
In fact /0 only exists because I wanted singular countable to be the default since the original dictionary format didn't properly account for them as it was designed for a spell checker dictionary.
But defaults get overridden if other properties are set. So I made /9 for words which are specifically plural but then if a word was both singular and plural like "biceps" or "sheep", setting plural would override the default singular. So I made /0 to allow overriding it back again.
So technically /0 is only really needed for words which also need /9 but when I went through marking up all the irregular plurals I found it helpful to mark the singular forms of those with /0 for clarity to see that they go with a nearby word with /9 even though the /0 is not technically required.
Let me know if that doesn't make sense and I'll try to find another way to word it (-:
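The defaulting described above can be sketched as a toy model. This is only an illustration of the explanation in this comment, not Harper's actual parser: the function name `noun_number` is hypothetical, the `sheep/N09` entry is made up for the example, and treating /m as overriding the singular default is an assumption.

```python
def noun_number(entry: str) -> set[str]:
    """Toy model of the number flags discussed in this review (hypothetical helper)."""
    # Drop any trailing "# comment", then keep only the flags after the slash.
    entry = entry.split("#", 1)[0].strip()
    _word, _, flags = entry.partition("/")

    senses = set()
    if "9" in flags:
        senses.add("plural")
    if "m" in flags or "w" in flags:
        senses.add("mass")
    # Singular countable is the default. Assumption: /9 and /m override it,
    # while /0 (explicit singular) and /w (mass and count) bring it back.
    if "0" in flags or "w" in flags or not ({"9", "m"} & set(flags)):
        senses.add("singular")
    return senses


print(noun_number("luthier/NSg"))      # default applies: singular only
print(noun_number("epentheses/N9"))    # plural overrides the default
print(noun_number("sheep/N09"))        # /0 restores singular next to /9
print(noun_number("epenthesis/N0w"))   # mass and singular countable
```

Under this model /0 only changes the result when /9 is also present, which matches the point that it is otherwise there for clarity.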
Thank you very much and sorry for rushing unnecessarily.
| linguae francae/9g | ||
| linguæ francæ/9g | ||
| linguae francae/N9g | ||
| linguæ francæ/N9g |
These are irregular plurals that I invented /9 for so using /0 on lingua franca above would be consistent with what I was talking about in the other comment, unlike the ones on the names of the language families.
Left S on lingua franca because the plural "linguas francas" is attested. Thank you for your patience.
> Left S on lingua franca because the plural "linguas francas" is attested. Thank you for your patience.
/S means "generate a regular plural form for this entry" which will result in "lingua francas" instead of "linguas francas" because Harper only has a few rules for variations of plural endings. You can see them in annotations.json. Native English two-word terms still mostly just pluralize the last word. Anything more exotic needs a manually crafted plural entry.
You can test the affixing annotations on the command line like this:
just getforms lingua franca
Sorry I didn't spot this in my previous reviews.
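To illustrate why /S yields "lingua francas", here is a minimal sketch of a regular pluralizer that only touches the last word of an entry. It is a stand-in for illustration: the function name `regular_plural` and its three ending rules are assumptions, and Harper's real rules are the ones in annotations.json.

```python
def regular_plural(entry: str) -> str:
    """Pluralize only the final word of a (possibly multi-word) entry."""
    head, _, last = entry.rpartition(" ")
    if last.endswith(("s", "x", "z", "ch", "sh")):
        last += "es"                      # e.g. praxis -> praxises
    elif last.endswith("y") and last[-2:-1] not in "aeiou":
        last = last[:-1] + "ies"          # consonant + y -> ies
    else:
        last += "s"                       # the plain case
    return f"{head} {last}" if head else last


print(regular_plural("lingua franca"))    # only "franca" is pluralized
print(regular_plural("polearm"))
```

A rule like this can never produce "linguas francas" or "linguae francae", which is why those need their own manually written entries.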
Whoops, failure in my explanation as Spanish thinking crept in. The attested plural is indeed "lingua francas" (see the second paragraph on the relevant Wikipedia article) as generated by
Huh, OK. I know I've looked into that before but I can't remember. I think you're right. I also probably recognize "linguas francas" from Spanish. Yep, "lingua francas" counts, so
Description
This was effectively a sweep of my Obsidian vaults, particularly my worldbuilding vault where I engage in some philosophical analysis. I added words I felt were common enough to merit going in the dictionary. I'm still not sure whether the linguistics terms are widely used enough. Happy to make any amendments.
How Has This Been Tested?
Run cargo test and follow instructions (so far)