bootstrap

- README
- bootstrap.hs
- build.hs
- images.hs
- mktranslit.hs
- names.py
- panlex.sql
- semcor.py
- spreadsheet.py
- taxonomy.hs
- translit.txt
- wn30map31.txt

README

how to bootstrap a new concrete WordNet grammar

If your target language has a Translate grammar but no WordNet, use the migrate.hs script to bootstrap it, and you are done. If your language has both, it is better to bootstrap the new concrete using WordNet. Otherwise, keep reading.

Our end goal is to map GF abstract fun names to lemmas in the target language. We match them through their WordNet synsets, so this method only suits languages that have a (hopefully decently sized) WordNet available in the format used by the Open Multilingual Wordnet (OMW). The conversion is done by a script that uses standard terminal utilities (GNU versions). If you are on macOS, you can install GNU coreutils or spin up a Docker container running Ubuntu.
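The matching idea can be sketched in a few lines of Python. This is only an illustration, not the repository's conversion script: it assumes a hypothetical list of (abstract function name, synset ID) pairs and OMW-style rows of (synset ID, relation, lemma). The function name join_by_synset and all sample values are made up for the example.

```python
from collections import defaultdict

def join_by_synset(fun_synsets, omw_rows):
    """Map each abstract function name to the target-language lemmas of its synset."""
    lemmas_by_synset = defaultdict(list)
    for synset, relation, lemma in omw_rows:
        # Keep only lemma rows; OMW-style files may also carry definitions or examples.
        if relation.endswith(":lemma"):
            lemmas_by_synset[synset].append(lemma)
    # Functions whose synset has no lemma in the target WordNet get an empty list.
    return {fun: lemmas_by_synset.get(synset, []) for fun, synset in fun_synsets}

# Illustrative sample data (assumed shapes, not taken from the repository).
fun_synsets = [("apple_1_N", "07739125-n"), ("run_1_V", "01926311-v"), ("zebra_1_N", "02391049-n")]
omw_rows = [
    ("07739125-n", "spa:lemma", "manzana"),
    ("07739125-n", "spa:def", "fruto del manzano"),
    ("01926311-v", "spa:lemma", "correr"),
]

result = join_by_synset(fun_synsets, omw_rows)
for fun, lemmas in result.items():
    print(fun, lemmas)
```

Functions that end up with an empty list show where the target WordNet has no coverage and need another source.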

obtain translation predictions

Not all words in an English synset are equally good translations for a given word in your target language’s synset. Krasimir Angelov developed a heuristic algorithm that automatically decides which translation pairs look like good matches. You can find the details in this paper and run the algorithm by following the built-in instructions; this should yield a predictions.tsv file, which you must place in this repository.

abstract syntax names and lemmas map

Call

bash make-abstract-lemmas-map.bash

from this repository to create a file named fun-lemmas.tsv that maps abstract syntax names to their corresponding lemmas in your target language.

building WordNet

You now have several options to obtain your target language’s concrete WordNet module. If your target language already has a large dictionary module in GF, you can map its linearizations to the abstract syntax names by matching lemmas. Another option is to apply smart paradigms to these lemmas to obtain tentative linearizations. Ideally you’ll have a large-scale morphological resource against which you can then check the GF linearizations.

MakeDictFromLemmas.hs can build a rudimentary WordNet concrete by using the simplest smart paradigms. The result will require extensive manual revision afterwards. It takes two file paths as arguments: the first is a file in the same format as fun-lemmas.tsv, and the second is the name of the output file. You can customize the output language name and the functions to be applied (to some extent).
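As a rough picture of what "applying the simplest smart paradigms" means, here is a minimal Python sketch. It is not a port of MakeDictFromLemmas.hs. It assumes that the category is the last underscore-separated part of the function name and that the target language's Paradigms module offers one-argument mkN, mkV, mkA and mkAdv. The helper names lin_line and make_dict and the sample rows are hypothetical.

```python
# Assumed mapping from category suffix to a one-argument smart paradigm.
PARADIGMS = {"N": "mkN", "V": "mkV", "A": "mkA", "Adv": "mkAdv"}

def lin_line(fun, lemma):
    """Build one tentative GF lin rule from a function name and a lemma."""
    category = fun.rsplit("_", 1)[-1]
    paradigm = PARADIGMS.get(category)
    if paradigm is None:
        # Emit a GF comment so the gap is visible during manual revision.
        return f"-- TODO {fun}: no simple paradigm for category {category}"
    # Escape backslashes and double quotes for the GF string literal.
    escaped = lemma.replace("\\", "\\\\").replace('"', '\\"')
    return f'lin {fun} = {paradigm} "{escaped}" ;'

def make_dict(rows):
    """Turn (function name, lemma) pairs into a list of lin rules."""
    return [lin_line(fun, lemma) for fun, lemma in rows]

# Illustrative rows in the assumed two-column shape of fun-lemmas.tsv.
rows = [("apple_1_N", "manzana"), ("run_1_V", "correr"), ("of_Prep", "de")]
lines = make_dict(rows)
for line in lines:
    print(line)
```

Categories that take more than a lemma (for example verbs with complements) are left as TODO comments here, which is one reason the generated module needs manual revision.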

merge old checked WordNet module with new machine-generated one

Let’s say your WordNet got updated and you would like to carry over the updates without losing the linearizations you have already checked. You can do this by calling

bash merge-wordnet-modules.bash OLD_MODULE NEW_MODULE > WordNetXXX.gf

You’ll have to add the headers manually, though.
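The merge can be pictured with the following Python sketch. It is an assumption about the intended behaviour, not a description of merge-wordnet-modules.bash: it assumes every rule sits on one line of the form "lin NAME = ... ;", follows the function list of the new module, and prefers the old (checked) line whenever the same function exists there. The names read_lins and merge are hypothetical.

```python
import re

# Matches single-line rules such as: lin apple_1_N = mkN "manzana" ;
LIN_RE = re.compile(r"^\s*lin\s+(\S+)\s*=")

def read_lins(lines):
    """Return {function name: full lin line} for single-line lin rules."""
    lins = {}
    for line in lines:
        match = LIN_RE.match(line)
        if match:
            lins[match.group(1)] = line.rstrip("\n")
    return lins

def merge(old_lines, new_lines):
    """Follow the new module's functions, keeping old checked lines where available."""
    old = read_lins(old_lines)
    new = read_lins(new_lines)
    return [old.get(fun, line) for fun, line in new.items()]

# Illustrative module bodies (headers omitted, as in the script's output).
old_module = ['lin apple_1_N = mkN "manzana" ; -- checked']
new_module = ['lin apple_1_N = mkN "manzano" ;', 'lin pear_1_N = mkN "pera" ;']

merged = merge(old_module, new_module)
for line in merged:
    print(line)
```

Under these assumptions, functions that exist only in the old module are dropped; keeping them instead would be an equally reasonable design choice.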
