Integrate tr Wikidata into Unicode Inflection #55

Open
@grhoten

Description

The revised dictionary-parser can now parse the Turkish (tr) Wikidata lexemes, but some issues need to be resolved before the data can be integrated.

The initial issues include:

  • The warnings in the dictionary-parser output need to be addressed.
  • The unit tests need to be fixed.

Tool output that needs to be addressed:

Line 121594: Q149761 is not a known part of speech grammeme for L999115(a)
Line 136793: Q3517796 is not a known grammeme for L1124158(açılmak)
Line 136848: Q3517796 is not a known grammeme for L1124480(çağırmak)
Line 141477: Q6029894 is not a known grammeme for L1162030(yeni)
Line 164510: Q6029894 is not a known grammeme for L1345861(tatlı)
Line 309031: Q6029894 is not a known grammeme for L1122703(acı)
Line 309149: Q6029894 is not a known grammeme for L1124165(ciddî)
Line 309210: Q3517796 is not a known grammeme for L1124688(çatırdamak)
Line 319114: Q6029894 is not a known grammeme for L1205514(aşırı)
Line 320901: Q6029894 is not a known grammeme for L1219692(tâze)
Line 321044: Q6029894 is not a known grammeme for L1220742(akıllı)
Line 430742: Q6029894 is not a known grammeme for L720973(kaba)
Line 480343: Q3517796 is not a known grammeme for L1124487(çaldırmak)
Line 484885: Q6029894 is not a known grammeme for L1160740(geç)
Line 491597: Q6029894 is not a known grammeme for L1215250(iyi)
Line 491864: Q6548647 is not a known part of speech grammeme for L1217600(bile)
Line 491881: Q6029894 is not a known grammeme for L1217735(güzel)
Line 492601: Q6029894 is not a known grammeme for L1224020(yumuşak)
Line 507574: Q1462657 is not a known part of speech grammeme for L1345299(birbiri)
Line 515262: Q6029894 is not a known grammeme for L1408058(pis)
Line 656068: Q6548647 is not a known part of speech grammeme for L1157495(da)
Line 773414: Q56703580 is not a known grammeme for L721332(çay)
Line 822829: Q3517796 is not a known grammeme for L1124474(çabalamak)
Line 822853: Q3517796 is not a known grammeme for L1124676(çarpmak)
Line 834381: Q6029894 is not a known grammeme for L1217737(küçük)
Line 839462: Q79377486 is not a known grammeme for L1258465(o)
Line 850611: Q6029894 is not a known grammeme for L1348637(ıslak)
Line 853408: Q6029894 is not a known grammeme for L1371198(eski)
Line 944698: Q728001 is not a known grammeme for L720966(kas)
Line 994535: Q6029894 is not a known grammeme for L1122668(ak)
Line 994642: Q6029894 is not a known grammeme for L1124131(canlı)
Line 1000365: Q6029894 is not a known grammeme for L1171763(büyük)
Line 1004595: Q6029894 is not a known grammeme for L1205453(aşağı)
Line 1005839: Q6029894 is not a known grammeme for L1215985(doğru)
Line 1005958: Q6029894 is not a known grammeme for L1216917(genç)
Line 1005993: Q6029894 is not a known grammeme for L1217274(kirli)
Line 1008238: Q361669 is not a known part of speech grammeme for L1235210(dır)
Line 1165665: Q6029894 is not a known grammeme for L1122710(açık)
Line 1175433: Q6029894 is not a known grammeme for L1202410(ayrı)
Line 1192877: Q6029894 is not a known grammeme for L1345998(ağır)
Line 1200546: Q6029894 is not a known grammeme for L1407161(bozuk)
Line 1337105: Q6029894 is not a known grammeme for L1122532(yeşil)
Line 1337277: Q3517796 is not a known grammeme for L1124518(çapalamak)
Line 1372257: Q6029894 is not a known grammeme for L1407151(boş)
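
Since most of these warnings repeat the same few Wikidata item IDs, grouping them by QID may make triage easier. The following is a minimal sketch in Python, assuming the log format shown above. The regular expression is inferred from this output rather than from the dictionary-parser source, and the `summarize` function and the `sample` text are hypothetical helpers for illustration only.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Line 136793: Q3517796 is not a known grammeme for L1124158(açılmak)
# The pattern is inferred from the output above, not from the parser's source.
WARNING = re.compile(
    r"Line (?P<line>\d+): (?P<qid>Q\d+) is not a known "
    r"(?P<kind>part of speech grammeme|grammeme) "
    r"for (?P<lexeme>L\d+)\((?P<lemma>[^)]*)\)"
)


def summarize(log_text):
    """Group warnings by (QID, kind), collecting the affected lemmas."""
    groups = defaultdict(list)
    for match in WARNING.finditer(log_text):
        groups[(match["qid"], match["kind"])].append(match["lemma"])
    return dict(groups)


# A few lines copied from the output above; in practice, read the full log.
sample = """\
Line 121594: Q149761 is not a known part of speech grammeme for L999115(a)
Line 136793: Q3517796 is not a known grammeme for L1124158(açılmak)
Line 136848: Q3517796 is not a known grammeme for L1124480(çağırmak)
Line 141477: Q6029894 is not a known grammeme for L1162030(yeni)
"""

# Print the most frequent QIDs first, with up to three example lemmas each.
by_count = sorted(summarize(sample).items(), key=lambda item: -len(item[1]))
for (qid, kind), lemmas in by_count:
    print(f"{qid} ({kind}): {len(lemmas)} lexeme(s), e.g. {', '.join(lemmas[:3])}")
```

Applied to the full output above, this would reduce the 44 lines to nine distinct QIDs, with Q6029894 and Q3517796 accounting for most of them, so mapping those two items is likely to clear the bulk of the warnings.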

Here are the currently generated lexical dictionary files, for debugging the test failures:
tr.zip
