docs/index.Rmd
Lines changed: 43 additions & 49 deletions
Original file line number
Diff line number
Diff line change
@@ -37,13 +37,13 @@ NOM = nominative, GEN = nominative, DAT = nominative, ACC = accusative, VOC = ac
37
37
38
38
Besides the obvious errors, this list contains more problems that I would like to point out:
39
39
40
-
* the lack of alphabetic order;
41
-
* some abbreviations used in the article (`r add_gloss("SBJV")`, `r add_gloss("IMP")`) are absent in the list.
40
+
- the lack of alphabetic order;
41
+
- some abbreviations used in the article (`r add_gloss("SBJV")`, `r add_gloss("IMP")`) are absent in the list.
42
42
43
43
The main goal of the `lingglosses` R package is to provide an option for creating:
44
44
45
-
* interlinear glossed linguistic glosses for an `.html` output of `rmarkdown`[@xie18][^latex];
46
-
* a semi-automatically compiled list of glosses.
45
+
- interlinear glossed linguistic glosses for an `.html` output of `rmarkdown`[@xie18][^latex];
46
+
- a semi-automatically compiled list of glosses.
47
47
48
48
[^latex]: If you want to render a `.pdf` version, you can either use LaTeX with one of the linguistic packages developed for it (see e.g. [`gb4e`](https://www.ctan.org/pkg/gb4e), [`langsci`](https://www.ctan.org/pkg/langsci), [`expex`](https://www.ctan.org/pkg/expex), [`philex`](https://www.ctan.org/pkg/philex)), or render `.html` first and convert it to `.pdf` afterwards.
49
49
@@ -74,13 +74,13 @@ You can go through the examples in this tutorial or you can create a lingglosses
74
74
75
75
The main function of the `lingglosses` package is `gloss_example()`. This function has the following arguments:
76
76
77
-
*`transliteration`;
78
-
*`glosses`;
79
-
*`free_translation`;
80
-
*`comment`;
81
-
*`grammaticality`;
82
-
*`annotation`[^orth];
83
-
*`line_length`.
77
+
-`transliteration`;
78
+
-`glosses`;
79
+
-`free_translation`;
80
+
-`comment`;
81
+
-`grammaticality`;
82
+
-`annotation`[^orth];
83
+
-`line_length`.
84
84
85
85
[^orth]: I used `annotation` for representing orthography, but it is also possible to use this tier for the annotation of words, like here:
* the transliteration line is italic by default (if you do not want it, just add the argument `italic_transliteration = FALSE`)[^ital-all];
107
-
* you can use standard markdown syntax (e.g. `**a**` for **bold**);
108
-
* the free translation line is automatically framed with quotation marks.
106
+
- the transliteration line is italic by default (if you do not want it, just add the argument `italic_transliteration = FALSE`)[^ital-all];
107
+
- you can use standard markdown syntax (e.g. `**a**` for **bold**);
108
+
- the free translation line is automatically framed with quotation marks.
109
109
110
110
[^ital-all]: Sometimes it makes sense to set this option once for the whole document using the following code: `options("lingglosses.italic_transliteration" = FALSE)`.
111
111
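The arguments and the `[^ital-all]` footnote in this hunk can be illustrated with a short sketch. It is not taken from the tutorial: the Latin sentence and its glosses are only an illustration, and the code assumes that `lingglosses` is installed and that the chunk is evaluated while knitting an R Markdown document to `.html`. The guard on the `knitr.in.progress` option is an added precaution so that the snippet does nothing when run outside of knitting.

```r
# Illustrative data (not from the tutorial): one gloss per word,
# morphemes separated with `-`.
translit <- "puella ros-am vid-et"
glosses  <- "girl.NOM rose-ACC see-3SG"
free     <- "The girl sees the rose."

# From the [^ital-all] footnote: switch off italic transliteration
# once for the whole document.
options("lingglosses.italic_transliteration" = FALSE)

# Render the example only while knitting and only if the package is available.
if (isTRUE(getOption("knitr.in.progress")) &&
    requireNamespace("lingglosses", quietly = TRUE)) {
  lingglosses::gloss_example(
    transliteration  = translit,
    glosses          = glosses,
    free_translation = free
  )
}
```

Setting the option globally and passing `italic_transliteration = FALSE` to a single call are alternatives; the global option is the more convenient choice when every example in the document should look the same.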
@@ -145,8 +145,8 @@ With the names settled you can reference the example (@my_ex) in the text using
145
145
146
146
So this kind of example referencing can be used with `lingglosses` examples like in (@lingglosses1) and (@lingglosses2). The only important details are:
147
147
148
-
* change your code chunk argument to `echo = FALSE` (or set it for all code chunks with the following command at the beginning of the document: `knitr::opts_chunk$set(echo = FALSE)`);
149
-
* do not put an empty line between the reference line (with `(@...)`) and the code chunk with `lingglosses` code.
148
+
- change your code chunk argument to `echo = FALSE` (or set it for all code chunks with the following command at the beginning of the document: `knitr::opts_chunk$set(echo = FALSE)`);
149
+
- do not put an empty line between the reference line (with `(@...)`) and the code chunk with `lingglosses` code.
150
150
151
151
(@lingglosses1)
152
152
```{r, echo = FALSE}
@@ -358,17 +358,11 @@ It is really important that one should not treat the results of the `make_gloss_
358
358
359
359
# Other output formats
360
360
361
-
Right now there is no direct way of knitting `lingglosses` to the `.docx` format. You can knit by adding the argument `always_allow_html: true` to your YAML header; however, the result will not be ideal. You can work around this by copying and pasting from the `.html` version:
362
-
363
-
```{r, echo = FALSE}
364
-
knitr::include_graphics("for_word_users.gif")
365
-
```
366
-
367
361
Knitting to both `.pdf` and `.docx` outputs is possible, but there are some known restrictions:
368
362
369
-
* markdown bold and italic annotations do not work;
370
-
* example numbers appear above the example;
371
-
* there is no non-breaking space in the list of glosses.
363
+
- markdown bold and italic annotations do not work;
364
+
- example numbers appear above the example;
365
+
- there is no non-breaking space in the list of glosses.
372
366
373
367
So if you want to avoid these problems, the best solution is to use one of the LaTeX glossing packages listed in the first footnote together with the [`glossaries`](https://www.ctan.org/pkg/glossaries) package for automatic compilation of glosses.
374
368
@@ -386,22 +380,22 @@ Most definitions are too general on purpose: `r add_gloss("ASC")`, for example,
386
380
387
381
There are several alternatives to the `lingglosses` infrastructure for interlinear glossed examples that might interest the reader:
388
382
389
-
* multiple packages for glossing in LaTeX:
390
-
*[`gb4e`](https://www.ctan.org/pkg/gb4e),
391
-
*[`langsci`](https://www.ctan.org/pkg/langsci),
392
-
*[`expex`](https://www.ctan.org/pkg/expex),
393
-
*[`philex`](https://www.ctan.org/pkg/philex)
394
-
*[ODIN project](https://odin.linguistlist.org/)[@lewis10] (it looks like this project is no longer active);
- a Python library [`Xigt`](https://github.com/xigt/xigt)[@goodman15];
391
+
-[scription format](https://github.com/digitallinguistics/scription) and [scription2dlx JavaScript library](https://github.com/digitallinguistics/scription2dlx)[@hieber20];
392
+
- a Python library [`pyigt`](https://github.com/cldf/pyigt)[@list21b].
399
393
400
394
Only some of them (`ODIN`, `Xigt`, `scription` and `pyigt`) are attempts at creating a standard for databases of interlinear glossed examples. I would also like to mention the paper by [@round20], whose authors provide a [script](https://github.com/erichround/LREC_IGT/) for the automated identification and parsing of interlinear glossed text from scanned page images. The motivation for creating a cross-linguistic database of interlinear glossed examples is the following:
401
395
402
-
* Prevent linguistic facts from disappearing when projects fail (for example, the field notes of a researcher who did not manage to finish their work: article, dictionary, grammar, etc.);
403
-
* Fight publication bias, which leaves some linguistic facts unpublished because they do not support the author's main idea;
404
-
* Make linguistic work more reproducible and linguistic facts reusable (cf. human genome databases, biodiversity databases or astronomical catalogues).
396
+
- Prevent linguistic facts from disappearing when projects fail (for example, the field notes of a researcher who did not manage to finish their work: article, dictionary, grammar, etc.);
397
+
- Fight publication bias, which leaves some linguistic facts unpublished because they do not support the author's main idea;
398
+
- Make linguistic work more reproducible and linguistic facts reusable (cf. human genome databases, biodiversity databases or astronomical catalogues).
405
399
406
400
The `lingglosses` package attempts to move in this direction and provides the ability to extract examples in a table format that can be further transformed into other formats. Each interlinear glossed example can easily be represented as a table using the `convert_to_df()` function.
This table lists all the parameters that could be useful for a database, and has the following columns:
417
411
418
-
*`id` --- unique identifier through the whole table;
419
-
*`example_id` --- unique identifier of particular examples;
420
-
*`word_id` --- unique identifier of the word in the example (delimited with spaces and other punctuation);
421
-
*`morpheme_id` --- unique identifier of the morpheme within the word (delimited with `-` or `=`);
422
-
*`transliteration` --- language material;
423
-
*`gloss` --- glosses;
424
-
*`delimiter` --- delimiters: space, `-` or `=`
425
-
*`transliteration_orig` --- original string with transliteration;
426
-
*`glosses_orig` --- original string with glosses;
427
-
*`free_translation` --- original string with the free translation;
428
-
*`comment` --- original string with a comment;
412
+
-`id` --- unique identifier through the whole table;
413
+
-`example_id` --- unique identifier of particular examples;
414
+
-`word_id` --- unique identifier of the word in the example (delimited with spaces and other punctuation);
415
+
-`morpheme_id` --- unique identifier of the morpheme within the word (delimited with `-` or `=`);
416
+
-`transliteration` --- language material;
417
+
-`gloss` --- glosses;
418
+
-`delimiter` --- delimiters: space, `-` or `=`
419
+
-`transliteration_orig` --- original string with transliteration;
420
+
-`glosses_orig` --- original string with glosses;
421
+
-`free_translation` --- original string with the free translation;
422
+
-`comment` --- original string with a comment;
429
423
430
424
When you use the `gloss_example()` function, a table of the structure described above is added to the database, so in the end you can extract it by saving the output of the `get_examples_db()` function to a file:
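The diff view ends before the chunk that follows this sentence, so here is a minimal sketch of the step it describes, not the author's code. It assumes that `lingglosses` is installed, that `gloss_example()` has already been called in the same session so that the database is not empty, and that `get_examples_db()` returns a data frame and takes no arguments. The file name is hypothetical.

```r
# Hypothetical output path; any writable location would do.
out_file <- file.path(tempdir(), "glossed_examples.csv")

# Assumption: `lingglosses` is installed and some examples have already been
# glossed with `gloss_example()` in this session.
if (requireNamespace("lingglosses", quietly = TRUE)) {
  examples_db <- lingglosses::get_examples_db()
  # Base R export; `row.names = FALSE` keeps only the database columns.
  write.csv(examples_db, out_file, row.names = FALSE)
}
```

A `.csv` file is only one option: since the result is described as a table, any other export function that accepts a data frame could be used instead.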