Skip to content

apertium/apertium-eng

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

English: apertium-eng

This is an Apertium monolingual language package for English. What you can use this language package for:

  • Morphological analysis of English
  • Morphological generation of English
  • Part-of-speech tagging of English

Requirements

You will need the following software installed:

  • lttoolbox (>= 3.5.0)
  • apertium (>= 3.6.0)
  • vislcg3 (>= 1.3.0)

If this does not make any sense, we recommend you look at: https://apertium.org

Compiling

Given the requirements being installed, you should be able to just run:

$ ./autogen.sh
$ make

If you're doing development, you don't have to install the data, you can use it directly from this directory.

If you are installing this language package as a prerequisite for an Apertium translation pair, then do (typically as root / with sudo):

# make install

You can give a --prefix to ./autogen.sh to install as a non-root user, but make sure to use the same prefix when installing the translation pair and any other language packages.

If any of this doesn't make sense or doesn't work, see https://wiki.apertium.org/wiki/Install_language_data_by_compiling

Testing

If you are in the source directory after running make, the following commands should work:

$ echo "the blue house" | apertium -d . eng-morph ^the/the$ ^blue/blue/blue$ ^house/house/house/house/house$

$ echo "the blue house" | apertium -d . eng-tagger ^the$ ^blue$ ^house$

Tagger model training

To train the tagger model, do one of the following:

Supervised training:

$ make -f tagger.supervised.make

Unsupervised training

$ make -f tagger.unsupervised.make

For details on the corpora used in training, check the corpora information.

For more information, see https://wiki.apertium.org/wiki/Tagger_training

A perceptron tagger model is also included as eng.perceptron.prob. If you want to use it, it needs to be called with -gx: apertium-tagger -gx eng.perceptron.prob

Files and data

For more information

Help and support

If you need help using this language pair or data, you can contact:

  • Mailing list: [email protected]
  • IRC: #apertium on irc.oftc.net (irc://irc.oftc.net/#apertium)

See also the file AUTHORS, included in this distribution.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

About

Apertium linguistic data for English

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 27

Languages