PAVEL was a collection of resources for researchers and academics in the field of linguistics, maintained by the Government of Canada. Following a change in politics, the collection was abandoned and gifted to the Language Technologies Research Centre (LTRC) for preservation.
Since most of these resources were built for an outdated version of a web development framework used by the Government of Canada, the information in this large body of documents and web pages had to be extracted and converted to the TikiWiki format in use at LTRC.
I built PAVELINTR, a pun on the words "PAVEL" and "linter", to convert these resources to a format that could be hosted on the LTRC's infrastructure.
I have since removed the source and output files from this project. The patterns I used may still prove useful to someone facing the same problems.
PAVEL and any other trademarks in these files are the property of their respective owners.
This source code is distributed under the GNU GENERAL PUBLIC LICENSE.