Description
Please take a look at Pull Request #1, in case you've overlooked it. The idea is that, rather than releasing and maintaining a whole set of Text::Hyphen::XX packages, there would be just the one Text::Hyphen, which could either be updated manually with the desired language files from the CTAN library or fetch them itself, given a language option in new(). At this point I'm not sure whether there are permission issues, across a wide range of platforms, with where the cache or library of patterns and exceptions should go. That is, can an ordinary user trigger an action that adds files to the library in their Perl module collection?
Add: Perhaps Hyphen.pm should have a clearly marked and easily changed "where the cache is" setting, and/or an option to new() naming the directory where it writes (and reads) all the pattern and exception files. This would avoid problems for users who don't have permission to write to certain directories.
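To make the "where the cache is" idea concrete, here is a rough, language-neutral sketch of the lookup order, written in Python only for brevity. It is not Text::Hyphen code. The function name `resolve_cache_dir`, the environment variable `TEXT_HYPHEN_CACHE`, and the per-user default `~/.text-hyphen/patterns` are all made up for illustration.

```python
import os
from pathlib import Path


def resolve_cache_dir(explicit=None, env_var="TEXT_HYPHEN_CACHE"):
    """Pick a writable directory for pattern and exception files.

    Lookup order (all names here are hypothetical):
      1. a directory passed explicitly (the "option to new()" idea),
      2. an environment variable,
      3. a per-user default under the home directory.
    """
    candidates = []
    if explicit:
        candidates.append(Path(explicit))
    if os.environ.get(env_var):
        candidates.append(Path(os.environ[env_var]))
    candidates.append(Path.home() / ".text-hyphen" / "patterns")

    for path in candidates:
        try:
            # Create the directory if needed; skip it if we are not allowed to.
            path.mkdir(parents=True, exist_ok=True)
        except OSError:
            continue
        if os.access(path, os.W_OK):
            return path
    raise PermissionError("no writable cache directory found")
```

Because the default lives under the user's home directory, nothing is ever written into the installed module tree, which sidesteps the permission question. A stricter variant might raise an error when an explicitly requested directory is unusable instead of silently falling through to the next candidate.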
Then there's the issue of how you keep this library updated, should CTAN refresh a file. Certainly one way to do it is just as it's done now: have separate Text::Hyphen::XX packages and use the normal CPAN update mechanism. However, this means that someone will need to take on building and releasing all these packages in the first place, and keeping them all up to date! I suspect this was the impetus for PR 1 in the first place: users wouldn't have to wait for someone to get around to creating a package. By the way, one very useful package would be Latin! How many packages use Lorem Ipsum text for examples and would like a way to hyphenate it properly?
Add: Perhaps I've overlooked something, but there doesn't seem to be a way to subscribe to CTAN so that your local system is told when to update its cache or library of hyphenation data. Maybe the best way would be to periodically run a utility that checks the last-modified dates on the page and pulls down anything that needs an update. On Linux-like systems, at least, this could be a cron job (I don't know about Windows). Worst case, whenever Text::Hyphen is run, it could check the date/time and run the utility for you. Anyway, there are probably not many changes to such packages once they've settled down.
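The "check last-modified dates" step could look roughly like the following, again as a language-neutral Python sketch rather than anything from Text::Hyphen. It assumes the server reports a standard HTTP `Last-Modified` date for each file, which I have not verified for CTAN. The function name `needs_update` is hypothetical, and fetching the header itself (for example with a HEAD request) is left out.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime
from pathlib import Path


def needs_update(local_file, remote_last_modified):
    """Return True if the cached file is missing or older than the remote copy.

    remote_last_modified is an HTTP-date string, such as the value of a
    Last-Modified response header (assumed to be available).
    """
    path = Path(local_file)
    if not path.exists():
        return True  # nothing cached yet, so it has to be fetched
    remote = parsedate_to_datetime(remote_last_modified)
    if remote.tzinfo is None:
        # Treat a date without usable zone information as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    # Compare the remote timestamp with the local file's modification time.
    return remote.timestamp() > path.stat().st_mtime
```

A cron job, or a once-a-day check at startup, would call this for each cached pattern file and download only the ones for which it returns true.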
Anyway, I think it's time to discuss better ways of getting hyphenation support out for a wide range of languages. CTAN appears to have done much of the work already, if we could just read those files directly (and import them easily).