I was trying to use the persistence code to identify when a specific URL was added to an article, but ran into an issue with the wikitext_split function breaking up URLs:
>>> from mw.lib.persistence.tokenization import wikitext_split
>>> wikitext_split('Something blah blah http://foobar.com')
['Something', ' ', 'blah', ' ', 'blah', ' ', 'http', ':', '/', '/', 'foobar', '.', 'com']
It would be nice if URLs were special-cased and kept together as a single token.
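
To illustrate the behaviour I have in mind, here is a rough standalone sketch. It does not use or reproduce the library's actual tokenizer: `TOKEN_RE` and `split_keeping_urls` are hypothetical names, and the URL pattern is a simplistic assumption (a scheme followed by `://` and then anything up to whitespace or a few wikitext delimiters). The idea is just to try the URL alternative before the generic word and punctuation rules.

```python
import re

# Hypothetical pattern, NOT the one used by mw.lib.persistence.
# The URL alternative is listed first so it wins over the generic rules.
TOKEN_RE = re.compile(
    r"[a-zA-Z][a-zA-Z0-9+.-]*://[^\s\[\]<>\"|{}]+"  # scheme://... kept whole
    r"|\w+"   # run of word characters
    r"|\s+"   # run of whitespace
    r"|.",    # any other single character (punctuation, symbols)
    re.DOTALL,
)


def split_keeping_urls(text):
    """Split text into tokens, keeping anything URL-shaped as one token."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(split_keeping_urls("Something blah blah http://foobar.com"))
```

With this sketch the example above should come back with 'http://foobar.com' as the final token instead of seven separate pieces. One known limitation: punctuation directly after a URL (e.g. a sentence-ending period) gets absorbed into the URL token, so a real fix would probably need a stricter pattern.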