Tokenizer does not correctly split some contractions

The tokenizer does not follow the standard contraction tokenization [0] expected by the Stanford POS tagger. Contractions are not split, but they should be. For example, under that convention "don't" becomes the two tokens "do" and "n't".

Also, the ´ character (acute accent, U+00B4), which is often typed in place of an apostrophe, is not handled.
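
To illustrate the expected behaviour, here is a minimal sketch in Python. It is not this project's tokenizer: the names `split_contractions`, `normalize_apostrophes` and `APOSTROPHE_VARIANTS` are hypothetical, and the set of apostrophe look-alikes is an assumption. It only covers the common clitics (n't, 's, 'm, 'd, 'll, 're, 've) and does not separate punctuation.

```python
import re

# Characters sometimes typed in place of an apostrophe.
# Assumed set for this sketch: acute accent, right single quote, backtick.
APOSTROPHE_VARIANTS = "\u00b4\u2019`"

# Treebank-style splitting: "n't" is split off before the "n",
# the other clitics are split off before the apostrophe.
_NT = re.compile(r"(?i)(\w)(n't)\b")
_CLITIC = re.compile(r"(?i)(\w)('(?:s|m|d|ll|re|ve))\b")


def normalize_apostrophes(text):
    """Map apostrophe look-alikes to the plain ASCII apostrophe."""
    return re.sub("[" + APOSTROPHE_VARIANTS + "]", "'", text)


def split_contractions(text):
    """Return whitespace-separated tokens with contractions split."""
    text = normalize_apostrophes(text)
    text = _NT.sub(r"\1 \2", text)
    text = _CLITIC.sub(r"\1 \2", text)
    return text.split()


print(split_contractions("I don't think they'll come"))
print(split_contractions("I´m here"))
```

With this sketch, "don't" yields "do" and "n't", "they'll" yields "they" and "'ll", and "I´m" is first normalized to "I'm" and then split into "I" and "'m".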

[0] http://www.cis.upenn.edu/~treebank/tokenization.html