This repository was archived by the owner on Oct 22, 2018. It is now read-only.
Refactoring language processing #22
Open
sepulchered wants to merge 5 commits into
for line in text_lines:
    tagged = []
    line = line.strip()
    for surface, pos in pt.tag(line):
Author
Maybe use the tag_query method here?
Member
A tag_query method sounds good. But please note that the status of this code is unclear at the moment (sorry for not clarifying this in the code itself). We have moved to a system that doesn't need any prior linguistic processing. This is faster, and in some sense more robust: a wrong POS tag can wreak havoc in the results. The semantic space can also be smaller if less linguistic info is included. We are waiting to do some more testing before deciding whether the code should be removed entirely. So perhaps don't do too much work on this now :)
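For illustration, the per-line loop in the snippet under review could be folded into one helper along the lines suggested above. This is only a sketch under stated assumptions: `tag_line`, `tag_lines`, `StubTagger` and the `surface_POS` output format are hypothetical names chosen here, and the actual `tag_query` method and its signature are not shown in this conversation.

```python
from typing import Iterable, List, Protocol, Tuple


class Tagger(Protocol):
    """Anything exposing tag(text) -> [(surface, pos), ...], like `pt` in the diff."""

    def tag(self, text: str) -> List[Tuple[str, str]]: ...


class StubTagger:
    """Hypothetical stand-in for the real POS tagger; not the project's tagger."""

    def tag(self, text: str) -> List[Tuple[str, str]]:
        # Labels every whitespace-separated token as a noun, for illustration only.
        return [(token, "NN") for token in text.split()]


def tag_line(tagger: Tagger, line: str) -> List[str]:
    """Strip one line and return its tokens as 'surface_POS' strings (assumed format)."""
    return [f"{surface}_{pos}" for surface, pos in tagger.tag(line.strip())]


def tag_lines(tagger: Tagger, text_lines: Iterable[str]) -> List[List[str]]:
    """Apply tag_line to each line, skipping lines that are empty after stripping."""
    return [tag_line(tagger, line) for line in text_lines if line.strip()]
```

Keeping the tagger behind a small interface like this would also make it easy to drop the step entirely if the POS-free pipeline is adopted.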
moved pos tagger, lemmatizer and textblob tagger script into one file
lang_proc and refactoring for those modules