Description
To Do
Programming
- implement IBM Model 1 on train/ folder (Zoe, Tuesday/Wednesday)
- fix ascii-unicode-latin1 encoding issues (Devon, Tuesday night)
- implement caching with pickle in transl_probs.pickle (Devon, Wednesday night)
- multiple iterations
- self.sp_word_indices dictionary instead of binary search
- optimize algorithm ( @luttigdev )
- just reset at each iteration or completely recreate? (applies to multiple data structures)
- lowercase? DOESN'T MATTER
- implement evaluation/run-through for dev/ folder through Bleu ASAP
- Viterbi + nltk (part-of-speech tagging » reordering Sp-Eng verbs, for example)
- didn't help much :(
- NOTE: can't tag single English words because not enough context
- add english language model (single-word probabilities) (Zoe, Thursday)
- bigrams
- translating 2 words ??
- bigrams
- decide new priorities once we get Bleu working
- conjugations in Spanish indicate subject ("Tengo" == "Yo tengo") ... deal with this!
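
For reference, the first few items above (IBM Model 1, multiple iterations, pickle caching) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `train_ibm_model1` and `load_or_train`, the Spanish-to-English direction t(e | s), and the omission of a NULL word are all assumptions. Counts are recreated on every iteration here, which is only one of the two options raised in the list.

```python
import os
import pickle
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(e, s)] = P(e | s) with EM (hypothetical helper).

    pairs: list of (spanish_tokens, english_tokens) sentence pairs.
    """
    # Start from a uniform value over the English vocabulary.
    en_vocab = {e for _, en in pairs for e in en}
    uniform = 1.0 / len(en_vocab)
    t = {(e, s): uniform for sp, en in pairs for s in sp for e in en}

    for _ in range(iterations):
        # E-step: expected counts, recreated from scratch each iteration.
        count = defaultdict(float)
        total = defaultdict(float)
        for sp, en in pairs:
            for e in en:
                norm = sum(t[(e, s)] for s in sp)
                for s in sp:
                    frac = t[(e, s)] / norm
                    count[(e, s)] += frac
                    total[s] += frac
        # M-step: renormalise so the probabilities for each s sum to 1.
        t = {(e, s): c / total[s] for (e, s), c in count.items()}
    return t


def load_or_train(path, pairs, iterations=10):
    """Return cached probabilities from `path`, training only on a cache miss."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    t = train_ibm_model1(pairs, iterations)
    with open(path, "wb") as f:
        pickle.dump(t, f)
    return t
```

The table is a plain dict keyed by `(english_word, spanish_word)` tuples, so it pickles without extra work and lookups are direct hash lookups rather than a binary search.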
Questions
- logs
- COGNATES (not a good idea)
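
On the logs question, assuming it refers to scoring with log probabilities: multiplying many small probabilities can underflow to zero in floating point, while summing their logs stays finite. A small sketch (the function names are made up for illustration):

```python
import math


def product_score(word_probs):
    """Multiply raw probabilities; can underflow to 0.0 on long inputs."""
    score = 1.0
    for p in word_probs:
        score *= p
    return score


def log_score(word_probs):
    """Sum log probabilities; the ranking is the same but no underflow occurs."""
    return sum(math.log(p) for p in word_probs)
```

With a hundred words at probability 1e-5 each, `product_score` collapses to `0.0`, whereas `log_score` still returns a usable negative number for comparing candidates. Zero probabilities would need smoothing or a floor before taking the log.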
Misc
- shouldn't remove commas from translation...
- Shouldn't remove " and similar thingies
- seems to remove spaces around it still
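
One possible way to keep commas and quotes, sketched under the assumption that the punctuation is being lost at tokenization time: treat each punctuation mark as its own token, then reattach it when joining the output. Both function names are hypothetical, and the rejoining step only handles straight double quotes and common closing punctuation.

```python
import re


def tokenize(text):
    """Split into word tokens and single punctuation tokens (nothing dropped)."""
    return re.findall(r"\w+|[^\w\s]", text)


def detokenize(tokens):
    """Join tokens, attaching punctuation to the neighbouring words."""
    text = " ".join(tokens)
    # No space before closing punctuation.
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    # Pair up straight double quotes and trim the spaces just inside each pair.
    text = re.sub(r'"\s*([^"]*?)\s*"', r'"\1"', text)
    return text
```

Spanish opening marks such as ¿ and ¡ come out of `tokenize` as separate tokens as well, but `detokenize` does not reattach them, since it is written with English output in mind.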
Report
- all the things
Useful resources & links