| Repo | Description | Status | Note |
|---|---|---|---|
| | Python bindings to Marian C++ | Complete ✅ | |
| | Facilitating human evaluation of chatbots | Complete ✅ | |
| | Distilling strong reference-based metrics into stronger reference-less metrics | Complete ✅ | |
| | A streaming approach to machine translation training | Complete ✅ | |
| | Many-to-English (v2) | Complete ✅ | |
| | Transformer ablation, showing that the model can work without an encoder | Complete ✅ | |
| | Parallel sentence alignment from Universal Declaration of Human Rights corpus | WIP/Incomplete ◒ | |
| | Done | Complete ✅ | |
| | Macro sampling in BERT | Didn't work ❌ | Maybe we should revisit |
| | Imbalanced machine learning: case studies in image recognition, text classification, and machine translation | Incomplete ◒ | |
| | A theory on hyperparameter | Incomplete ◒ | Book idea! Needs more time. 🕙 |
| | A survey of NMT toolkits | Incomplete ◒ | Lost interest |
| | Macro-averaged evaluation for automatic speech recognition | Incomplete ◒ | (Some positive results, but needs more evidence) |
| | Macro Average: Rare Types are Important Too | Complete ✅ | |
| | Many-to-English machine translation tools, data, and pretrained models | Complete ✅ | |
| | Finding the optimal vocabulary size for neural machine translation | Complete ✅ | |
| | Neural machine translation with imbalanced classes | Complete ✅ | Rejected from *ACL; arXiv link |
| | NMT learning curve revisited | Complete ✅ | Not published |
| | An Approach for Automatic and Large Scale Image Forensics | Complete ✅ | |
| | Clustering webpages based on structure and style similarity | Complete ✅ | |





