We want a multilingual text classifier that predicts the sentiment polarity of a given text. Build two models that address this task and choose the best one. Please explain in detail each choice you made when building your models and how you chose the best one.
Possible sentiments:
- positive
- negative
- neutral
Data:
- data/train.csv: a training dataset containing 25k multilingual texts annotated with their corresponding sentiment
- data/test.csv: a test dataset containing 2500 multilingual texts
Requirements:
- Code should be written in Python 3
- Code should be easy to run: provide a pip requirements.txt file or a conda environment.yml file listing the code's dependencies
- Code should be documented to explain your methodology, choices, and how to run and get results
- Code should output a file predictions.csv, containing the predictions of your best classifier on the test dataset
Submission:
- Fork the project on GitHub
- Clone your forked repository: https://github.com/YOUR_USERNAME/sentiment-analysis-test
- Commit and push your changes
- Send us a link to your fork once you're done!
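
As an illustration of the "choose the best one" and predictions.csv requirements, here is a minimal sketch using only the standard library. It compares two sets of validation predictions by macro-averaged F1 and writes the winner's predictions to a CSV file. The task does not name a metric or an output schema, so the choice of macro-F1, the `sentiment` column name, and the helper names `macro_f1`, `pick_best`, and `write_predictions` are all assumptions made for illustration.

```python
import csv

# The three sentiments listed in the task.
LABELS = ("positive", "negative", "neutral")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (metric choice is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted examples scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def pick_best(y_true, predictions_by_model):
    """Return (name, score) of the model with the highest validation macro-F1."""
    scored = {
        name: macro_f1(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


def write_predictions(path, predictions, column="sentiment"):
    """Write one prediction per row; the column name is hypothetical."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        writer.writerows([p] for p in predictions)
```

In practice the two prediction lists would come from each model evaluated on a validation split held out from data/train.csv, since data/test.csv is described without annotations. The selected model would then be run on the test texts and its output passed to `write_predictions("predictions.csv", ...)`.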