This project compares deep learning techniques for multilingual text classification, focusing on language detection and classification using FastText and Sentence Transformer embeddings. It provides a dataset and a list of requirements, and highlights the significance of training on a large multilingual corpus for improved performance.

jaywyawhare/Language-Prediction

Language Prediction

Requirements

The project requires the following packages:

  • pandas
  • NumPy
  • scikit-learn
  • langdetect
  • langid
  • fastText
  • Sentence Transformers
  • TensorFlow
  • Matplotlib
  • seaborn
  • Streamlit (used to launch the app)

To install these packages individually, run the following command:

> pip install pandas numpy scikit-learn langdetect langid fasttext sentence-transformers tensorflow matplotlib seaborn streamlit
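
After installing, it can help to confirm that every package is importable before launching the app. The snippet below is a minimal sketch and not part of the repository: the helper name missing_modules and the REQUIRED_MODULES list are illustrative assumptions. Note that some import names differ from the pip names (scikit-learn is imported as sklearn, sentence-transformers as sentence_transformers).

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumed list, not taken
# from the repository). Some differ from the names used with pip.
REQUIRED_MODULES = [
    "pandas",
    "numpy",
    "sklearn",                # pip name: scikit-learn
    "langdetect",
    "langid",
    "fasttext",
    "sentence_transformers",  # pip name: sentence-transformers
    "tensorflow",
    "matplotlib",
    "seaborn",
]


def missing_modules(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks up each module without importing it, so it stays fast even for heavy packages such as TensorFlow.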

To run the project, follow these steps:

  • Clone the repository to your local machine:
> git clone https://github.com/jaywyawhare/Language-Prediction.git
  • Navigate to the project directory:
> cd Language-Prediction
  • Install the required packages from the requirements file:
> pip install -r requirements.txt
  • Launch the Streamlit app:
> streamlit run main.py

The app loads the dataset, preprocesses the data, trains and evaluates the models, and displays the results.
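
The load, preprocess, train, and evaluate flow described above can be illustrated with a small standard-library sketch. This is not the project's code: it swaps in a character-trigram profile classifier with cosine similarity in place of the FastText and Sentence Transformer models, and the toy sentences, function names, and labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def preprocess(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def trigrams(text):
    """Count character trigrams of the preprocessed, space-padded text."""
    padded = f"  {preprocess(text)} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def train(samples):
    """Build one trigram profile per language from (text, label) pairs."""
    profiles = {}
    for text, label in samples:
        profiles.setdefault(label, Counter()).update(trigrams(text))
    return profiles


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(profiles, text):
    """Return the label whose profile is most similar to the text."""
    grams = trigrams(text)
    return max(profiles, key=lambda label: cosine(grams, profiles[label]))


def evaluate(profiles, samples):
    """Return the fraction of (text, label) pairs predicted correctly."""
    if not samples:
        return 0.0
    correct = sum(predict(profiles, text) == label for text, label in samples)
    return correct / len(samples)


if __name__ == "__main__":
    # Toy data standing in for the real dataset (hypothetical examples).
    train_set = [
        ("the quick brown fox jumps over the lazy dog", "en"),
        ("le renard brun saute par-dessus le chien paresseux", "fr"),
        ("der schnelle braune fuchs springt über den faulen hund", "de"),
    ]
    test_set = [
        ("the dog is in the house", "en"),
        ("le chien est dans la maison", "fr"),
        ("der hund ist in dem haus", "de"),
    ]
    profiles = train(train_set)
    print("accuracy:", evaluate(profiles, test_set))
```

Character n-gram profiles are a common lightweight baseline for language detection, which makes them a reasonable reference point when comparing against embedding-based models like the ones this project uses.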
