Thank you for using NLP-Email-Categorizer! We’re dedicated to ensuring you have a great experience with this text classification pipeline. This document outlines how to seek support, report issues, request features, and find additional resources.
If you encounter issues or have questions about NLP-Email-Categorizer, follow these steps:
- Check the Documentation:
  - Review the README for installation, usage, and notebook details.
  - Ensure your environment meets the System Requirements.
- Search Existing Issues:
  - Visit the Issues page to see if your question or issue has been addressed.
- Explore FAQs:
  - Check the FAQs section below for solutions to common problems.
- Ask the Community:
  - Post your question in the GitHub Discussions section for community support.
- Contact the Maintainer:
  - For private or urgent matters, reach out via a private issue or discussion on the GitHub repository.
If you find a bug in NLP-Email-Categorizer:
- Verify the Issue:
  - Ensure you’re using the latest version from the repository.
  - Reproduce the issue with a sample dataset in both notebooks.
- Submit a Bug Report:
  - Open a new issue on the Issues page.
  - Use the bug report template and include:
    - A clear title and description.
    - Steps to reproduce the bug (e.g., specific cell, dataset).
    - Expected vs. actual behavior.
    - Screenshots, error logs, or stack traces.
    - Your environment (e.g., Python version, OS, Jupyter version).
- Follow Up:
  - Respond to any questions or requests for clarification from maintainers.
  - Test any proposed fixes if requested.
Example:
- Title: "Augmentation Fails with Non-English Text"
- Steps: Run augmentation cell with “réunion importante”, observe error.
- Expected: Synonyms generated or skipped gracefully.
- Actual: KeyError in WordNet lookup.
- Environment: Python 3.9, Google Colab.
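To fill in the environment line of a bug report, a small helper like the one below may save some typing. It is a minimal sketch that uses only the standard library; the function name `collect_environment` and the default package list are assumptions for illustration, so adjust the list to whatever your notebooks actually import.

```python
import platform
import sys
from importlib import metadata


def collect_environment(
    packages=("notebook", "pandas", "scikit-learn", "nltk", "ipywidgets"),
):
    """Return a dict describing the Python runtime and selected package versions."""
    info = {
        "python": sys.version.split()[0],
        "os": f"{platform.system()} {platform.release()}",
    }
    for name in packages:
        try:
            # Looks up the installed distribution by its pip name.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Run it in a notebook cell or as a script and paste the printed lines into the Environment field of the report.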
Have an idea to improve NLP-Email-Categorizer? We’d love to hear it!
- Check for Duplicates:
  - Search the Issues page to ensure your feature hasn’t been suggested.
- Submit a Feature Request:
  - Open a new issue using the feature request template.
  - Provide:
    - A clear title and detailed description.
    - The problem the feature solves or the benefit it provides.
    - Any examples, code snippets, or references to similar functionality.
- Engage with Feedback:
  - Discuss your idea with maintainers and the community.
  - Be open to refining the proposal based on feedback.
Example:
- Title: "Add Support for Lemmatization"
- Description: Include NLTK lemmatization in preprocessing to normalize words.
- Benefit: Improves feature extraction by reducing word variations.
Q: Why does the notebook fail to load my dataset?
A: Ensure your dataset is in the correct format (CSV for the advanced notebook, TSV for the simplified one) and has Subject and Category columns. Check the file encoding (UTF-8 recommended) and the file path. Example: pd.read_csv('data/email_dataset.csv'); for a TSV file, add sep='\t'.
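The checks above (path, encoding, required columns) can be run before opening the notebook. The sketch below uses the standard-library `csv` module rather than pandas so it works without extra installs; `check_dataset` is a hypothetical helper, not part of the project.

```python
import csv


def check_dataset(path, delimiter=",", required=("Subject", "Category")):
    """Return a list of human-readable problems found in the dataset file."""
    problems = []
    try:
        # newline="" is the mode the csv module expects for files it reads.
        with open(path, encoding="utf-8", newline="") as handle:
            reader = csv.DictReader(handle, delimiter=delimiter)
            header = reader.fieldnames or []
            missing = [column for column in required if column not in header]
            if missing:
                problems.append(f"missing columns: {', '.join(missing)}")
            row_count = sum(1 for _ in reader)
            if row_count == 0:
                problems.append("no data rows")
    except FileNotFoundError:
        problems.append(f"file not found: {path}")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
    return problems
```

An empty list means none of these problems were found. For the TSV variant, call it as `check_dataset(path, delimiter="\t")`.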
Q: How do I create a sample dataset?
A: Create a CSV file like:
Subject,Category
"Meeting at 10 AM",Meeting
"Free gift card!",Spam
Save it as email_dataset.csv in the project directory and update the file_name parameter.
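If you would rather generate the sample file than type it by hand, something like the following should work. It is a sketch using the standard-library `csv` module; `write_sample_dataset` and `SAMPLE_ROWS` are illustrative names, and the rows are the two from the example above.

```python
import csv

# Sample rows taken from the FAQ example; replace them with your own data.
SAMPLE_ROWS = [
    ("Meeting at 10 AM", "Meeting"),
    ("Free gift card!", "Spam"),
]


def write_sample_dataset(path="email_dataset.csv", rows=SAMPLE_ROWS):
    """Write a header plus the given (subject, category) rows as UTF-8 CSV."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Subject", "Category"])
        writer.writerows(rows)
    return path
```

The writer only quotes fields that need it (for example, subjects containing commas), which is still valid CSV.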
Q: Why is the GUI not displaying in Jupyter?
A: Ensure ipywidgets is installed (pip install ipywidgets). On the classic Jupyter Notebook (versions before 7), you may also need to enable the widget extension (jupyter nbextension enable --py widgetsnbextension). In Colab, widgets may require additional setup (!pip install ipywidgets).
Q: Why is model accuracy low?
A: Low accuracy may result from an imbalanced or small dataset, poor preprocessing, or inappropriate features. Try a larger dataset, adjust CountVectorizer parameters (e.g., ngram_range), or use the hyperparameter tuning in the advanced notebook.
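Before tuning the vectorizer, it can help to confirm whether the dataset is actually small or imbalanced. The helper below is a minimal sketch using only the standard library; `class_balance` is a hypothetical name, and the labels would typically come from your Category column.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class counts and the ratio of the largest to the smallest class."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels provided")
    largest = max(counts.values())
    smallest = min(counts.values())
    return dict(counts), largest / smallest
```

A ratio close to 1 indicates evenly sized classes. A much larger ratio, or very low counts for any class, suggests collecting more examples before adjusting model parameters.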
Q: Can I use TF-IDF instead of CountVectorizer?
A: Yes! Replace CountVectorizer with TfidfVectorizer from sklearn.feature_extraction.text. If you make this change, consider contributing it so the project supports both options. Example:
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer()
Q: Why does augmentation produce no synonyms?
A: WordNet may lack synonyms for some words, especially domain-specific terms. Ensure nltk.download('wordnet') was run, and test with common words (e.g., “meeting”). Enhance augmentation by adding fallback logic or alternative methods.
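One possible shape for the fallback logic mentioned above is sketched here. It is not the project's augmentation code: `augment` and `wordnet_synonyms` are hypothetical names, and replacing each word with its first synonym is a simplifying assumption. The WordNet lookup is imported lazily, so the fallback can be tried with any lookup function even when NLTK is not installed.

```python
def wordnet_synonyms(word):
    """Look up synonyms via NLTK WordNet (needs nltk and nltk.download('wordnet'))."""
    from nltk.corpus import wordnet  # lazy import: only needed for this lookup

    names = {
        lemma.name().replace("_", " ")
        for synset in wordnet.synsets(word)
        for lemma in synset.lemmas()
    }
    names.discard(word)
    return sorted(names)


def augment(text, lookup=wordnet_synonyms):
    """Replace each word with its first synonym, keeping the word if none is found."""
    augmented = []
    for word in text.split():
        try:
            synonyms = lookup(word)
        except LookupError:
            # LookupError covers KeyError and the error NLTK raises
            # when a corpus has not been downloaded.
            synonyms = []
        # Fallback: leave the original word untouched.
        augmented.append(synonyms[0] if synonyms else word)
    return " ".join(augmented)
```

With this structure, words that WordNet does not cover (domain-specific terms, non-English text) pass through unchanged instead of raising an error.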
Join the NLP-Email-Categorizer community to connect with other users and the maintainer:
- GitHub Discussions: Ask questions, share ideas, or discuss features in the Discussions section.
- Issues Page: Report bugs or request features on the Issues page.
- Maintainer Contact: For direct support, create a private issue or discussion on the GitHub repository.
We aim to respond to questions and issues within 48 hours, though community responses may be faster.
NLP-Email-Categorizer is free and open-source, but your support helps maintain and improve it! Here’s how you can contribute:
- Contribute: Help improve the code, documentation, or community by following the Contributing Guidelines.
- Star the Repository: Show your support by starring the project on GitHub.
- Spread the Word: Share the project with data scientists, NLP enthusiasts, or on social media.
- Provide Feedback: Report bugs or suggest features to enhance the pipeline.
Thank you for your support and for using NLP-Email-Categorizer!