Using a Naive Bayes model and training data, this project classifies sequences of Strings as "spam" or "ham". It relies on simple word-frequency patterns, i.e. words that previously appeared in spam are treated as indicators of spam when they appear again.
Download this doc folder. Open the index file in your browser to view the documentation.
- Adding and removing words with their own spam/ham counts
- No external libraries used (besides Apache CSV Reader)
- Extremely simple abstract class and interface
- Uses a basic, reproducible Naive Bayes formula
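
To illustrate how per-word spam/ham counts can feed a basic Naive Bayes formula, here is a minimal sketch. It is not this project's actual API: the class `NaiveBayesSketch` and its methods `addWord`, `removeWord`, and `spamProbability` are hypothetical names, and the equal priors, add-one (Laplace) smoothing, and skipping of unknown words are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and design choices are not taken from this project.
class NaiveBayesSketch {
    // word -> {spam count, ham count}
    private final Map<String, int[]> counts = new HashMap<>();
    private int spamTotal = 0;
    private int hamTotal = 0;

    // Add a word with its own spam/ham counts (counts accumulate on repeat calls).
    void addWord(String word, int spam, int ham) {
        int[] c = counts.computeIfAbsent(word.toLowerCase(), k -> new int[2]);
        c[0] += spam;
        c[1] += ham;
        spamTotal += spam;
        hamTotal += ham;
    }

    // Remove a word and subtract its counts from the class totals.
    void removeWord(String word) {
        int[] c = counts.remove(word.toLowerCase());
        if (c != null) {
            spamTotal -= c[0];
            hamTotal -= c[1];
        }
    }

    // Posterior P(spam | message), assuming equal priors for spam and ham.
    double spamProbability(String message) {
        int vocabulary = counts.size();
        double logSpam = 0.0;
        double logHam = 0.0;
        for (String token : message.toLowerCase().split("\\W+")) {
            int[] c = counts.get(token);
            if (c == null) {
                continue; // assumption: words never seen in training carry no evidence
            }
            // Add-one smoothing keeps a zero count from zeroing out the product.
            logSpam += Math.log((c[0] + 1.0) / (spamTotal + vocabulary));
            logHam += Math.log((c[1] + 1.0) / (hamTotal + vocabulary));
        }
        // Work in log space, then normalise: p = e^s / (e^s + e^h).
        return 1.0 / (1.0 + Math.exp(logHam - logSpam));
    }

    boolean isSpam(String message) {
        return spamProbability(message) > 0.5;
    }
}
```

With `addWord("viagra", 9, 1)` and `addWord("meeting", 1, 9)`, this sketch scores the message "viagra" at roughly 0.83 and "meeting" at roughly 0.17, while a message containing only unknown words stays at 0.5. Summing logarithms instead of multiplying raw probabilities is a common way to avoid numeric underflow on long messages.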
For support, email [email protected] or message me on GitHub.
MIT License - Public Domain