A text classifier that detects low-quality writing. Uses an SVM model trained on stylistic features: capitalization patterns, punctuation density, word length, and spelling indicators.
echo "your text here" | bin/stupidfilter data/c_rbf
Output: 0.0 = low-quality, 1.0 = acceptable. Values between indicate confidence.
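When a yes/no label is needed, the score can be compared against a cutoff. The following is a minimal sketch, assuming the tool prints a single number on stdout. The `label_score` helper and the default cutoff of 0.5 are hypothetical and not part of the project; pick a cutoff that suits your data.

```shell
# Hypothetical helper (not shipped with stupidfilter): map a score to a label.
# $1 = score printed by the classifier, $2 = optional cutoff (assumed default 0.5).
label_score() {
    score=$1
    threshold=${2:-0.5}
    # awk handles the floating-point comparison that plain sh cannot do.
    awk -v s="$score" -v t="$threshold" \
        'BEGIN { print ((s + 0 >= t + 0) ? "acceptable" : "low-quality") }'
}
```

Possible usage: `label_score "$(echo "your text here" | bin/stupidfilter data/c_rbf)"`.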
Strip HTML and normalize whitespace before classification. See classify.sh for an example.
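As an illustration of that preprocessing step, here is a rough sketch of a filter that removes tags and collapses whitespace. It is not the contents of classify.sh; the `clean_text` name is hypothetical, and the regex-based tag removal is a crude approximation that does not decode entities such as `&amp;` or handle `<script>` bodies.

```shell
# Hypothetical helper (not part of the project): crude HTML stripping and
# whitespace normalization for text read from stdin.
clean_text() {
    # 1. Join lines so tags that span a line break can be matched.
    # 2. Replace anything that looks like a tag with a space.
    # 3. Collapse runs of whitespace into single spaces.
    # 4. Trim leading and trailing spaces.
    tr '\n' ' ' \
        | sed -e 's/<[^>]*>/ /g' \
        | tr -s '[:space:]' ' ' \
        | sed -e 's/^ *//' -e 's/ *$//'
}
```

Possible usage: `clean_text < page.html | bin/stupidfilter data/c_rbf`. Tags are replaced with a space rather than deleted so that `a<br>b` does not become `ab`; the trade-off is that a space can appear before punctuation that directly followed a tag, which may affect punctuation-based features.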
Dependencies:
# Debian/Ubuntu
sudo apt install g++ flex libboost-serialization-dev
# Fedora/RHEL
sudo dnf install gcc-c++ flex boost-devel
Build:
make
This produces bin/stupidfilter. The build uses system Boost headers (not the bundled 2008 versions in thirdparty/boost.old).
To rebuild the lexer from source:
flex -o stupidfilter.cpp fclassify.flex
The Rust port requires Rust 1.70+. To build it:
cd rust
cargo build --release
This produces rust/target/release/stupidfilter. Run it the same way:
echo "test text" | ./target/release/stupidfilter ../data/c_rbf
The Rust port produces identical classifications and runs 1.7–2.1× faster than C++.
Originally released in 2008 by Rarefied Technologies under GPL v2. Updated in 2026 to build on modern systems (GCC 14, 64-bit Linux) and ported to Rust.