RwandaNameGenderModel

RwandaNameGenderModel is a machine learning model that predicts gender from Rwandan names: a first name, a surname, or both in any order. It combines character-level n-gram features with a logistic regression classifier to give fast, interpretable predictions, reaching over 96% accuracy on both the validation and test sets.

🧠 Model Overview

Type: Classic ML (Logistic Regression)
Input: Rwandan name (flexible: single or full name)
Vectorization: Character-level n-grams (2–3 chars)
Framework: scikit-learn
Training Set: 66,735 names (about 80% of the 83,419-name dataset)
Validation/Test Accuracy: 96.72% / 96.64%
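
To make the "character-level n-grams (2–3 chars)" line concrete, here is a minimal pure-Python sketch of the kind of features such a vectorizer extracts. The helper name `char_ngrams` is hypothetical, and lowercasing the input is an assumption; the settings of the project's actual vectorizer are not documented here.

```python
def char_ngrams(name, n_min=2, n_max=3):
    """Return every character n-gram of length n_min..n_max, shortest first."""
    text = name.lower()  # assumption: case is ignored
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the string
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams

print(char_ngrams("Gabriel"))
# ['ga', 'ab', 'br', 'ri', 'ie', 'el', 'gab', 'abr', 'bri', 'rie', 'iel']
```

Because the features are short substrings rather than whole words, a full name in either order produces largely the same n-grams, which is consistent with the flexible input described above.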

📁 Project Structure

RwandaNameGenderModel/
├── dataset/
│   └── rwandan_names.csv
├── model/
│   ├── logistic_model.joblib
│   └── vectorizer.joblib
├── logs/
│   └── metrics_log.txt
├── train.py
├── inference.py
├── README.md
└── requirements.txt

🚀 Quickstart

1. Install requirements

pip install -r requirements.txt

2. Train the model

python train.py
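
For readers who want a feel for what a training step of this kind involves, the following is a rough sketch and not the contents of train.py. It assumes scikit-learn's `CountVectorizer` with `analyzer="char"` and `ngram_range=(2, 3)` feeding a `LogisticRegression`; the real script may use a different vectorizer or hyperparameters. The function name `train_name_classifier` and the six toy rows are invented for illustration and are not taken from dataset/rwandan_names.csv.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def train_name_classifier(names, labels):
    """Fit a character n-gram vectorizer and a logistic regression on names."""
    # Character n-grams of length 2 and 3, matching the Model Overview
    vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 3))
    X = vectorizer.fit_transform(names)
    model = LogisticRegression(max_iter=1000)
    model.fit(X, labels)
    return model, vectorizer

# Toy rows, made up for illustration only (NOT from the project's dataset)
toy_names = ["Gabriel", "Jean", "Habimana", "Marie", "Claudine", "Mukamana"]
toy_labels = ["male", "male", "male", "female", "female", "female"]

model, vectorizer = train_name_classifier(toy_names, toy_labels)
print(model.predict(vectorizer.transform(["Gabriel Habimana"]))[0])
```

A fitted pair like this could then be saved with `joblib.dump` to produce files such as the two shown under model/ in the project structure.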

3. Predict gender from a name using the script

Run interactive inference with:

python inference.py

4. Predict gender from a name using Python code

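from joblib import load

# Load the trained classifier and the fitted vectorizer from the model/ directory
model = load("model/logistic_model.joblib")
vectorizer = load("model/vectorizer.joblib")

def predict_gender(name):
    """Return the predicted gender label for a single name string."""
    X = vectorizer.transform([name])  # transform expects a list of strings
    return model.predict(X)[0]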

# Flexible input: first name, surname, or both (any order)
predict_gender("Gabriel")                 # Output: "male"
predict_gender("Baziramwabo")             # Output: "male"
predict_gender("Baziramwabo Gabriel")     # Output: "male"
predict_gender("Gabriel Baziramwabo")     # Output: "male"
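
Since `predict_gender` returns only a label, it can help to see how confident the classifier is. The sketch below is one possible extension, assuming the loaded model is a scikit-learn `LogisticRegression` (which provides `predict_proba` and `classes_`). The function name `predict_with_confidence`, the `"uncertain"` fallback, and the 0.75 threshold are hypothetical choices, not part of the project.

```python
def predict_with_confidence(name, model, vectorizer, threshold=0.75):
    """Return (label, probability); label is "uncertain" below the threshold."""
    X = vectorizer.transform([name])
    probabilities = model.predict_proba(X)[0]  # one probability per class
    # Index of the most probable class
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = float(probabilities[best])
    if confidence >= threshold:
        label = str(model.classes_[best])
    else:
        label = "uncertain"  # hypothetical fallback for low-confidence cases
    return label, confidence
```

It would be called as `predict_with_confidence("Gabriel", model, vectorizer)` with the `model` and `vectorizer` objects loaded above. Declining to answer for borderline names is one way to reduce confident-looking mistakes.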

📈 Performance

| Dataset    | Accuracy | Precision | Recall | F1-Score |
|------------|----------|-----------|--------|----------|
| Validation | 96.72%   | 96.90%    | 96.53% | 96.72%   |
| Test       | 96.64%   | 96.94%    | 96.34% | 96.64%   |

Metrics are logged both to logs/metrics_log.txt and in TensorBoard format.
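
As a quick sanity check on the table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the reported precision and recall; the helper name `f1_from` is hypothetical. Because the inputs are already rounded to two decimals, the results should agree with the F1-Score column only to within rounding.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the Performance table
reported = [("Validation", 0.9690, 0.9653), ("Test", 0.9694, 0.9634)]
for split, p, r in reported:
    print(f"{split}: F1 = {f1_from(p, r):.2%}")
```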

🌍 Use Cases

Demographic analysis
Smart form processing
Voice assistant personalization
NLP preprocessing for Rwandan corpora

🛡️ Ethical Note

This model predicts binary gender from patterns in names, so its output may not reflect a person's self-identified gender. It should not be used in sensitive contexts without consent.

📄 License

This project is maintained by Gabriel Baziramwabo and is open for research and educational use. For commercial use, please contact the author.

🤝 Contributing

We welcome improvements and multilingual extensions. Fork this repo, improve, and submit a PR!

🔗 Links

Benax Technologies

benax-rw/RwandaNameGenderModel

Folders and files

Latest commit

History

Repository files navigation

RwandaNameGenderModel

🧠 Model Overview

📁 Project Structure

🚀 Quickstart

1. Install requirements

2. Train the model

3. Predict gender from a name using script

4. Predict gender from a name using Python code

📈 Performance

🌍 Use Cases

🛡️ Ethical Note

📄 License

🤝 Contributing

🔗 Links

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages