🌱 SeedGuard AI

Mapping the GMO Narrative Landscape for True Agricultural Empowerment & Sovereignty

📘 Overview

SeedGuard AI is a Natural Language Processing (NLP) and Machine Learning (ML) project that explores the online conversation surrounding genetically modified organisms (GMOs) and seed sovereignty across Africa. By collecting thousands of publicly available discussions from X (formerly Twitter) and Reddit, the system identifies, analyzes, and maps the emerging narratives, emotional tones, and stakeholder perspectives shaping Africa’s agricultural future.

🎯 Objectives

Collect and preprocess text data from online platforms such as X (formerly Twitter), Reddit, and optionally Quora, using targeted keywords (e.g., African seed sovereignty, agroecology, genetically modified maize, genetically modified food, GMO in Africa).
Apply advanced NLP models (DistilBERT, BERTopic) to automatically cluster, categorize, and analyze public opinions.
Map connected topics, sentiment shifts, and emotional distributions across African regions.
Support evidence-based discussions and policy decisions on agricultural sovereignty and biotechnology.

🧩 Project Workflow

1️⃣ Data Collection & Preprocessing

Tools: Apify, PRAW, BeautifulSoup, requests
Cleaning: Remove stopwords, URLs, emojis, duplicates.
Languages: English (focus), others optional via translation models.
Output: Structured CSV/JSON for modelling.
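
The cleaning step above could be sketched roughly as follows. This is a minimal, standard-library-only illustration rather than the project's actual pipeline: the function names `clean_text` and `clean_corpus`, the tiny stopword list, and the emoji ranges are all assumptions made for the example.

```python
import re
from typing import Iterable, List

# Tiny illustrative stopword list (assumption); a real run would likely use
# a fuller list from a library such as NLTK or spaCy.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Approximate emoji coverage: common pictograph, symbol, and flag ranges.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U0001F1E6-\U0001F1FF\u2600-\u27BF\uFE0F\u200D]"
)
# Keep letters, digits, hashtags, mentions, and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def clean_text(text: str) -> str:
    """Strip URLs and emojis, lowercase, tokenize, and drop stopwords."""
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)


def clean_corpus(texts: Iterable[str]) -> List[str]:
    """Clean every document and drop empties and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for raw in texts:
        doc = clean_text(raw)
        if doc and doc not in seen:
            seen.add(doc)
            cleaned.append(doc)
    return cleaned
```

Duplicates are detected after cleaning, so two posts that differ only by a trailing link or emoji collapse into one record. Whether that is desirable depends on how reposts should be counted.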

2️⃣ NLP Modeling & Clustering

Embedding Model: DistilBERT via sentence-transformers
Topic Modelling: BERTopic with UMAP + HDBSCAN
Sentiment Analysis: VADER, TextBlob
Classification: Predict stakeholder groups (e.g., farmers, companies, NGOs).
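
One way the modelling components listed above might be wired together is sketched below. It assumes the `bertopic`, `sentence-transformers`, `umap-learn`, and `hdbscan` packages are installed. The embedding model name and every hyperparameter value are illustrative assumptions, not settings taken from this project. The `sentiment_label` helper uses the ±0.05 cut-offs commonly applied to VADER's compound score.

```python
def build_topic_model(embedding_name: str = "distilbert-base-nli-mean-tokens"):
    """Assemble a BERTopic model with explicit UMAP and HDBSCAN components.

    The default embedding name is an assumed DistilBERT-based
    sentence-transformers checkpoint; swap in whichever model the team uses.
    """
    # Heavy third-party imports are kept local so this module imports cheaply.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    embedder = SentenceTransformer(embedding_name)
    # Reduce embeddings to a few dimensions before density-based clustering.
    umap_model = UMAP(
        n_neighbors=15, n_components=5, min_dist=0.0,
        metric="cosine", random_state=42,
    )
    hdbscan_model = HDBSCAN(
        min_cluster_size=15, metric="euclidean",
        cluster_selection_method="eom", prediction_data=True,
    )
    return BERTopic(
        embedding_model=embedder,
        umap_model=umap_model,
        hdbscan_model=hdbscan_model,
    )


def sentiment_label(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"
```

With a list of cleaned documents `docs`, calling `build_topic_model().fit_transform(docs)` would return a topic id per document together with assignment probabilities. Fixing `random_state` in UMAP is a choice made here to keep runs reproducible.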

3️⃣ Visualization & Insights

Tools: Plotly, Seaborn, Matplotlib, Folium
Dashboards: Topic frequency charts, sentiment maps, stakeholder influence graphs.
Platforms: Streamlit or Dash for deployment.
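
Before any chart is drawn, the dashboard inputs have to be aggregated. The sketch below shows one possible way to compute topic frequencies and per-topic sentiment shares from model output. The function names and the `(topic, label)` row format are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def topic_frequency(topics: Iterable[int]) -> List[Tuple[int, int]]:
    """Count documents per topic, most frequent first.

    Topic -1 is skipped because BERTopic uses it for outlier documents.
    """
    counts = Counter(t for t in topics if t != -1)
    return counts.most_common()


def sentiment_share(rows: Iterable[Tuple[int, str]]) -> Dict[int, Dict[str, float]]:
    """For each topic, return the fraction of documents with each sentiment label."""
    per_topic: Dict[int, Counter] = defaultdict(Counter)
    for topic, label in rows:
        per_topic[topic][label] += 1
    return {
        topic: {label: n / sum(c.values()) for label, n in c.items()}
        for topic, c in per_topic.items()
    }
```

The resulting lists and dictionaries can then be handed to a plotting library, for example a Plotly bar chart of topic frequencies, or rendered inside a Streamlit or Dash page.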

🏗️ Folder Structure

C9-team-seedguard/
│
├── data/                   # Includes unprocessed data (excluded from Git)
│
├── notebooks/              # Jupyter notebooks for exploration
│
│   ├── preprocessed.ipynb
│   ├── classification.ipynb
│   ├── topic_modeling.ipynb
│   ├── sentiment_analysis.ipynb
│   └── visualization.ipynb
│
├── results/                # Generated insights, plots, figures
├── requirements_data.txt   # Dependencies for data & visualization
├── requirements_model.txt  # Dependencies for model training
├── .gitignore
└── README.md

⚙️ Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/AISaturdaysLagos/C9-team-seedguard.git
cd C9-team-seedguard

2️⃣ Create a Virtual Environment

python -m venv venv
venv\Scripts\activate   # For Windows
# or
source venv/bin/activate  # For Mac/Linux

3️⃣ Install Dependencies

For data processing only:

pip install -r requirements_data.txt

For model training:

pip install -r requirements_model.txt

4️⃣ Set Up API Keys

Create a .env file and add your credentials:

TWITTER_BEARER_TOKEN=your_twitter_bearer_token
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_SECRET=your_reddit_secret
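
A collection script then needs to read these values at runtime. The sketch below is a minimal standard-library approach and is not necessarily how the project's notebooks do it: `parse_env_file` and `load_credentials` are hypothetical helper names, and a package such as python-dotenv is the more common choice in practice.

```python
import os
from pathlib import Path
from typing import Dict

# Key names taken from the .env template above.
REQUIRED_KEYS = ("TWITTER_BEARER_TOKEN", "REDDIT_CLIENT_ID", "REDDIT_SECRET")


def parse_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_credentials(path: str = ".env") -> Dict[str, str]:
    """Merge the .env file with real environment variables (which take priority)."""
    from_env = {k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ}
    merged = {**parse_env_file(path), **from_env}
    missing = [k for k in REQUIRED_KEYS if not merged.get(k)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {k: merged[k] for k in REQUIRED_KEYS}
```

Failing fast on a missing key gives a clearer error than an authentication failure deep inside a scraping run. Keep the .env file out of version control by listing it in .gitignore.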

🧠 Key Technologies

Data Collection: apify, PRAW, BeautifulSoup
NLP & Modeling: DistilBERT, BERTopic, Transformers, HDBSCAN, UMAP
Visualization: Plotly, Folium, Seaborn, Dash, Streamlit
Languages: Python, Jupyter Notebook

📊 Expected Deliverables

Clean dataset of GMO-related discussions across Africa.
NLP model capable of identifying key narratives and stakeholder categories.
Interactive visualization dashboard for exploring topic and sentiment trends.
Technical report summarizing findings, insights, and recommendations.

👥 Team Roles

Role	Name	Responsibilities
Tech Lead	Ismail O. Daud	Project structure, workflow, integration, model supervision
Data Engineer	Agoro Zeenat	Data collection, cleaning, and preprocessing
ML Engineers	Elijah Aremu, Tijani O.	Topic modeling, embeddings, classification
Visualization Analysts	All team members	Charts, dashboards, storytelling
Documentation Lead	Ismail O. Daud	Reports, README, presentation materials

🚀 Future Extensions

Add multilingual sentiment analysis for Francophone and Arabic-speaking Africa.
Integrate real-time social media tracking APIs.
Develop a Streamlit dashboard with live trend updates.
Deploy models to Hugging Face or Google Cloud for public access.

📜 License

This project is developed under the AI Saturdays Lagos Cohort 9 program for machine learning and research purposes.

💬 Contact

For inquiries or collaborations:

📧 [email protected]
