A comprehensive machine learning project for classifying images of fruits and vegetables using Convolutional Neural Networks (CNNs). This project includes both a training pipeline and a web application for real-time image classification.
- Deep Learning Model: Custom CNN architecture built with TensorFlow/Keras
- Web Application: Interactive Streamlit app for real-time predictions
- Comprehensive Dataset: 18 different fruits and vegetables with training, validation, and test sets
- Image Preprocessing: Automated image preprocessing pipeline
- Model Evaluation: Training history visualization and performance metrics
- Easy Deployment: Ready-to-deploy web application
The project uses a diverse dataset containing images of the following categories:
- Apple
- Banana
- Beetroot
- Bell Pepper
- Cabbage
- Capsicum
- Carrot
- Cauliflower
- Chilli Pepper
- Corn
- Cucumber
- Eggplant
- Garlic
- Ginger
- Grapes
- Jalapeño
- Kiwi
- Lemon
dataset/
├── train/          # Training images (80% of data)
├── validation/     # Validation images (10% of data)
└── test/           # Test images (10% of data)
Each category contains approximately 10 images for training and validation, ensuring balanced representation.
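To check the layout above before training, a small script can count the images per category in each split. This is a minimal sketch using only the standard library. It assumes each split directory holds one subfolder per category (the tree above shows only the split level), and `count_images`, `IMAGE_EXTENSIONS`, and `SPLITS` are illustrative names, not part of the project.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
SPLITS = ("train", "validation", "test")


def count_images(dataset_root):
    """Return {split: {category: image_count}} for a dataset directory."""
    root = Path(dataset_root)
    counts = {}
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            counts[split] = {}  # a missing split is reported as empty
            continue
        counts[split] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts


if __name__ == "__main__":
    for split, per_class in count_images("dataset").items():
        print(split, per_class)
```

A category that is empty or far smaller than the others in any split is worth fixing before training.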
The CNN model consists of:
- Input Layer: 180x180 RGB images
- Convolutional Layers: 3 Conv2D layers with increasing filters (16, 32, 64)
- Pooling Layers: MaxPooling2D for downsampling
- Regularization: Dropout layer (0.2) to prevent overfitting
- Dense Layers: 128 neurons with ReLU activation
- Output Layer: Softmax activation for multi-class classification
- Optimizer: Adam
- Loss Function: Sparse Categorical Crossentropy
- Metrics: Accuracy
- Batch Size: 32
- Epochs: 25
- Image Size: 180x180 pixels
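The architecture and training settings listed above can be sketched in Keras roughly as follows. This is not the project's actual training code. The 3x3 kernels, "same" padding, ReLU activations in the convolutional layers, the `Rescaling` layer, and the position of the dropout layer are assumptions, since the lists above do not specify them. `build_model` is an illustrative name.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 18        # number of categories listed in this README
IMG_SIZE = (180, 180)   # height, width


def build_model(num_classes=NUM_CLASSES):
    """Build and compile a small CNN matching the listed layer sizes."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(*IMG_SIZE, 3)),        # 180x180 RGB input
        layers.Rescaling(1.0 / 255),                 # assumed: scale pixels to [0, 1]
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.2),                         # assumed placement
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        # The output layer already applies softmax, so the loss takes probabilities.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    build_model().summary()
```

Training would then be a call such as `model.fit(train_ds, validation_data=val_ds, epochs=25)`, where `train_ds` and `val_ds` are hypothetical datasets batched at 32.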
- Python 3.8 or higher
- pip package manager
- Clone the repository:
    git clone https://github.com/NhanPhamThanh-IT/ThaoMyProject.git
    cd ThaoMyProject
- Create a virtual environment (recommended):
    python -m venv venv
    # On Windows
    venv\Scripts\activate
    # On macOS/Linux
    source venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Download the dataset: Place your fruit and vegetable images in the dataset/ directory following the structure shown above.
- Navigate to the notebooks directory:
    cd notebooks
- Run the training notebook:
    python notebook.py
This will:
- Load and preprocess the dataset
- Train the CNN model
- Save the trained model to models/model.keras
- Display training history plots
- Navigate to the app directory (from the repository root):
    cd app
- Launch the Streamlit app:
    streamlit run main.py
- Access the application: Open your browser and go to http://localhost:8501
The web app allows you to:
- Enter an image filename from the public/images/ directory
- View the image preview
- See the predicted fruit/vegetable category
- Check the prediction confidence score
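The predicted category and confidence score can be derived from the model's softmax output along these lines. This is a sketch under the assumption that the app reports the highest class probability as its confidence. `top_prediction` is a hypothetical helper, not a function from the project.

```python
def top_prediction(probabilities, categories):
    """Return (label, confidence_percent) for the highest-scoring class."""
    if len(probabilities) != len(categories):
        raise ValueError("expected one probability per category")
    # Index of the largest probability in the softmax output.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return categories[best], 100.0 * probabilities[best]


if __name__ == "__main__":
    label, confidence = top_prediction(
        [0.1, 0.7, 0.2], ["apple", "banana", "beetroot"]
    )
    print(f"{label} ({confidence:.1f}%)")
```

The category order passed in must match the class order the model was trained with, otherwise the labels will be shifted.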
ThaoMyProject/
├── app/                      # Streamlit web application
│   ├── main.py               # Main application file
│   ├── config/               # Configuration management
│   │   ├── __init__.py
│   │   └── InitConfig.py
│   └── utils/                # Utility functions
│       ├── __init__.py
│       └── ImageProcessing.py
├── data/                     # Configuration files
│   └── config.json           # App configuration
├── dataset/                  # Image dataset
│   ├── train/
│   ├── validation/
│   └── test/
├── models/                   # Trained models
│   └── model.keras
├── notebooks/                # Jupyter notebooks
│   ├── notebook.py           # Training notebook
│   └── notebook.pdf          # Notebook documentation
├── public/                   # Public assets
│   └── images/               # Sample images
├── requirements.txt          # Python dependencies
├── LICENSE                   # MIT License
└── README.md                 # Project documentation
The application uses a JSON configuration file (data/config.json) that includes:
- App Settings: Title, description, and UI parameters
- Model Settings: Image dimensions and preprocessing parameters
- Categories: List of fruit and vegetable classes
- Paths: File paths for models and data
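Reading such a file takes only a few lines. The sketch below is not the project's InitConfig.py. It assumes top-level keys named `APP`, `IMG_SHAPE`, and `CATEGORIES`, as in the project's example configuration, and `load_config`, `image_size`, and `REQUIRED_KEYS` are illustrative names.

```python
import json
from pathlib import Path

# Assumed required keys, taken from the project's example configuration.
REQUIRED_KEYS = ("APP", "IMG_SHAPE", "CATEGORIES")


def load_config(path="data/config.json"):
    """Read the JSON config and fail early if an expected key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return config


def image_size(config):
    """Return (height, width) from the IMG_SHAPE section."""
    shape = config["IMG_SHAPE"]
    return shape["height"], shape["width"]
```

Validating the keys at load time turns a misconfigured file into an immediate, readable error instead of a failure deep inside the app.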
Example configuration:
{
  "APP": {
    "TITLE": "Fruit & Vegetable Classifier"
  },
  "IMG_SHAPE": {
    "width": 180,
    "height": 180
  },
  "CATEGORIES": [
    "apple",
    "banana",
    "beetroot",
    "bell pepper",
    "cabbage",
    "capsicum",
    "carrot",
    "cauliflower",
    "chilli pepper",
    "corn",
    "cucumber",
    "eggplant",
    "garlic",
    "ginger",
    "grapes",
    "jalepeno",
    "kiwi",
    "lemon"
  ]
}
The trained model achieves:
- Training Accuracy: ~95% (varies based on dataset quality)
- Validation Accuracy: ~90% (varies based on dataset quality)
- Test Accuracy: ~88% (varies based on dataset quality)
Performance may vary depending on:
- Quality and diversity of training images
- Number of training epochs
- Model architecture complexity
- Image preprocessing techniques
We welcome contributions from the community! Here's how you can help:
- Report Bugs: Use GitHub Issues to report bugs
- Suggest Features: Propose new features or improvements
- Improve Documentation: Help make our docs better
- Code Contributions: Submit pull requests with fixes or enhancements
- Dataset Expansion: Add more images or categories
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature-name
- Make your changes and test thoroughly
- Commit your changes: git commit -m 'Add some feature'
- Push to the branch: git push origin feature/your-feature-name
- Open a Pull Request
- Follow PEP 8 style guidelines for Python code
- Add docstrings to new functions and classes
- Write tests for new features
- Update documentation for any changes
- Ensure all tests pass before submitting
The project requires the following Python packages:
- numpy: Numerical computing
- matplotlib: Data visualization
- tensorflow: Deep learning framework
- streamlit: Web application framework
- Model Loading Errors
  - Ensure models/model.keras exists
  - Check TensorFlow version compatibility
- Image Loading Errors
  - Verify image paths are correct
  - Ensure images are in supported formats (JPG, PNG)
  - Check image dimensions match model expectations
- Streamlit App Issues
  - Ensure all dependencies are installed
  - Check that the port 8501 is available
  - Verify configuration file exists
- Training Issues
  - Ensure dataset directory structure is correct
  - Check that all categories have sufficient images
  - Verify GPU availability for faster training
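Several of the checks above can be automated before launching the app. The following is a minimal standard-library sketch meant to be run from the repository root. `missing_files`, `port_in_use`, and `REQUIRED_FILES` are hypothetical names, and the file list is an assumption based on the paths mentioned in this README.

```python
import socket
from pathlib import Path

# Assumed list of files the app needs, relative to the repository root.
REQUIRED_FILES = ("models/model.keras", "data/config.json")


def missing_files(project_root="."):
    """Return the required files that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


def port_in_use(port=8501, host="127.0.0.1"):
    """Return True if something already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for rel in missing_files():
        print(f"missing: {rel}")
    if port_in_use():
        print("port 8501 is already in use")
```

If the port is taken, Streamlit can be started on another one with `streamlit run main.py --server.port 8502`.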
This project is licensed under the MIT License - see the LICENSE file for details.
- TensorFlow/Keras: For providing the deep learning framework
- Streamlit: For the web application framework
- Open Source Community: For inspiration and tools
- Dataset Contributors: For providing the fruit and vegetable images
- Author: Nhan Pham Thanh
- GitHub: @NhanPhamThanh-IT
- Project Link: https://github.com/NhanPhamThanh-IT/ThaoMyProject
- Add more fruit and vegetable categories
- Implement data augmentation techniques
- Add model quantization for mobile deployment
- Create REST API for model serving
- Add batch processing capabilities
- Implement model explainability features
- Add support for custom image uploads
- Create mobile application version
Star this repository if you find it helpful!
Happy classifying!