TomatoMAP is a novel dataset generated with our multi-camera array, following the FAIR principles (findability, accessibility, interoperability, and reusability). Data generation and annotation took two years and involved multiple domain experts. TomatoMAP includes three subsets, TomatoMAP-Cls, TomatoMAP-Det, and TomatoMAP-Seg, for classification into 50 BBCH classes, detection of 7 main areas, and 10-class instance segmentation for fine-grained phenotyping. The dataset also has unique 3D modeling potential for further research.
If you need any help, submit a ticket via GitHub Issues.
- TomatoMAP dataset is released under CC BY 4.0. Commercial use requires permission.
- TomatoMAP code space is released under Apache 2.0.
- 2025-07-15 arXiv preprint made available for the KIDA Conference
- 2025-07-18 e!DAL dataset DOI is claimed
- 2025-07-23 Code repo made public
- 2025-07-24 Submitted to Nature
- 2026-02-17 Accepted by Nature
- 2026-02-24 e!DAL dataset DOI is published
- 2026-02-24 Code space is optimized from private branch
- Update homepage
- Code for our IoT Datastation
- TomatoMAP Plus (TomatoMAP+), a fancy follow-up project
If you are interested in contributing to our work, please feel free to contact us.
Our code is tested in the following environment:
- OS: Ubuntu 20.04.6 LTS
- GPU: Tesla V100-PCIE-16GB
- NVIDIA Driver: 575.57.08
- CUDA Toolkit: 12.6
- Python: 3.10.19 (conda)
- PyTorch: 2.4.0
- TorchVision: 0.19.0
For Detectron2 compilation with CUDA 12.6, use gcc/g++ 13 in the conda env (a newer GCC, e.g. 14, may fail the nvcc host compiler check).
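To compare a local setup with the tested environment above, a minimal sketch like the following may help. It only reads installed package metadata through the Python standard library. The `TESTED` table and the `check_environment` helper are illustrative names and are not part of the TomatoMAP code.

```python
import platform
from importlib import metadata

# Versions copied from the tested environment listed above.
TESTED = {"torch": "2.4.0", "torchvision": "0.19.0"}
TESTED_PYTHON = "3.10"


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(tested=TESTED):
    """Map each package to a tuple (installed, tested, matches)."""
    report = {}
    for package, wanted in tested.items():
        found = installed_version(package)
        # Compare on the public version only, so "2.4.0+cu121" matches "2.4.0".
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (found, wanted, matches)
    return report


if __name__ == "__main__":
    print(f"python: {platform.python_version()} (tested with {TESTED_PYTHON}.x)")
    for package, (found, wanted, ok) in check_environment().items():
        status = "ok" if ok else "differs"
        print(f"{package}: installed={found}, tested={wanted} -> {status}")
```

A differing version does not necessarily mean the code will fail; it only shows where a setup departs from the tested one.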
We suggest using conda for env management.
git clone https://github.com/0YJ/TomatoMAP.git --recursive
cd TomatoMAP
conda env create --file environment.yml
conda activate TomatoMAP
pip install -e submodules/ultralytics/ --no-build-isolation --no-deps
pip install -e submodules/detectron2/ --no-build-isolation --no-deps
We use a Jupyter notebook as the TomatoMAP builder.
jupyter notebook
# Then open the notebook and follow our pipeline (you may need to adjust the paths for your system).
Unzip the TomatoMAP dataset you downloaded from our e!DAL repo under the repository root:
unzip TomatoMAP.zip
mv TomatoMAP_builder.ipynb TomatoMAP
Then follow the guide in TomatoMAP_builder.ipynb to finish the dataset setup. Your project folder should finally look like this:
TomatoMAP/
├── main.py                          # Main entry
├── README.md                        # Project documentation
├── environment.yml                  # Environment definition
├── configs/
│   └── det/                         # Detection configs
│       ├── TomatoMAP-Det.yaml
│       └── best_hyperparameters.yaml
├── src/                             # Core source functions
│   ├── cls_trainer.py
│   ├── det_trainer.py
│   ├── det_balanced_trainer.py
│   ├── seg_trainer.py
│   ├── datasets/
│   ├── models/
│   ├── utils/
│   ├── cls/
│   └── avh/
├── submodules/                      # External dependencies
│   ├── ultralytics/
│   └── detectron2/
├── TomatoMAP/                       # Dataset root directory
│   ├── TomatoMAP_builder.ipynb      # Dataset builder notebook
│   ├── metadata                     # Metadata for the dataset
│   ├── img                          # Raw TomatoMAP data subdivision
│   ├── labels                       # Raw TomatoMAP data label subdivision
│   ├── BBCH_classification.xlsx     # TomatoMAP BBCH classification labels
│   ├── TomatoMAP-Cls/               # Classification subset
│   ├── TomatoMAP-Det/               # Detection subset
│   └── TomatoMAP-Seg/               # Segmentation subset
└── outputs/                         # Training outputs (created automatically)
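Before training, it can be useful to confirm that the setup produced this layout. The sketch below checks a handful of paths taken from the tree and from the commands in this README. The `EXPECTED` list and `missing_paths` helper are hypothetical names chosen for illustration, not part of the repository.

```python
from pathlib import Path

# Paths taken from the project tree, relative to the repository root.
EXPECTED = [
    "main.py",
    "environment.yml",
    "configs/det/TomatoMAP-Det.yaml",
    "configs/det/best_hyperparameters.yaml",
    "submodules/ultralytics",
    "submodules/detectron2",
    "TomatoMAP/TomatoMAP-Cls",
    "TomatoMAP/TomatoMAP-Det",
    "TomatoMAP/TomatoMAP-Seg",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("Project layout looks complete.")
```

Run it from the repository root. An empty submodule folder usually means the clone was made without `--recursive`.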
Train a classification model on TomatoMAP-Cls dataset:
# default training with MobileNetV3-Large
python main.py cls --data-dir ./TomatoMAP/TomatoMAP-Cls --epochs 100
# options
python main.py cls \
--data-dir ./TomatoMAP/TomatoMAP-Cls \
--model mobilenet_v3_large \
--epochs 100 \
--batch-size 32 \
--lr 1e-4 \
--img-size 640 640 \
--patience 5 \
--output-dir outputs/cls/experiment1
Available models:
- mobilenet_v3_large (default)
- mobilenet_v3_small
- mobilenet_v2
- resnet18
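To compare the available backbones, one option is to generate one training command per model. The following sketch only builds and prints the commands, using flags that appear in the examples above. The per-model output folder naming is a hypothetical convention, and `cls_command` is an illustrative helper rather than part of `main.py`.

```python
import shlex

# Model names from the "Available models" list.
CLS_MODELS = ["mobilenet_v3_large", "mobilenet_v3_small", "mobilenet_v2", "resnet18"]


def cls_command(model, epochs=100, data_dir="./TomatoMAP/TomatoMAP-Cls"):
    """Build the argument list for one classification training run."""
    return [
        "python", "main.py", "cls",
        "--data-dir", data_dir,
        "--model", model,
        "--epochs", str(epochs),
        # Hypothetical naming scheme: one output folder per model.
        "--output-dir", f"outputs/cls/{model}",
    ]


if __name__ == "__main__":
    for model in CLS_MODELS:
        # Print only; pass the list to subprocess.run(cmd, check=True) to launch it.
        print(shlex.join(cls_command(model)))
```

Keeping the command as a list avoids shell quoting problems if a path contains spaces.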
Train a YOLO model on TomatoMAP-Det dataset:
# default training with YOLO11-Large
python main.py det --data-config ./configs/det/TomatoMAP-Det.yaml --epochs 500
# options
python main.py det \
--data-config ./configs/det/TomatoMAP-Det.yaml \
--model yolo11l.pt \
--epochs 500 \
--img-size 640 \
--batch-size 4 \
--patience 10 \
--device 0 \
--output-dir outputs/det/experiment1 \
--hyperparams ./configs/det/best_hyperparameters.yaml
# enable class-balanced weighted sampling
python main.py det \
--data-config ./configs/det/TomatoMAP-Det.yaml \
--model yolo11l.pt \
--balanced-sampling
Train a Mask R-CNN FPN based model on TomatoMAP-Seg dataset:
# training
python main.py seg train \
--data-dir ./TomatoMAP/TomatoMAP-Seg \
--model COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
--epochs 100 \
--lr 0.0001 \
--batch-size 4 \
--patience 5
# evaluation
python main.py seg eval \
--data-dir ./TomatoMAP/TomatoMAP-Seg \
--model-path model_best.pth \
--output-dir outputs/seg
# visualization
python main.py seg vis \
--data-dir ./TomatoMAP/TomatoMAP-Seg \
--model-path model_best.pth \
--n 5 \
--output-dir outputs/seg
# dataset information
python main.py seg info --data-dir ./TomatoMAP/TomatoMAP-Seg
# analyze object size (small, medium, large)
python main.py seg analyze --data-dir ./TomatoMAP/TomatoMAP-Seg
Available models:
- COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml
- COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml
- COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml
- COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml
TomatoMAP-Cls/
├── train/
│   ├── BBCH class1/
│   │   ├── img1.jpg
│   │   └── ...
│   ├── BBCH class2/
│   └── ...
├── val/
│   └── ...
└── test/
    └── ...
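With one folder per BBCH class inside each split, class balance can be inspected with a few lines of standard-library Python. This is a minimal sketch that assumes images use the extensions listed in `IMAGE_SUFFIXES`; the `count_images` helper is illustrative and not part of the repository.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def count_images(cls_root, split):
    """Count images per class folder for one split (train, val or test)."""
    counts = Counter()
    split_dir = Path(cls_root) / split
    if not split_dir.is_dir():
        return counts
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        counts = count_images("./TomatoMAP/TomatoMAP-Cls", split)
        print(f"{split}: {len(counts)} classes, {sum(counts.values())} images")
```

Classes with very few training images are worth noting before choosing the number of epochs or the early-stopping patience.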
TomatoMAP-Det/
├── images
└── labels
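The `--balanced-sampling` option shown earlier enables class-balanced weighted sampling. The sketch below illustrates one common way to do this, inverse-frequency image weights. It is not the code in `det_balanced_trainer.py`. It assumes the standard YOLO label format, one text file per image with `class x_center y_center width height` on each line; `read_classes` and `image_weights` are hypothetical helpers.

```python
from collections import Counter
from pathlib import Path


def read_classes(label_file):
    """Return the class ids in one YOLO label file (first token of each line)."""
    classes = []
    for line in Path(label_file).read_text().splitlines():
        parts = line.split()
        if parts:
            classes.append(int(parts[0]))
    return classes


def image_weights(labels_by_image):
    """Weight each image by the mean inverse frequency of its object classes."""
    freq = Counter(c for classes in labels_by_image.values() for c in classes)
    weights = {}
    for name, classes in labels_by_image.items():
        if classes:
            weights[name] = sum(1.0 / freq[c] for c in classes) / len(classes)
        else:
            weights[name] = 0.0  # images without objects are never drawn here
    return weights


if __name__ == "__main__":
    import random

    # Toy labels: class 0 is common, class 1 is rare.
    toy = {"a.txt": [0, 0], "b.txt": [0], "c.txt": [1]}
    w = image_weights(toy)
    print(w)
    names = list(w)
    print(random.choices(names, weights=[w[n] for n in names], k=5))
```

In the toy example, the image containing the rare class gets a larger weight than the two images containing only the common class, so it is drawn more often. Giving empty images a weight of zero is a simplification; a real sampler may prefer a small positive weight to keep background images in training.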
TomatoMAP-Seg/
├── images/          # All images
│   ├── img1.JPG
│   └── ...
└── labels/          # All labels in COCO format
    ├── isat.yaml    # Label and class configuration
    └── img1.json
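To give a feel for what an object size analysis involves, the sketch below buckets annotations by area using the standard COCO evaluation thresholds (32² and 96² pixels). It assumes COCO-style annotation dictionaries with an `area` or `bbox` field. The thresholds and field handling used by `python main.py seg analyze` may differ, and the per-image JSON files may need conversion first.

```python
# COCO evaluation thresholds on object area, in squared pixels.
SMALL_MAX = 32 ** 2
MEDIUM_MAX = 96 ** 2


def size_bucket(area):
    """Classify an object area as small, medium or large (COCO convention)."""
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"


def bucket_counts(annotations):
    """Count size buckets for COCO-style annotations.

    Uses the "area" field when present, otherwise width * height
    of "bbox" given as [x, y, w, h].
    """
    counts = {"small": 0, "medium": 0, "large": 0}
    for ann in annotations:
        area = ann.get("area")
        if area is None:
            _, _, w, h = ann["bbox"]
            area = w * h
        counts[size_bucket(area)] += 1
    return counts


if __name__ == "__main__":
    demo = [{"area": 10}, {"bbox": [0, 0, 50, 50]}, {"bbox": [0, 0, 100, 100]}]
    print(bucket_counts(demo))
```

Note that a bounding-box area overestimates the mask area, so falling back to `bbox` can shift thin objects into a larger bucket.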
This project is powered by the de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) and ELIXIR-DE (Research Center Jülich and W-de.NBI-001, W-de.NBI-004, W-de.NBI-008, W-de.NBI-010, W-de.NBI-013, W-de.NBI-014, W-de.NBI-016, W-de.NBI-022), Ultralytics YOLO, Meta Detectron2, ISAT, and LabelStudio. Thanks to JetBrains for supporting us with licenses for their tools.
Like our project? Hit that star button at the top right and be our hero! We’ll serve you more open sauce! 🍲
If you use TomatoMAP in your research and think our project is useful, please cite:
@article{Zhang2026,
title = {Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping},
author = {Yujie Zhang and Sabine Struckmeyer and Andreas Kolb and Sven Reichardt},
issn = {2052-4463},
issue = {1},
journal = {Sci Data},
month = {2},
pages = {309},
volume = {13},
year = {2026},
doi = {https://doi.org/10.1038/s41597-026-06926-9}
}
@dataset{tomatomap,
title={TomatoMAP: Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping},
author={Yujie Zhang and Sabine Struckmeyer and Andreas Kolb and Sven Reichardt},
journal={e!DAL-Plant Genomics and Phenomics Research Data Repository (PGP)},
year={2025},
doi={https://doi.org/10.5447/ipk/2025/14}
}