Knowledge-Guided Object Detection via Bayesian Networks and Knowledge Graphs (KGB- NCNet)
Christine Dewi, Rasoul Amirzadeh, Dhananjay Thiruvady, and Nayyar Zaidi Deakin University, VIC 3220, Australia.
This paper has three main contributions:
- We propose a new approach that combines knowledge graphs and BN to assist with object detection, that integrates knowledge graphs with a BN.
- We present a technique for creating task-specific knowledge graphs that accurately represent contextual relationships that are essential to object detection ob- ject detection performance.
- We validate the efficacy of our method by conducting thorough assessments on widely accepted and difficult item detection benchmarks COCO dataset [14]. Our proposed model demonstrates a significant mAP boost, up to 2.02%, relative to the state-of-the-art baseline.
Source: Code for replicating the paper "Object Detection Meets Knowledge Graphs".
datasets.py contains the dataset classes which are made such that both VOC and COCO datasets are processed in the same way.
training.py is used to train the Faster R-CNN model for both the VOC and COCO datasets.
frequency_based.py is used to extract the semantic consistency by using the frequency based method. The resulting matrices for VOC and COCO are stored in a JSON file so that they can be re-used.
knowledge_graph_prep is used to load in the data from the knowledge graph assertion lists and filters them to the english subgraph and filters out negative relations. The resulting graph is converted to integers and both are stored in a new CSV file.
knowledge_graph_based.py is used to extract the semantic consistency from the filtered knowledge graph. The resulting matrices for VOC and COCO are stored in a JSON file so that they can be re-used.
results_voc and results_coco are used to output the mAP and recall (per class and averaged) for the configurations (including matrix type) which can be changed at the top of the files.
results_voc_multiple_runs and results_coco_multiple_runs can be run to compute all results used in the paper in one single run
dataloading, metrics, plotting and testing in Utils contain several functions used to support the main files.
More detailed descriptions are included in the top of each of the files themselves.
We first need to clone this repository, and the random walk with restart repository (which can be found at https://github.com/jinhongjung/pyrwr):
git clone https://github.com/tue-mps/rescience-ijcai2017-230.git
cd "rescience-ijcai2017-230"
git clone https://github.com/jinhongjung/pyrwr.git
Next we need to create the correct Environment from our requirements file and the RWR requirements. (if using Anaconda prompt):
(if you want to create a new environment for this project, I suggest using python 3.7):
conda create -n odmkg_env python=3.7
conda activate odmkg_env
pip3 install -r requirements.txt
cd pyrwr
pip3 install -r requirements.txt
python setup.py install
cd ..
For this application, we use the PASCAL VOC 2007 dataset and the COCO2014 dataset (for object detection) and the ConceptNet 5.7.0 and 5.5.0 knowledge graphs.
The easiest way to gather the datasets is to download it from https://zenodo.org/record/7385730#.Y4ojRHbMIYp. There are 6 files, 5 .pth files that contain the trained models and 1 large zip file (28.7 GB) which contains the raw data. Download all 3 of them.
after cloning the repository, unzip the 'raw data' file (without the extra folder) in your project folder such that your structure looks as follows:
add the trained model (.pth files) into the Trained models folder also as follows:
- Project
- Data raw
- COCO2014
- Annotations
- Image_info_test2014
- Test2014
- Train2014
- Val2014
- ConceptNet
- Assertions55
- Assertions57
- VOC2007
- Annotations
- ImageSets
- JPEGImages
- SegmentationClass
- SegmentationObject
- COCO2014
- Datasets
- Model Training
- Trained models
- coco-FRCNN-resnet50.pth
- coco-FRCNN-vgg16.pth
- voc-FRCNN-resnet50.pth
- voc-FRCNN-resnet18.pth
- voc-FRCNN-vgg16.pth
- Trained models
- Results
- Semantic Consistency
- Utils
- Data raw
The VOC datasets can be downloaded at
http://host.robots.ox.ac.uk/pascal/VOC/voc2007/index.html
under the header ‘development kit’:
Download the training/validation data (450MB tar file)
and under the header ‘test data’:
Download the annotated test data (430MB tar file)
After unpacking the files, it is important to copy the files within Annotations, ImageSets and JPEGImages into one ‘superfolder’, such that there is only 1 folder containing all information for training, validation and testing. So in the Annotations folder there will be all xml files, in the JPEGImages folder there will be all images. In the ImageSets folder, in the Layout folder there will be all (test, train, trainval and val) text files. In the ImageSets folder, in the Main folder will also be all text files of the train test and validation sets. And the same goes for the ImageSets, Segmenation folder.
The COCO datasets can be downloaded at
https://cocodataset.org/#download
download the following folders:
- 2014 Train images [83K/13GB]
- 2014 Val images [41K/6GB]
- 2014 Test images [41K/6GB]
- 2014 Train/Val annotations [241MB]
- 2014 Testing Image info [1MB]
Unpack these folders and add them together in a COCO2014 folder.
The assertion lists (knowledge graphs) can be downloaded at
https://github.com/commonsense/conceptnet5/wiki/Downloads
for this approach versions 5.7.0 and 5.5.0 are used.
make sure the raw data folder is structured correctly:
- Project
- Data raw
- COCO2014
- Annotations
- Image_info_test2014
- Test2014
- Train2014
- Val2014
- ConceptNet
- Assertions55
- Assertions57
- VOC2007
- Annotations
- ImageSets
- JPEGImages
- SegmentationClass
- SegmentationObject
- COCO2014
- Datasets
- Model Training
- Trained models
- coco-FRCNN-resnet50.pth
- coco-FRCNN-vgg16.pth
- voc-FRCNN-resnet50.pth
- voc-FRCNN-resnet18.pth
- voc-FRCNN-vgg16.pth
- Trained models
- Results
- Semantic Consistency
- Utils
- Data raw
Once the repositories are installed, the environment is correct and the raw data and trained models are put in the correct locations, the code can be used.
It is recommended to use GPU (cuda), as cpu only has not been developed/tested for.
To recreate the results from the paper you can just run the initial configuration which includes the ConceptNet consistency matrix. You can run the script from the prompt as follows:
python -m Results.results_voc_multiple_runs or
python -m Results.results_coco_multiple_runs
Or if you want to select specific configurations, you can use:
python -m Results.results_voc or
python -m Results.results_coco
If you want to train a model for yourself, training.py in the Model Training folders should be run first.
To (re-)obtain the semantic consistency matrix for the frequency based method, frequency_based.py in the semantic consistency folder should be run. This will store the matrix information in the folder. For the knowledge graph based method, first knowledge_graph_prep.py should be run. this will filter the raw data graph. Next knowledge_graph_based.py will produce and store the matrix. Since the matrices are stored, these have only to be run once. However, they are included in the files, so if you just want to reproduce the results from the paper, you don't have to run these.
To recreate the results from the paper, you can run either or both results_voc and results_coco in the Results folder. In the top of the file you can change the configurations to select the correct matrix method to use and then this will output the mAP and recall.
results_voc and results_coco start with a couple of configuration lines. Initially for the VOC set they are as follows:
matrix_type = 'KG-CNet-55-VOC' # choose between 'KF-All-VOC', 'KF-500-VOC', 'KG-CNet-57-VOC' or 'KG-CNet-55-VOC'
detections_per_image = 500 # the maximum number of detected objects per image
num_iterations = 10 # number of iterations to calculate p_hat
box_score_threshold = 1e-5 # minimum score for a bounding box to be kept as detection (default = 0.05)
bk = 5 # number of neighbouring bounding boxes to consider for p_hat
lk = 5 # number of largest semantic consistent classes to consider for p_hat
epsilon = 0.9 # trade-off parameter for traditional detections and knowledge aware detections
topk = 100 # maximum number of detections to be considered for metrics (recall@k / mAP@k)
They can be changed, and by running the script again, different results will be computed.
'Object Detection meets Knowledge Graphs' by Fang et al. describes a framework which integrates external knowledge such as knowledge graphs into object detection. They apply two different approaches to quantify background knowledge as semantic consistency. An existing object detection algorithm is re-optimized with this knowledge to get updated knowledge-aware detections. The authors claim that this framework can be applied to any existing object detection algorithm and that this approach can increase recall, while maintaining mean Average Precision (mAP). In this report, the framework is implemented and the experiments are conducted as described in the paper, such that the claims can be validated.
The authors describe a framework where a frequency based approach and a knowledge graph based approach are used to determine semantic consistency. A knowledge-aware re-optimization function updates the detections of an existing object detection algorithm. Both the baseline and its re-optimized correlates are evaluated on two benchmark datasets, namely PASCAL VOC 2007 and MS COCO 2014. The replication of the experiments was completed using the information in the paper and clarifications of the author, as no source code was available. The framework was implemented in PyTorch and evaluated on the same benchmark datasets.
We were able to successfully implement the framework from scratch as described. We have bench‐marked the developed framework on two datasets and replicated the results of all matrices as described. The claim of the authors can not be confirmed for either of the described approaches. The results either showed an increase of recall at the cost of a decrease in mAP, or a maintained mAP, without an improvement in recall. Three different backbone models show similar behavior after re‐optimization, concluding that the knowledge‐aware re‐optimization does not benefit object detection algorithms.
The methodology was well described and easy to understand conceptually.
There was no source code available, which made it difficult to understand some technicalities of the implementation. The authors failed to mention a number of crucial details and assumptions of this implementation in the paper, which are essential for reproducing the methodology without making fundamental assumptions.
We contacted the authors to elaborate on missing details, however no contact was found with the contact-information on the paper. Fortunately, a different email address of Yuan Fang was found online, to which he did respond fast and with clear explanations, for which we would like to express our gratitude.
Expert Systems with Applications 2025 Knowledge-guided object detection via Bayesian networks and knowledge graphs (KGBNCNet) https://doi.org/10.1016/j.eswa.2025.129385
@article{dewi2025knowledge,
title={Knowledge-Guided Object Detection via Bayesian Networks and Knowledge Graphs (KGBNCNet)},
author={Dewi, Christine and Amirzadeh, Rasoul and Thiruvady, Dhananjay and Zaidi, Nayyar},
journal={Expert Systems with Applications},
pages={129385},
year={2025},
publisher={Elsevier} }
@article{DEWI2026129385, title = {Knowledge-guided object detection via Bayesian networks and knowledge graphs (KGBNCNet)}, journal = {Expert Systems with Applications}, volume = {297}, pages = {129385}, year = {2026}, issn = {0957-4174}, doi = {https://doi.org/10.1016/j.eswa.2025.129385}, url = {https://www.sciencedirect.com/science/article/pii/S0957417425029975}, author = {Christine Dewi and Rasoul Amirzadeh and Dhananjay Thiruvady and Nayyar Zaidi}, keywords = {Knowledge-guided, Bayesian networks, Object detection, Knowledge graph,}, abstract = {Object detection is a crucial problem in computer vision that integrates image categorization and localization to recognize objects within an image or video. It serves a pivotal function in numerous practical applications. However, existing state-of-the-art object detection algorithms largely ignore the vast quantity of background knowledge about the real world to detect small objects, including deep neural networks, which only focus on utilizing features within an image. An excessive number of items in an image always leads to occlusion, target blur, and other complications, making object detection more challenging. The detection problem is solely addressed in terms of feature extraction in object detection using a convolutional neural networks, which ignore knowledge and experience to investigate higher-level relationships between objects. This research introduces a novel approach that integrates a knowledge graph with Bayesian networks to improve the accuracy of standard object detection algorithms, as Bayesian networks provide a structured way to incorporate prior knowledge and model the probabilistic relationships between objects. The proposed framework can seamlessly incorporate any external knowledge and object detection system. This integration enhances object detection by undergoing a re-optimization process to achieve improved consistency with background knowledge. Finally, our proposed method was empirically evaluated on COCO benchmark datasets. The results demonstrate a significant boost in mAP, up to 2.02 %, relative to the state-of-the-art baseline.} }


