
Relationship-Aware Hierarchical 3D Scene Graph

Webpage · arXiv · Dataset · YouTube · Zenodo DOI

License: MIT ROS Version

This package implements an enhanced hierarchical 3D scene graph based on Hydra, integrating open-vocabulary features for rooms and objects, and supporting object-relational reasoning.

We leverage a Vision-Language Model (VLM) to infer semantic relationships. Additionally, we introduce a task reasoning module that combines Large Language Models (LLMs) and a VLM to interpret the scene graph's semantic and relational information, enabling agents to reason about tasks and interact with their environment intelligently.

Demo Scene Graph

Table of Contents

  • Setup
  • Usage
  • Citation
  • License
  • Acknowledgements
  • Contact

Setup

General Requirements

These instructions assume that ros-noetic-desktop-full is installed on Ubuntu 20.04.

Install general dependencies:

sudo apt install python3-rosdep python3-catkin-tools python3-vcstool

Building

Build the repository in Release mode:

mkdir -p catkin_ws/src
cd catkin_ws
catkin init
catkin config -DCMAKE_BUILD_TYPE=Release

cd src
git clone git@github.com:ntnu-arl/reasoning_hydra.git
vcs import . < reasoning_hydra/install/packages.repos
rosdep install --from-paths . --ignore-src -r -y

cd ..
catkin build
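
Finally, source the workspace so the newly built packages are visible to roslaunch (standard catkin workflow):

source devel/setup.bash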

Python Environment for Semantics and Reasoning

Follow the instructions in semantic_inference_ros to set up the Python environment required to run the semantic and reasoning models.


Usage

Scene Graph Construction

The system supports multiple datasets and online deployment on robots with GPU capabilities (e.g., Nvidia Jetson Orin AGX).

uHumans2

Download the rosbags from the uHumans2 dataset.

Start the scene graph:

roslaunch hydra_ros uhumans2.launch

In a separate terminal, play the rosbag:

rosbag play path/to/rosbag
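
If scene graph construction cannot keep up with playback, the bag can be started paused or slowed down using standard rosbag play options (the rate value below is only an example):

rosbag play path/to/rosbag --pause -r 0.5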

Replica

Follow the NICE-SLAM instructions to download posed RGB-D data for the Replica scenes.

Run the scene graph:

roslaunch hydra_ros replica.launch

Publish the data:

roslaunch hydra_ros publish_replica.launch dataset_path:=<Path to your replica dataset> scene_name:=<Scene name>
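
For example, assuming the NICE-SLAM Replica data was extracted to ~/data/Replica and the office0 scene is used (both values are illustrative and must match your download):

roslaunch hydra_ros publish_replica.launch dataset_path:=$HOME/data/Replica scene_name:=office0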

Habitat-Matterport 3D Semantics Dataset

Follow the HOV-SG instructions (Step 2 can be skipped) to download posed RGB-D data for several scenes.

Run the scene graph:

roslaunch hydra_ros hm3dsem.launch
roslaunch hydra_ros publish_hm3dsem.launch dataset_path:=<Path to hm3d_trajectories> scene_name:=<Scene name>
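
For example, assuming the trajectories were downloaded to ~/data/hm3d_trajectories (the dataset path and scene name below are illustrative and must match your download):

roslaunch hydra_ros publish_hm3dsem.launch dataset_path:=$HOME/data/hm3d_trajectories scene_name:=00824-Dd4bFSTQ8gi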

Robot Deployment

To run the scene graph on your robot:

  • The robot must provide RGB-D data as sensor_msgs/Image messages.
  • The corresponding camera pose must be available via TF.

Update robot.launch with the correct TFs and camera topic names, then run:

roslaunch hydra_ros robot.launch
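
Before launching, it can help to verify that the RGB-D topics are being published and that the camera pose is available in the TF tree; the topic and frame names below are examples and must be adapted to your robot:

rostopic hz /camera/color/image_raw
rostopic hz /camera/aligned_depth_to_color/image_raw
rosrun tf tf_echo world camera_link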

We provide recorded data from experiments with an ANYmal robot. Download it here.

To use this data:

roslaunch hydra_ros robot.launch playback_mode:=True

Then play one of the downloaded rosbags:

rosbag play <bag_to_play> --topics /tf /camera/aligned_depth_to_color/image_raw/compressedDepth /camera/color/camera_info /camera/color/image_raw/compressed --clock
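
Because the bag is played with --clock, the ROS nodes must use simulated time. If playback_mode:=True does not already enable this (an assumption about the launch configuration), set it before launching:

rosparam set use_sim_time true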

Task Reasoning

The reasoning module (VLM + LLMs) requires an internet connection.

  • LLM queries are made via the OpenAI API.
  • A large VLM is hosted externally (see semantic_inference_ros for setup instructions).

IMPORTANT: When using the reasoning module, set your OpenAI and FastAPI (see https://github.com/ntnu-arl/semantic_inference_ros) keys as environment variables before launching the ROS nodes:

export OPENAI_API_KEY=<Your OpenAI API Key>
export FASTAPI_API_KEY=<Your server FastAPI Key>

Once the scene graph is constructed, either:

  1. Use the provided RViz GUI to interact with the service and visualize the task reasoning results on the scene graph, or

  2. Call the ROS service directly: /semantic_inference/navigation_prompt_service/navigation_prompt (see the command-line example below).
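
The service can also be inspected and invoked from the command line. The exact request fields depend on the service definition, which rosservice will print; fill them in accordingly when calling:

rosservice info /semantic_inference/navigation_prompt_service/navigation_prompt
rosservice call /semantic_inference/navigation_prompt_service/navigation_prompt <request fields>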


Citation

If you use this work in your research, please cite:

@inproceedings{puigjaner2026reasoninggraph,
    title={Relationship-Aware Hierarchical 3D Scene Graph},
    author={Gassol Puigjaner, Albert and Zacharia, Angelos and Alexis, Kostas},
    booktitle={2026 IEEE International Conference on Robotics and Automation (ICRA)}, 
    year={2026}
}

License

Released under the BSD-3-Clause license.


Acknowledgements

This open-source release is based on work supported by the European Commission through:

  • Project SYNERGISE, under Horizon Europe Grant Agreement No. 101121321

Contact

For questions or support, reach out via GitHub Issues or contact the authors directly.
