This repository contains the code used in the research paper:
Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots
This work introduces CMA-ES-IG, a novel query-generation algorithm designed for efficient exploration of learned representation spaces. CMA-ES-IG is particularly valuable for researchers adapting robot behaviors through interactions with non-expert users. The algorithm generates queries that align with the user's preferences over time, while simultaneously selecting queries that are intuitive and easy for users to answer.
We provide the code for CMA-ES-IG in the file `cmaesig_query_generation.py`. We also provide the web interface we used in our experiments as a resource for other researchers. You will have to supply the code to play a particular behavior ID on your physical robot, though :). We have provided dummy interfaces for you!
- `cmaesig_query_generation.py`: Implementation of the CMA-ES-IG query generation algorithm.
- `cmaes_query_generation.py`: Implementation of the CMA-ES query generation baseline algorithm.
- `preference-learning-from-selection/`: Submodule for the preference learning library (contains the InformationGain query generator implementation).
- `plot_comparisons.py`: Script to visualize simulated results.
- `simulate_preferences.py`: Script to run preference simulation experiments.
- `requirements.txt`: List of Python dependencies.
- `results/`: Cached experimental results from our simulations, including alignment and regret metrics for different algorithms and dimensionality settings.
- `interface/`: The web interface used in our experiments.
  - `start_interface`: Script to launch the web interface.
  - `static/`: Static files for the web interface (e.g., dummy data).
    - `dummy_gestures.npy`: Dummy robot trajectories.
    - `dummy_embeddings.npy`: Dummy feature embeddings for trajectories.
  - `dummy_controller.py`: Example script for controlling a robot (currently a dummy implementation).
  - `preference_engine.py`: Backend logic for the preference learning interface.
Follow these steps to set up the repository:
- Clone the repository:

  ```bash
  git clone https://github.com/interaction-lab/CMA-ES-IG.git
  cd CMA-ES-IG
  ```

- Initialize submodules:

  ```bash
  git submodule update --init
  ```

- Set up a Conda environment (recommended):

  ```bash
  conda create -n cmaesig python=3.10
  conda activate cmaesig
  ```

- Install dependencies:

  ```bash
  pip install -e preference-learning-from-selection
  pip install -r requirements.txt
  ```
This repository includes the simulation data and scripts used in our publication.
- Visualize Pre-computed Results: To plot the comparison graphs (e.g., alignment vs. iteration) from the paper:

  ```bash
  python generate_paper_data/plot_comparisons.py
  ```

  Note: To plot regret instead of alignment, use the `--use-regret` flag. You can also view the parameter sensitivity plot and print out the table data with:

  ```bash
  python generate_paper_data/plot_sensitivity_data.py
  python generate_paper_data/show_table1_data.py
  ```
- Re-run Simulations: To reproduce the simulation experiments:

  ```bash
  python simulate_preferences.py --dim <dimension>
  ```

  Replace `<dimension>` with the desired feature space dimensionality (e.g., 8, 16, or 32, as used in the paper).

  ⚠️ Performance Advisory: The information gain calculation, used by the `infogain` and, consequently, the `CMA-ES-IG` methods, is computationally expensive, and its cost scales with the feature dimensionality and the number of items per query. Simulations for higher dimensions (e.g., 32) may require several hours to complete.
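The advisory above reflects how information-gain objectives of this kind are typically estimated: choice probabilities are computed for every sampled reward weight and every candidate item, so cost grows with both. The following is a minimal, hypothetical sketch of such an estimate under a softmax choice model, not the repository's actual implementation:

```python
import numpy as np

def information_gain(weights, query):
    """Estimate the mutual information between a user's choice and reward weights.

    weights: (M, d) array of sampled reward weight vectors.
    query:   (k, d) array of item feature vectors shown to the user.
    """
    # Choice probabilities under a softmax model, one distribution
    # per weight sample: shape (M, k).
    logits = weights @ query.T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    marginal = probs.mean(axis=0)  # predictive choice distribution
    entropy = lambda p: -np.sum(p * np.log(p + 1e-12), axis=-1)
    # IG = H(marginal choice distribution) - mean per-sample entropy
    return entropy(marginal) - entropy(probs).mean()

rng = np.random.default_rng(0)
w = rng.normal(size=(500, 8))  # 500 weight samples, 8-dim features
q = rng.normal(size=(4, 8))    # one 4-item query
print(information_gain(w, q))
```

The (M, k) matrix products make each evaluation scale with the number of weight samples, items per query, and feature dimensions, and an optimizer evaluates this for many candidate queries per iteration.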
A demonstration web interface is provided as a starting point for other researchers.
- Launch the Interface: From the top-level directory of the repository, run:

  ```bash
  python interface/start_interface.py
  ```

- Access the Interface: Open a web browser and navigate to `http://localhost:8001/study`.
To adapt this interface for your specific robotic platform or application:
- Trajectory and Feature Generation:
  - The system requires robot trajectories (raw command sequences) and corresponding feature vectors (embeddings).
  - The demo uses pre-computed data stored in `interface/static/dummy_gestures.npy` (trajectories) and `interface/static/dummy_embeddings.npy` (features).
  - Your Implementation: You must provide mechanisms to:
    - Generate or load your robot's trajectories.
    - Generate or load the corresponding feature vectors for these trajectories. Ensure these features adhere to the normalization assumption (see Technical Considerations).
    - Replace the dummy `.npy` files or modify the data loading logic within the interface code accordingly. Pre-computation often yields a smoother user experience.
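Replacement data in the same two-file format could be generated along these lines. This is a sketch: the array shapes are illustrative assumptions, not the demo data's actual dimensions.

```python
import os
import numpy as np

rng = np.random.default_rng(42)
os.makedirs("interface/static", exist_ok=True)

# 100 trajectories of 50 timesteps x 6 joint commands (illustrative shapes).
gestures = rng.normal(size=(100, 50, 6))

# One feature vector per trajectory, L2-normalized so every embedding
# lies within the unit ball (see Feature Space Normalization below).
embeddings = rng.normal(size=(100, 8))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

np.save("interface/static/dummy_gestures.npy", gestures)
np.save("interface/static/dummy_embeddings.npy", embeddings)
```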
- Robot Control Backend:
  - The interface communicates with a backend process to execute robot behaviors. The placeholder is `interface/dummy_controller.py`.
  - Your Implementation: Modify or replace `dummy_controller.py` with code that receives a trajectory ID (or the trajectory itself) and commands your physical robot hardware to execute the corresponding behavior. The current implementation merely prints the action; this needs to be replaced with your robot's specific API calls.
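A replacement controller might look like the following skeleton. The function name and the commented-out robot call are hypothetical placeholders for your platform's API, not part of this repository:

```python
import numpy as np

def play_trajectory(trajectory: np.ndarray) -> int:
    """Send each timestep of a trajectory to the robot.

    Hypothetical skeleton: replace the loop body with your robot's actual
    API calls (e.g., publishing joint commands over ROS). Returns the
    number of commands sent, which can help with logging/debugging.
    """
    sent = 0
    for command in trajectory:
        # robot.send_joint_command(command)  # <- your platform's API here
        sent += 1
    return sent

# In the real interface, the trajectory would be looked up by behavior ID
# from a file such as interface/static/dummy_gestures.npy.
demo = np.zeros((50, 6))  # 50 timesteps x 6 joint commands (illustrative)
print(play_trajectory(demo))  # sends 50 commands
```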
- Algorithm Selection:
  - The query generation algorithm can be configured in `preference_engine.py`.
  - To switch between implemented algorithms, modify the string assignment near the end of the file (e.g., from `'CMA-ES-IG'` to `'CMA-ES'` or `'infogain'`).
  - To add custom query generation strategies, follow the structure exemplified in lines 38-48 of `preference_engine.py`, implementing the required interface for your new algorithm.
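At its core, a custom strategy maps the current item set (and, usually, the learned preference model) to a set of items to show next. The class name and method signature below are illustrative assumptions, not the actual interface in `preference_engine.py`; match your implementation to the structure shown there:

```python
import numpy as np

class RandomQueryGenerator:
    """Illustrative baseline strategy: sample query items uniformly.

    Hypothetical skeleton for a custom query generator; adapt the names
    and signature to the structure used in preference_engine.py.
    """

    def __init__(self, items: np.ndarray, query_size: int = 4):
        self.items = items          # (N, d) candidate feature vectors
        self.query_size = query_size

    def generate_query(self, rng=None) -> np.ndarray:
        """Return a (query_size, d) array of items to show the user."""
        rng = rng if rng is not None else np.random.default_rng()
        idx = rng.choice(len(self.items), size=self.query_size, replace=False)
        return self.items[idx]

items = np.random.default_rng(0).normal(size=(20, 8))
generator = RandomQueryGenerator(items, query_size=4)
print(generator.generate_query().shape)  # (4, 8)
```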
- Feature Space Normalization:
  - The underlying preference learning models and the CMA-ES-IG algorithm generally assume that feature vectors are normalized, ideally residing approximately within a unit ball.
  - Practical Steps: Ensure your feature extraction process incorporates normalization. Techniques include:
    - Applying L2 weight penalties during representation learning.
    - Utilizing KL-divergence regularization (common in VAEs).
    - Performing post-hoc scaling (e.g., L2 normalization) of generated feature vectors.
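The post-hoc option is the simplest to retrofit: scale each extracted feature vector to unit L2 norm before handing it to the preference learner. A minimal sketch:

```python
import numpy as np

def l2_normalize(features: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row of `features` to unit L2 norm.

    `eps` guards against division by zero for all-zero rows.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

raw = np.array([[3.0, 4.0], [0.5, 0.0]])
print(l2_normalize(raw))  # rows [0.6, 0.8] and [1.0, 0.0]
```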
- Scalability:
  - The computational complexity of the information gain objective is sensitive to both the dimensionality of the feature space and the number of items presented per query.
  - While CMA-ES-IG offers improved scaling compared to pure information gain optimization, its practical application may become computationally intensive for feature spaces significantly larger than approximately 100 dimensions.
  - For very high-dimensional spaces, consider employing dimensionality reduction techniques (e.g., PCA, autoencoders) prior to applying preference-based optimization.
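As a sketch of that last point, PCA via singular value decomposition can compress raw embeddings before preference-based optimization. The dimensions below are illustrative; note that the projection changes vector norms, so re-normalize the reduced features before use:

```python
import numpy as np

def pca_reduce(features: np.ndarray, n_components: int) -> np.ndarray:
    """Project features onto their top principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
high_dim = rng.normal(size=(200, 256))  # e.g., raw 256-dim embeddings
low_dim = pca_reduce(high_dim, 16)      # compress to 16 dims
print(low_dim.shape)  # (200, 16)
```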
Contributions are welcome! Please follow standard practices: Fork the repository, create a feature branch, and submit a pull request with a clear description of your changes. Adherence to existing coding style is appreciated.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this code or the CMA-ES-IG algorithm in your research, please cite our publication:
@inproceedings{dennler2024improving,
author = {Dennler, Nathaniel and Shi, Zhonghao and Nikolaidis, Stefanos and Matari{\'c}, Maja},
title = {Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots},
booktitle = {International Symposium on Robotics Research (ISRR)},
publisher = {IFRR},
year = {2024},
url = {https://doi.org/10.48550/arXiv.2411.11182}
}