Watch the 3-minute pitch video here
To refer to this work, please cite the following paper:
@INPROCEEDINGS{goerner24,
author={Görner, Michael and Hendrich, Norman and Zhang, Jianwei},
booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
title={Pluck and Play: Self-supervised Exploration of Chordophones for Robotic Playing},
year={2024},
pages={18286-18293},
doi={10.1109/ICRA57147.2024.10610120}
}
The datasets associated with this paper can be found here.
- ROS Noetic (or 2024 ROS-O builds), see package.xml for most ROS dependencies
- The TAMS PR2 stack
- `music_perception`
- `load_venv` (and various packages from `requirements.txt`)
- launch and calibrate regular PR2
- tape plectra to fingertips
- move Guzheng in front of PR2, connect microphone through USB soundcard to basestation
- `mountpr2.sh` on basestation to share workspace between c1 and basestation
- launch `all.launch` (which launches nodes on both basestation and c1)
- launch `rviz.launch` on basestation for a pre-setup visualization
- run `rosrun tams_pr2_guzheng cli`. It provides a command line with various commands to control the framework (see `help` for more information). Everything can be done outside the CLI as well with finer granularity, but it's a useful entry point.
- move MoveIt joint model group `manipulation` to named state `guzheng_initial`, `cli: goto initial`
- activate mannequin mode, `cli: mannequin`
  - very optional: set the head back to default position controller via `hold_head_still.sh` on basestation to keep it fixed
- teach-in string plucking until all strings appear roughly in the right location
- disable mannequin mode & go back to `guzheng_initial`, `cli: mannequin off` & `cli: goto initial`
Throughout geometry exploration, optionally run `rqt_reconfigure` on the `fingertips`, `guzheng`, and `plectrum_poses` namespaces to adjust detection thresholds, clock offsets, Cartesian plectrum poses, and string reconstruction options as dynamic calibration steps.
- explore geometry of demonstrated strings, e.g., `cli: explore_geometry [a4 fis5 ...]`
  - note that you have to confirm trajectory execution at first in the RvizVisualToolsGui, as breakpoints are added before actual execution. Confirming with `continue` will drop further questions.
- eventually fix strings with `cli: fix_strings` once you are happy with the current result. This will automatically...
  - disable string fitter (dynamic reconfigure `active` flag), `cli: fit_strings off`
  - optionally store current geometry (`guzheng/string_fitter/store_to_file` service), `cli: store_strings_to_file`
  - clear the stored dynamics database
- Explore Dynamics through Active Valid Pluck Exploration, `cli: explore_dynamics` or, e.g., `cli: explore_dynamics 1.0 d6 d5 d4` to restrict to specific strings and inward (1.0) or outward (-1.0) direction
- After exploration, make the gathered plucks available for playing, `cli: use_explored_plucks` (careful, this will overwrite the current plucks db if it exists)
- run the repeat-after-me demo node that listens for note onsets from the microphone and will try to imitate melodies, `cli: repeat_after_me`

OR

- start the module that receives pieces (`music_perception/Piece`) to play and builds/executes plucking paths, `cli: start_play_piece`
- play note sequences, `cli: play a4 fis5 d6` (each optionally followed with `:loudness` in range 1-127)
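The note syntax accepted by `play` can be illustrated with a small parser. This is only a sketch and not the package's own code: `parse_note` and `parse_play_arguments` are hypothetical helpers. It assumes that an `is` suffix raises a note by one semitone (as `fis5` suggests) and that octave numbers follow scientific pitch notation, so that `a4` corresponds to MIDI note 69.

```python
import re
from typing import List, NamedTuple, Optional

# Semitone offsets of the natural notes relative to C.
SEMITONES = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}

# letter, optional "is" (sharp, assumed), octave digit, optional ":loudness"
TOKEN = re.compile(r"^([a-g])(is)?(\d)(?::(\d+))?$")


class Note(NamedTuple):
    name: str                 # note name without the loudness suffix
    midi: int                 # MIDI note number (a4 == 69)
    loudness: Optional[int]   # 1-127, or None if not given


def parse_note(token: str) -> Note:
    """Parse one token such as 'a4' or 'fis5:100' (hypothetical helper)."""
    match = TOKEN.match(token.lower())
    if match is None:
        raise ValueError(f"cannot parse note token: {token!r}")
    letter, sharp, octave, loudness_text = match.groups()
    midi = 12 * (int(octave) + 1) + SEMITONES[letter] + (1 if sharp else 0)
    loudness = None
    if loudness_text is not None:
        loudness = int(loudness_text)
        if not 1 <= loudness <= 127:
            raise ValueError(f"loudness out of range 1-127: {loudness}")
    name = letter + (sharp or "") + octave
    return Note(name, midi, loudness)


def parse_play_arguments(arguments: str) -> List[Note]:
    """Parse the whitespace-separated arguments of a play command."""
    return [parse_note(token) for token in arguments.split()]
```

For example, `parse_play_arguments("a4 fis5:100 d6")` yields three notes, of which only the second carries an explicit loudness.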
- `/joint_states` - current joint readings for PR2 & Shadow Hand
- `/hand/rh/tactile` - BioTac readings
- `/tf` - Transforms
- `/tf_static`
- `/diagnostics_agg` - Diagnostics system (useful to detect runtime faults)
- `/mannequin_mode_active` - Is mannequin mode active? (if it is, the robot cannot move by itself)
(tf already includes plectrum/fingertip positions and detected string frames)
- `/guzheng/audio` - unused
- `/guzheng/audio_stamped` - time-stamped audio; depending on the publisher, the stamps are either `ros::Time` or audio pipeline time (which drifts over time)
- `/guzheng/audio_info` - meta data (1 constant latched message)
- `/move_group/monitored_planning_scene` - MoveIt's world model
- `/execute_trajectory/goal` - MoveIt's Trajectory Execution action (which splits trajectories for hand/arm controller and sends them on)
- `/execute_trajectory/result`
- `/run_episode/goal` - generate, execute, and analyze a single pluck (including approach motion)
- `/run_episode/result`
- `/episode/state` - "start"/"end" before/after path is sent to `/pluck/pluck` action
- `/episode/action_parameters` - selected and executed parameters for single episode pluck
- `/pluck/execute_path/goal` - generate and execute a generic Cartesian trajectory with a target frame
- `/pluck/execute_path/result`
- `/pluck/pluck/goal` - same as `execute_path` but provides the following additional debugging output/data collection
- `/pluck/pluck/result`
- `/pluck/commanded_path` - Cartesian path to execute in pluck action
- `/pluck/planned_path` - path from generated joint trajectory
- `/pluck/executed_path` - eventually executed path
- `/pluck/projected_img` - image summarizing the three paths in 2d string space
- `/pluck/trajectory` - generated Trajectory
- `/pluck/executed_trajectory` - recorded trajectory execution
- `/pluck/active_finger` - current finger used in `/pluck` action (used for projection)
- `/pluck/keypoint` - keypoint of the ruckig parameterization selected in `run_episode`
- `/fingertips/plucks` - detected plucking events
- `/fingertips/plucks_projected` - all projected plucks
- `/fingertips/plucks_latest` - latest projected pluck only for visualization
- `/fingertips/pluck_detector/signal` - thresholding signal
- `/fingertips/pluck_detector/detection` - high/low signal to debug signal processing
- `/fingertips/pluck_detector/parameter_descriptions` - dynamic reconfigure for threshold
- `/fingertips/pluck_detector/parameter_updates`
- `/fingertips/pluck_projector/parameter_descriptions` - dynamic reconfigure for pluck projection
- `/fingertips/pluck_projector/parameter_updates`
- `/guzheng/onsets` - currently detected NoteOnsets
- `/guzheng/onsets_markers` - Markers generated from onsets
- `/guzheng/onsets_projected` - all onsets projected according to current parameters
- `/guzheng/onsets_latest` - latest onsets projected for visualization
- `/guzheng/cqt` - cqt generated as a side-product by `detect_onset`
- `/guzheng/onset_detector/envelope` - envelope used to extract peaks as maxima
- `/guzheng/onset_detector/compute_time` - debugging topic to measure computation time
- `/guzheng/onset_detector/drift` - drift compensation for audio input (onsets are shifted by the value)
- `/guzheng/spectrogram` - image visualization of cqt
- `/guzheng/onset_projector/parameter_descriptions` - dynamic reconfigure for onset projection
- `/guzheng/onset_projector/parameter_updates`
- `/guzheng/events` - unused (alternative projector input)
- `/fingertips/events` - unused (alternative projector input)
- `/guzheng/estimate` - current estimate of strings
- `/guzheng/estimate_markers` - `visual_markers` visualization with additional (possibly rejected) candidates
- `/explore/p` - shaped exploration distribution used to sample which strings to explore next
- `/explore/sample_H` - MC sample visualization to determine NBP for string
- `/explore/loudness_strips` - overview of all observed valid loudness samples across strings
- `/explore/episodes_loudness` - observed loudness for all valid samples for string
- `/explore/gp_loudness` - visualization of Gaussian Process from samples for string
- `/explore/gp_std_loudness` - stddev of the same GP
- `/explore/episodes_validity_score` - validity label for all samples for string
- `/explore/p_validity` - visualization of GP probit-regression for validity for string
- `/play_piece/action` - takes `music_perception/Piece`, infers pluck parameters, generates paths, and executes them through `execute_path`
- `/play_piece/piece` - message interface for action goals
- `/play_piece/piece_midi_loudness` - same, but loudness scaled between 1-127 for each onset interpreted relative to explored plucks
- `/play_piece/expressive_range` - summary information of known notes/loudness ranges
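To make the 1-127 loudness scale of `/play_piece/piece_midi_loudness` more concrete, the following sketch maps a MIDI-style value onto the loudness range explored for a note. It assumes a simple linear mapping, which may differ from what the package actually does. `midi_to_loudness`, `explored_min`, and `explored_max` are hypothetical names, and the unit of the loudness values is left open.

```python
def midi_to_loudness(midi_loudness: int, explored_min: float, explored_max: float) -> float:
    """Linearly map a 1-127 value onto [explored_min, explored_max].

    Assumption: 1 corresponds to the quietest explored pluck of a note
    and 127 to the loudest one; the real mapping may be different.
    """
    if not 1 <= midi_loudness <= 127:
        raise ValueError(f"loudness out of range 1-127: {midi_loudness}")
    if explored_max < explored_min:
        raise ValueError("explored_max must not be smaller than explored_min")
    fraction = (midi_loudness - 1) / 126.0  # 0.0 at value 1, 1.0 at value 127
    return explored_min + fraction * (explored_max - explored_min)
```

With an explored range of 10.0 to 40.0, the value 64 lands exactly in the middle of the range (25.0), since (64 - 1) / 126 = 0.5.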
- `target_pluck_string` - a dynamic frame published when `run_episode` attempts to target a string
- `rh_{finger}_plectrum` - tip of the plectrum as calibrated
