|
16 | 16 | <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"> |
17 | 17 | <style> |
18 | 18 | :root { |
19 | | - --highlight-color: rgba(255, 249, 196, 0.79) |
| 19 | + --highlight-color: rgba(255, 249, 196, 0) |
20 | 20 | } |
21 | 21 | </style> |
22 | 22 | </head> |
|
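For context: this hunk changes the alpha channel of the --highlight-color CSS custom property. With alpha 0 the color is fully transparent, so the rows further down that set background-color: var(--highlight-color) keep their markup but show no visible highlight, which is consistent with the "highlighted" note being commented out below. A minimal self-contained sketch of the pattern, assuming this intent (the var() fallback and the row text are illustrative, not from this page):

    <!DOCTYPE html>
    <html>
    <head>
      <style>
        :root {
          /* Alpha 0 hides the highlight without touching the rows that
             reference the variable; restore e.g. 0.79 to bring it back. */
          --highlight-color: rgba(255, 249, 196, 0);
        }
      </style>
    </head>
    <body>
      <table>
        <!-- var() accepts an optional fallback, used if the property is undefined. -->
        <tr style="background-color: var(--highlight-color, transparent)">
          <td>Placeholder highlighted row</td>
        </tr>
      </table>
    </body>
    </html>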
57 | 57 | <tr> |
58 | 58 | <td width="100%" valign="middle"> |
59 | 59 | <p> |
60 | | - <!-- Check out the <a href="https://scholar.google.com/citations?user=joR1Z4UAAAAJ&hl=en&oi=ao" target="_blank">Google Scholar</a> page for a full and up-to-date publication list. --> |
61 | | - <sup>*</sup> denotes equal contribution and <sup>†</sup> denotes equal advising. Representative papers are <span style="background-color: var(--highlight-color)">highlighted</span>. |
| 60 | + <sup>*</sup> denotes equal contribution and <sup>†</sup> denotes equal advising. Below is a selection of papers; the full, up-to-date publication list is available <a href="https://scholar.google.com/citations?hl=en&user=joR1Z4UAAAAJ&view_op=list_works&sortby=pubdate" target="_blank">here</a>. |
| 61 | + <!-- Representative papers are <span style="background-color: var(--highlight-color)">highlighted</span>. --> |
62 | 62 | </p> |
63 | 63 | </td> |
64 | 64 | </tr> |
|
74 | 74 | </tr> |
75 | 75 | </table> |
76 | 76 |
|
| 77 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 78 | + <tr> |
| 79 | + <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
| 80 | + <div class="image-container"> |
| 81 | + <img src='publications/2026_RPL.gif' width="100%"> |
| 82 | + </div> |
| 83 | + </td> |
| 84 | + <td style="width:65%; vertical-align:middle"> |
| 85 | + <papertitle>RPL: Learning Robust Humanoid Perceptive Locomotion on Challenging Terrains</papertitle> |
| 86 | + <br> |
| 87 | + Yuanhang Zhang, Younggyo Seo, Juyue Chen, Yifu Yuan, Koushil Sreenath, Pieter Abbeel<sup>†</sup>, Carmelo Sferrazza<sup>†</sup>, C. Karen Liu<sup>†</sup>, Rocky Duan<sup>†</sup>, Guanya Shi<sup>†</sup> |
| 88 | + <br> |
| 89 | + <a href="https://arxiv.org/abs/2602.03002" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 90 | + <a href="https://rpl-humanoid.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
| 91 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A single RPL-trained policy enables robust, multi-directional humanoid locomotion across diverse challenging terrains. |
| 92 | + </td> |
| 93 | + </tr> |
| 94 | + </table> |
| 95 | + |
| 96 | + <br> |
| 97 | + |
77 | 98 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
78 | 99 | <tr> |
79 | 100 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
|
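Each publication entry in this file repeats the same two-column table: a 35% cell holding a GIF preview and a 65% cell holding the title, authors, links, and TL;DR. A minimal sketch of that per-entry skeleton follows, with hypothetical placeholder paths, names, and URLs; the non-standard <papertitle> tag is assumed to be styled as a block element elsewhere in the site's CSS, and the icon classes come from the Font Awesome stylesheet linked in the <head> above:

    <table width="880" border="0" align="center" cellspacing="0" cellpadding="0">
      <tr>
        <td style="width:35%; vertical-align:middle; padding-right: 20px;">
          <div class="image-container">
            <!-- Hypothetical preview asset; real entries use publications/YEAR_Name.gif -->
            <img src='publications/YYYY_Example.gif' width="100%">
          </div>
        </td>
        <td style="width:65%; vertical-align:middle">
          <papertitle>Example Paper Title</papertitle>
          <br>
          Author One, Author Two<sup>*</sup>
          <br>
          <a href="https://arxiv.org/abs/XXXX.XXXXX" target="_blank"><i class="far fa-file"></i> paper</a>&nbsp;&nbsp;
          <a href="https://example.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a>
          <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: One-sentence summary.</p>
        </td>
      </tr>
    </table>

Accepted papers additionally insert an italic venue line (e.g., <em>International Conference on Robotics and Automation (ICRA)</em>, 2026) between the authors and the links, as the hunks further below do.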
164 | 185 | <tr> |
165 | 186 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
166 | 187 | <div class="image-container"> |
167 | | - <img src='publications/2025_TWIST2.gif' width="90%"> |
| 188 | + <img src='publications/2025_ResMimic.gif' width="90%"> |
168 | 189 | </div> |
169 | 190 | </td> |
170 | 191 | <td style="width:65%; vertical-align:middle"> |
171 | | - <papertitle>TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System</papertitle> |
| 192 | + <papertitle>ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning</papertitle> |
172 | 193 | <br> |
173 | | - Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa<sup>†</sup>, Rocky Duan<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Jiajun Wu<sup>†</sup>, C. Karen Liu<sup>†</sup> |
| 194 | + Siheng Zhao, Yanjie Ze, Yue Wang, C. Karen Liu<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Rocky Duan<sup>†</sup> |
174 | 195 | <br> |
175 | | - <a href="https://arxiv.org/abs/2511.02832" target="_blank"><i class="far fa-file"></i> paper</a>   |
176 | | - <a href="https://yanjieze.com/TWIST2/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
177 | | - <a href="https://github.com/amazon-far/TWIST2" target="_blank"><i class="fas fa-code"></i> code</a> |
178 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: TWIST2 is a portable whole-body humanoid teleoperation system that enables scalable data collection. |
| 196 | + <a href="https://www.arxiv.org/abs/2510.05070" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 197 | + <a href="https://resmimic.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
| 198 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A two-stage humanoid skill learning framework that learns a residual policy on top of a pre-trained general motion tracking policy. |
179 | 199 | </td> |
180 | 200 | </tr> |
181 | 201 | </table> |
|
186 | 206 | <tr> |
187 | 207 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
188 | 208 | <div class="image-container"> |
189 | | - <img src='publications/2025_ResMimic.gif' width="90%"> |
| 209 | + <img src='publications/2025_HDMI.gif' width="90%"> |
190 | 210 | </div> |
191 | 211 | </td> |
192 | 212 | <td style="width:65%; vertical-align:middle"> |
193 | | - <papertitle>ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning</papertitle> |
| 213 | + <papertitle>HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos</papertitle> |
194 | 214 | <br> |
195 | | - Siheng Zhao, Yanjie Ze, Yue Wang, C. Karen Liu<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Rocky Duan<sup>†</sup> |
| 215 | + Haoyang Weng, Yitang Li, Nikhil Sobanbabu, Zihan Wang, Zhengyi Luo, Tairan He, Deva Ramanan, Guanya Shi |
196 | 216 | <br> |
197 | | - <a href="https://www.arxiv.org/abs/2510.05070" target="_blank"><i class="far fa-file"></i> paper</a>   |
198 | | - <a href="https://resmimic.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
199 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A two-stage humanoid skill learning framework that learns a residual policy on top of a pre-trained general motion tracking policy. |
| 217 | + <a href="https://arxiv.org/abs/2509.16757" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 218 | + <a href="https://hdmi-humanoid.github.io/#/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 219 | + <a href="https://github.com/LeCAR-Lab/HDMI" target="_blank"><i class="fas fa-code"></i> code</a> |
| 220 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: From human videos, HDMI learns robust humanoid loco-manipulation skills (e.g., opening a door 67 times in a row). |
200 | 221 | </td> |
201 | 222 | </tr> |
202 | 223 | </table> |
203 | 224 |
|
204 | 225 | <br> |
205 | 226 |
|
| 227 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 228 | + <tr> |
| 229 | + <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
| 230 | + <div class="image-container"> |
| 231 | + <img src='publications/2025_LeVERB.gif' width="90%"> |
| 232 | + </div> |
| 233 | + </td> |
| 234 | + <td style="width:65%; vertical-align:middle"> |
| 235 | + <papertitle>LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction</papertitle> |
| 236 | + <br> |
| 237 | + Haoru Xue<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Dantong Niu<sup>*</sup>, Qiayuan Liao<sup>*</sup>, Thomas Kragerud, Jan Tommy Gravdahl, Xue Bin Peng, Guanya Shi, Trevor Darrell, Koushil Sreenath, Shankar Sastry |
| 238 | + <br> |
| 239 | + <a href="https://arxiv.org/abs/2506.13751" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 240 | + <a href="https://ember-lab-berkeley.github.io/LeVERB-Website/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 241 | + <a href="https://huggingface.co/datasets/ember-lab-berkeley/LeVERB-Bench-Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
| 242 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: LeVERB is a whole-body humanoid VLA that learns an expressive and executable latent action vocabulary to bridge System 1 and System 2. |
| 243 | + </td> |
| 244 | + </tr> |
| 245 | + </table> |
| 246 | + |
| 247 | + <br> |
| 248 | + <br> |
| 249 | + |
| 250 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 251 | + <tr> |
| 252 | + <td width="100%" valign="middle"> |
| 253 | + <heading>2026</heading> |
| 254 | + </td> |
| 255 | + </tr> |
| 256 | + </table> |
| 257 | + |
206 | 258 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
207 | 259 | <tr style="background-color: var(--highlight-color)"> |
208 | 260 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
|
215 | 267 | <br> |
216 | 268 | Harsh Gupta, Xiaofeng Guo, Huy Ha, Chuer Pan, Muqing Cao, Dongjae Lee, Sebastian Scherer, Shuran Song, Guanya Shi |
217 | 269 | <br> |
| 270 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 271 | + <br> |
218 | 272 | <a href="https://arxiv.org/abs/2510.02614" target="_blank"><i class="far fa-file"></i> paper</a>   |
219 | 273 | <a href="https://umi-on-air.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
220 | 274 | <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: EADP steers UMI's embodiment-agnostic diffusion policy using the gradient of the low-level controller's tracking cost, enabling cross-embodiment deployment. |
|
236 | 290 | <br> |
237 | 291 | Lars Ankile, Zhenyu Jiang, Rocky Duan, Guanya Shi, Pieter Abbeel, Anusha Nagabandi |
238 | 292 | <br> |
| 293 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 294 | + <br> |
239 | 295 | <a href="https://arxiv.org/abs/2509.19301" target="_blank"><i class="far fa-file"></i> paper</a>   |
240 | 296 | <a href="https://residual-offpolicy-rl.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
241 | 297 | <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: ResFiT is a sample-efficient residual RL method that trains directly in the real world on a wheeled humanoid with two dexterous hands. |
|
257 | 313 | <br> |
258 | 314 | Lujie Yang<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Zhen Wu<sup>*</sup>, Angjoo Kanazawa<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Carmelo Sferrazza<sup>†</sup>, C. Karen Liu<sup>†</sup>, Rocky Duan<sup>†</sup>, Guanya Shi<sup>†</sup> |
259 | 315 | <br> |
| 316 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 317 | + <br> |
260 | 318 | <a href="https://arxiv.org/abs/2509.26633" target="_blank"><i class="far fa-file"></i> paper</a>   |
261 | 319 | <a href="https://omniretarget.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
262 | 320 | <a href="https://huggingface.co/datasets/omniretarget/OmniRetarget_Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
|
271 | 329 | <tr> |
272 | 330 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
273 | 331 | <div class="image-container"> |
274 | | - <img src='publications/2025_HDMI.gif' width="90%"> |
| 332 | + <img src='publications/2025_TWIST2.gif' width="90%"> |
275 | 333 | </div> |
276 | 334 | </td> |
277 | 335 | <td style="width:65%; vertical-align:middle"> |
278 | | - <papertitle>HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos</papertitle> |
279 | | - <br> |
280 | | - Haoyang Weng, Yitang Li, Nikhil Sobanbabu, Zihan Wang, Zhengyi Luo, Tairan He, Deva Ramanan, Guanya Shi |
| 336 | + <papertitle>TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System</papertitle> |
281 | 337 | <br> |
282 | | - <a href="https://arxiv.org/abs/2509.16757" target="_blank"><i class="far fa-file"></i> paper</a>   |
283 | | - <a href="https://hdmi-humanoid.github.io/#/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
284 | | - <a href="https://github.com/LeCAR-Lab/HDMI" target="_blank"><i class="fas fa-code"></i> code</a> |
285 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: From human videos, HDMI learns robust humanoid loco-manipulation skills (e.g., opening a door continuously for 67 times). |
286 | | - </td> |
287 | | - </tr> |
288 | | - </table> |
289 | | - |
290 | | - <br> |
291 | | - |
292 | | - <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
293 | | - <tr> |
294 | | - <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
295 | | - <div class="image-container"> |
296 | | - <img src='publications/2025_LeVERB.gif' width="90%"> |
297 | | - </div> |
298 | | - </td> |
299 | | - <td style="width:65%; vertical-align:middle"> |
300 | | - <papertitle>LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction</papertitle> |
| 338 | + Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa<sup>†</sup>, Rocky Duan<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Jiajun Wu<sup>†</sup>, C. Karen Liu<sup>†</sup> |
301 | 339 | <br> |
302 | | - Haoru Xue<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Dantong Niu<sup>*</sup>, Qiayuan Liao<sup>*</sup>, Thomas Kragerud, Jan Tommy Gravdahl, Xue Bin Peng, Guanya Shi, Trevor Darrell, Koushil Screenath, Shankar Sastry |
| 340 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
303 | 341 | <br> |
304 | | - <a href="https://arxiv.org/abs/2506.13751" target="_blank"><i class="far fa-file"></i> paper</a>   |
305 | | - <a href="https://ember-lab-berkeley.github.io/LeVERB-Website/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
306 | | - <a href="https://huggingface.co/datasets/ember-lab-berkeley/LeVERB-Bench-Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
307 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: LeVERB is a whole-body humanoid VLA, via learning an expressive and executable latent action vocabulary to bridge system 1 and system 2. |
| 342 | + <a href="https://arxiv.org/abs/2511.02832" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 343 | + <a href="https://yanjieze.com/TWIST2/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 344 | + <a href="https://github.com/amazon-far/TWIST2" target="_blank"><i class="fas fa-code"></i> code</a> |
| 345 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: TWIST2 is a portable whole-body humanoid teleoperation system that enables scalable data collection. |
308 | 346 | </td> |
309 | 347 | </tr> |
310 | 348 | </table> |
311 | 349 |
|
312 | | - <br> |
313 | | - <br> |
314 | | - |
315 | | - <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
316 | | - <tr> |
317 | | - <td width="100%" valign="middle"> |
318 | | - <heading>2026</heading> |
319 | | - </td> |
320 | | - </tr> |
321 | | - </table> |
| 350 | + <br> |
322 | 351 |
|
323 | 352 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
324 | 353 | <tr style="background-color: var(--highlight-color)"> |
|
1693 | 1722 | </tr> |
1694 | 1723 | </table> |
1695 | 1724 |
|
1696 | | - <br> |
| 1725 | + <!-- <br> |
1697 | 1726 | <br> |
1698 | 1727 |
|
1699 | 1728 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
|
1721 | 1750 | </p> |
1722 | 1751 | </td> |
1723 | 1752 | </tr> |
1724 | | - </table> |
| 1753 | + </table> --> |
1725 | 1754 |
|
1726 | 1755 | <br> |
1727 | 1756 | <br> |
|