|
16 | 16 | <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"> |
17 | 17 | <style> |
18 | 18 | :root { |
19 | | - --highlight-color: rgba(255, 249, 196, 0.79) |
| 19 | + --highlight-color: rgba(255, 249, 196, 0) |
20 | 20 | } |
21 | 21 | </style> |
22 | 22 | </head> |
|
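For context: this hunk changes the alpha channel of the --highlight-color CSS custom property. With alpha 0 the color is fully transparent, so the rows further down that set background-color: var(--highlight-color) keep their markup but show no visible highlight, which is consistent with the "highlighted" note being commented out below. A minimal self-contained sketch of the pattern, assuming this intent (the var() fallback and the row text are illustrative, not from this page):

    <!DOCTYPE html>
    <html>
    <head>
      <style>
        :root {
          /* Alpha 0 hides the highlight without touching the rows that
             reference the variable; restore e.g. 0.79 to bring it back. */
          --highlight-color: rgba(255, 249, 196, 0);
        }
      </style>
    </head>
    <body>
      <table>
        <!-- var() accepts an optional fallback, used if the property is undefined. -->
        <tr style="background-color: var(--highlight-color, transparent)">
          <td>Placeholder highlighted row</td>
        </tr>
      </table>
    </body>
    </html>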
57 | 57 | <tr> |
58 | 58 | <td width="100%" valign="middle"> |
59 | 59 | <p> |
60 | | - <!-- Check out the <a href="https://scholar.google.com/citations?user=joR1Z4UAAAAJ&hl=en&oi=ao" target="_blank">Google Scholar</a> page for a full and up-to-date publication list. --> |
61 | | - <sup>*</sup> denotes equal contribution and <sup>†</sup> denotes equal advising. Representative papers are <span style="background-color: var(--highlight-color)">highlighted</span>. |
| 60 | + <sup>*</sup> denotes equal contribution and <sup>†</sup> denotes equal advising. Below is a selection of papers; the full, up-to-date publication list is available <a href="https://scholar.google.com/citations?hl=en&user=joR1Z4UAAAAJ&view_op=list_works&sortby=pubdate" target="_blank">here</a>. |
| 61 | + <!-- Representative papers are <span style="background-color: var(--highlight-color)">highlighted</span>. --> |
62 | 62 | </p> |
63 | 63 | </td> |
64 | 64 | </tr> |
|
74 | 74 | </tr> |
75 | 75 | </table> |
76 | 76 |
|
| 77 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 78 | + <tr> |
| 79 | + <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
| 80 | + <div class="image-container"> |
| 81 | + <img src='publications/2026_RPL.gif' width="100%"> |
| 82 | + </div> |
| 83 | + </td> |
| 84 | + <td style="width:65%; vertical-align:middle"> |
| 85 | + <papertitle>RPL: Learning Robust Humanoid Perceptive Locomotion on Challenging Terrains</papertitle> |
| 86 | + <br> |
| 87 | + Yuanhang Zhang, Younggyo Seo, Juyue Chen, Yifu Yuan, Koushil Sreenath, Pieter Abbeel<sup>†</sup>, Carmelo Sferrazza<sup>†</sup>, C. Karen Liu<sup>†</sup>, Rocky Duan<sup>†</sup>, Guanya Shi<sup>†</sup> |
| 88 | + <br> |
| 89 | + <a href="https://arxiv.org/abs/2602.03002" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 90 | + <a href="https://rpl-humanoid.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
| 91 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A single RPL-trained policy enables robust, multi-directional humanoid locomotion across diverse challenging terrains. |
| 92 | + </td> |
| 93 | + </tr> |
| 94 | + </table> |
| 95 | + |
| 96 | + <br> |
| 97 | + |
77 | 98 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
78 | 99 | <tr> |
79 | 100 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
|
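Each publication entry in this file repeats the same two-column table: a 35% cell holding a GIF preview and a 65% cell holding the title, authors, links, and TL;DR. A minimal sketch of that per-entry skeleton follows, with hypothetical placeholder paths, names, and URLs; the non-standard <papertitle> tag is assumed to be styled as a block element elsewhere in the site's CSS, and the icon classes come from the Font Awesome stylesheet linked in the <head> above:

    <table width="880" border="0" align="center" cellspacing="0" cellpadding="0">
      <tr>
        <td style="width:35%; vertical-align:middle; padding-right: 20px;">
          <div class="image-container">
            <!-- Hypothetical preview asset; real entries use publications/YEAR_Name.gif -->
            <img src='publications/YYYY_Example.gif' width="100%">
          </div>
        </td>
        <td style="width:65%; vertical-align:middle">
          <papertitle>Example Paper Title</papertitle>
          <br>
          Author One, Author Two<sup>*</sup>
          <br>
          <a href="https://arxiv.org/abs/XXXX.XXXXX" target="_blank"><i class="far fa-file"></i> paper</a>&nbsp;&nbsp;
          <a href="https://example.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a>
          <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: One-sentence summary.</p>
        </td>
      </tr>
    </table>

Accepted papers additionally insert an italic venue line (e.g., <em>International Conference on Robotics and Automation (ICRA)</em>, 2026) between the authors and the links, as the hunks further below do.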
164 | 185 | <tr> |
165 | 186 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
166 | 187 | <div class="image-container"> |
167 | | - <img src='publications/2025_TWIST2.gif' width="90%"> |
| 188 | + <img src='publications/2025_ResMimic.gif' width="90%"> |
168 | 189 | </div> |
169 | 190 | </td> |
170 | 191 | <td style="width:65%; vertical-align:middle"> |
171 | | - <papertitle>TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System</papertitle> |
| 192 | + <papertitle>ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning</papertitle> |
172 | 193 | <br> |
173 | | - Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa<sup>†</sup>, Rocky Duan<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Jiajun Wu<sup>†</sup>, C. Karen Liu<sup>†</sup> |
| 194 | + Siheng Zhao, Yanjie Ze, Yue Wang, C. Karen Liu<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Rocky Duan<sup>†</sup> |
174 | 195 | <br> |
175 | | - <a href="https://arxiv.org/abs/2511.02832" target="_blank"><i class="far fa-file"></i> paper</a>   |
176 | | - <a href="https://yanjieze.com/TWIST2/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
177 | | - <a href="https://github.com/amazon-far/TWIST2" target="_blank"><i class="fas fa-code"></i> code</a> |
178 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: TWIST2 is a portable whole-body humanoid teleoperation system that enables scalable data collection. |
| 196 | + <a href="https://www.arxiv.org/abs/2510.05070" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 197 | + <a href="https://resmimic.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
| 198 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A two-stage humanoid skill learning framework that learns a residual policy on top of a pre-trained general motion tracking policy. |
179 | 199 | </td> |
180 | 200 | </tr> |
181 | 201 | </table> |
|
186 | 206 | <tr> |
187 | 207 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
188 | 208 | <div class="image-container"> |
189 | | - <img src='publications/2025_ResMimic.gif' width="90%"> |
| 209 | + <img src='publications/2025_HDMI.gif' width="90%"> |
190 | 210 | </div> |
191 | 211 | </td> |
192 | 212 | <td style="width:65%; vertical-align:middle"> |
193 | | - <papertitle>ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning</papertitle> |
| 213 | + <papertitle>HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos</papertitle> |
194 | 214 | <br> |
195 | | - Siheng Zhao, Yanjie Ze, Yue Wang, C. Karen Liu<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Rocky Duan<sup>†</sup> |
| 215 | + Haoyang Weng, Yitang Li, Nikhil Sobanbabu, Zihan Wang, Zhengyi Luo, Tairan He, Deva Ramanan, Guanya Shi |
196 | 216 | <br> |
197 | | - <a href="https://www.arxiv.org/abs/2510.05070" target="_blank"><i class="far fa-file"></i> paper</a>   |
198 | | - <a href="https://resmimic.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
199 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: A two-stage humanoid skill learning framework that learns a residual policy on top of a pre-trained general motion tracking policy. |
| 217 | + <a href="https://arxiv.org/abs/2509.16757" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 218 | + <a href="https://hdmi-humanoid.github.io/#/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 219 | + <a href="https://github.com/LeCAR-Lab/HDMI" target="_blank"><i class="fas fa-code"></i> code</a> |
| 220 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: From human videos, HDMI learns robust humanoid loco-manipulation skills (e.g., opening a door 67 times in a row). |
200 | 221 | </td> |
201 | 222 | </tr> |
202 | 223 | </table> |
203 | 224 |
|
204 | 225 | <br> |
205 | 226 |
|
| 227 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 228 | + <tr> |
| 229 | + <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
| 230 | + <div class="image-container"> |
| 231 | + <img src='publications/2025_LeVERB.gif' width="90%"> |
| 232 | + </div> |
| 233 | + </td> |
| 234 | + <td style="width:65%; vertical-align:middle"> |
| 235 | + <papertitle>LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction</papertitle> |
| 236 | + <br> |
| 237 | + Haoru Xue<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Dantong Niu<sup>*</sup>, Qiayuan Liao<sup>*</sup>, Thomas Kragerud, Jan Tommy Gravdahl, Xue Bin Peng, Guanya Shi, Trevor Darrell, Koushil Sreenath, Shankar Sastry |
| 238 | + <br> |
| 239 | + <a href="https://arxiv.org/abs/2506.13751" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 240 | + <a href="https://ember-lab-berkeley.github.io/LeVERB-Website/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 241 | + <a href="https://huggingface.co/datasets/ember-lab-berkeley/LeVERB-Bench-Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
| 242 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: LeVERB is a whole-body humanoid VLA that learns an expressive and executable latent action vocabulary to bridge System 1 and System 2. |
| 243 | + </td> |
| 244 | + </tr> |
| 245 | + </table> |
| 246 | + |
| 247 | + <br> |
| 248 | + <br> |
| 249 | + |
| 250 | + <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
| 251 | + <tr> |
| 252 | + <td width="100%" valign="middle"> |
| 253 | + <heading>2026</heading> |
| 254 | + </td> |
| 255 | + </tr> |
| 256 | + </table> |
| 257 | + |
206 | 258 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
207 | 259 | <tr style="background-color: var(--highlight-color)"> |
208 | 260 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
|
215 | 267 | <br> |
216 | 268 | Harsh Gupta, Xiaofeng Guo, Huy Ha, Chuer Pan, Muqing Cao, Dongjae Lee, Sebastian Scherer, Shuran Song, Guanya Shi |
217 | 269 | <br> |
| 270 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 271 | + <br> |
218 | 272 | <a href="https://arxiv.org/abs/2510.02614" target="_blank"><i class="far fa-file"></i> paper</a>   |
219 | 273 | <a href="https://umi-on-air.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
220 | 274 | <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: EADP steers UMI's embodiment-agnostic diffusion policy using the gradient of the low-level controller's tracking cost, enabling cross-embodiment deployment. |
|
236 | 290 | <br> |
237 | 291 | Lars Ankile, Zhenyu Jiang, Rocky Duan, Guanya Shi, Pieter Abbeel, Anusha Nagabandi |
238 | 292 | <br> |
| 293 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 294 | + <br> |
239 | 295 | <a href="https://arxiv.org/abs/2509.19301" target="_blank"><i class="far fa-file"></i> paper</a>   |
240 | 296 | <a href="https://residual-offpolicy-rl.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a> |
241 | 297 | <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: ResFiT is a sample-efficient residual RL method that trains directly in the real world on a wheeled humanoid with two dexterous hands. |
|
257 | 313 | <br> |
258 | 314 | Lujie Yang<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Zhen Wu<sup>*</sup>, Angjoo Kanazawa<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Carmelo Sferrazza<sup>†</sup>, C. Karen Liu<sup>†</sup>, Rocky Duan<sup>†</sup>, Guanya Shi<sup>†</sup> |
259 | 315 | <br> |
| 316 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
| 317 | + <br> |
260 | 318 | <a href="https://arxiv.org/abs/2509.26633" target="_blank"><i class="far fa-file"></i> paper</a>   |
261 | 319 | <a href="https://omniretarget.github.io/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
262 | 320 | <a href="https://huggingface.co/datasets/omniretarget/OmniRetarget_Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
|
271 | 329 | <tr> |
272 | 330 | <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
273 | 331 | <div class="image-container"> |
274 | | - <img src='publications/2025_HDMI.gif' width="90%"> |
| 332 | + <img src='publications/2025_TWIST2.gif' width="90%"> |
275 | 333 | </div> |
276 | 334 | </td> |
277 | 335 | <td style="width:65%; vertical-align:middle"> |
278 | | - <papertitle>HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos</papertitle> |
279 | | - <br> |
280 | | - Haoyang Weng, Yitang Li, Nikhil Sobanbabu, Zihan Wang, Zhengyi Luo, Tairan He, Deva Ramanan, Guanya Shi |
| 336 | + <papertitle>TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System</papertitle> |
281 | 337 | <br> |
282 | | - <a href="https://arxiv.org/abs/2509.16757" target="_blank"><i class="far fa-file"></i> paper</a>   |
283 | | - <a href="https://hdmi-humanoid.github.io/#/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
284 | | - <a href="https://github.com/LeCAR-Lab/HDMI" target="_blank"><i class="fas fa-code"></i> code</a> |
285 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: From human videos, HDMI learns robust humanoid loco-manipulation skills (e.g., opening a door continuously for 67 times). |
286 | | - </td> |
287 | | - </tr> |
288 | | - </table> |
289 | | - |
290 | | - <br> |
291 | | - |
292 | | - <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
293 | | - <tr> |
294 | | - <td style="width:35%; vertical-align:middle; padding-right: 20px;"> |
295 | | - <div class="image-container"> |
296 | | - <img src='publications/2025_LeVERB.gif' width="90%"> |
297 | | - </div> |
298 | | - </td> |
299 | | - <td style="width:65%; vertical-align:middle"> |
300 | | - <papertitle>LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction</papertitle> |
| 338 | + Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa<sup>†</sup>, Rocky Duan<sup>†</sup>, Pieter Abbeel<sup>†</sup>, Guanya Shi<sup>†</sup>, Jiajun Wu<sup>†</sup>, C. Karen Liu<sup>†</sup> |
301 | 339 | <br> |
302 | | - Haoru Xue<sup>*</sup>, Xiaoyu Huang<sup>*</sup>, Dantong Niu<sup>*</sup>, Qiayuan Liao<sup>*</sup>, Thomas Kragerud, Jan Tommy Gravdahl, Xue Bin Peng, Guanya Shi, Trevor Darrell, Koushil Screenath, Shankar Sastry |
| 340 | + <em>International Conference on Robotics and Automation (ICRA)</em>, 2026 |
303 | 341 | <br> |
304 | | - <a href="https://arxiv.org/abs/2506.13751" target="_blank"><i class="far fa-file"></i> paper</a>   |
305 | | - <a href="https://ember-lab-berkeley.github.io/LeVERB-Website/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
306 | | - <a href="https://huggingface.co/datasets/ember-lab-berkeley/LeVERB-Bench-Dataset" target="_blank"><i class="fas fa-database"></i> dataset</a> |
307 | | - <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: LeVERB is a whole-body humanoid VLA, via learning an expressive and executable latent action vocabulary to bridge system 1 and system 2. |
| 342 | + <a href="https://arxiv.org/abs/2511.02832" target="_blank"><i class="far fa-file"></i> paper</a>   |
| 343 | + <a href="https://yanjieze.com/TWIST2/" target="_blank"><i class="fas fa-globe"></i> website</a>   |
| 344 | + <a href="https://github.com/amazon-far/TWIST2" target="_blank"><i class="fas fa-code"></i> code</a> |
| 345 | + <p style="margin-top: 5px"><i class="fas fa-comment-dots"></i> TL;DR: TWIST2 is a portable whole-body humanoid teleoperation system that enables scalable data collection. |
308 | 346 | </td> |
309 | 347 | </tr> |
310 | 348 | </table> |
311 | 349 |
|
312 | | - <br> |
313 | | - <br> |
314 | | - |
315 | | - <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
316 | | - <tr> |
317 | | - <td width="100%" valign="middle"> |
318 | | - <heading>2026</heading> |
319 | | - </td> |
320 | | - </tr> |
321 | | - </table> |
| 350 | + <br> |
322 | 351 |
|
323 | 352 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
324 | 353 | <tr style="background-color: var(--highlight-color)"> |
|
1693 | 1722 | </tr> |
1694 | 1723 | </table> |
1695 | 1724 |
|
1696 | | - <br> |
| 1725 | + <!-- <br> |
1697 | 1726 | <br> |
1698 | 1727 |
|
1699 | 1728 | <table width="880" border="0" align="center" cellspacing="0" cellpadding="0"> |
|
1721 | 1750 | </p> |
1722 | 1751 | </td> |
1723 | 1752 | </tr> |
1724 | | - </table> |
| 1753 | + </table> --> |
1725 | 1754 |
|
1726 | 1755 | <br> |
1727 | 1756 | <br> |
|