- `pick_block` environment with a `Holding` goal for pick-only planning (#18)
- `tests/test_pick_only_goal.py` covering planning with a `Holding`-only goal (#18)
- `tests/test_task_planning.py` covering BFS goal-literal validation (#14)
- BFS now rejects goal atoms whose non-fabricable literals (e.g. `movable`, `surface`) do not appear in the initial state, raising `ValueError` instead of expanding the frontier forever. Fixes a hang triggered by hallucinated perception objects in the goal (#14)
- Pick-only goals no longer crash in cost-function setup: object pairs and movable-to-world collision masks now handle objects that are picked but never placed (#18)
- Consolidated fresh-symbol minting in `get_valid_ground_operators` behind a single `_FABRICABLE_TYPE_PREFIXES` table so the sampler and the BFS goal validator cannot drift out of sync (#14)
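
The goal-literal check described in the entries above can be illustrated with a minimal sketch. It is not the cuTAMP implementation: the prefix values, the `(predicate, *args)` atom layout, and the helper names `is_fabricable` and `validate_goal` are all assumptions made for illustration. The point it shows is that a single prefix table decides which symbols may be minted on demand, and any other goal argument must already exist in the initial state.

```python
# Hypothetical prefix table: symbols with these prefixes may be minted by the planner.
# The actual contents of _FABRICABLE_TYPE_PREFIXES in cuTAMP are not shown in this changelog.
FABRICABLE_TYPE_PREFIXES = ("grasp", "pose", "conf", "traj")


def is_fabricable(symbol: str) -> bool:
    """Return True if the symbol's name marks it as something the planner can create."""
    return symbol.startswith(FABRICABLE_TYPE_PREFIXES)


def validate_goal(goal_atoms, initial_objects):
    """Raise ValueError if a goal atom names a non-fabricable object that is
    missing from the initial state (for example a hallucinated perception object)."""
    known = set(initial_objects)
    for predicate, *args in goal_atoms:
        for arg in args:
            if not is_fabricable(arg) and arg not in known:
                raise ValueError(
                    f"Goal atom {predicate}{tuple(args)} references unknown object {arg!r}"
                )


# A goal over known objects passes; a goal naming an unknown object fails fast
# instead of letting the search expand its frontier without ever terminating.
validate_goal([("On", "block_a", "table")], {"block_a", "table"})
try:
    validate_goal([("On", "ghost_mug", "table")], {"block_a", "table"})
except ValueError as err:
    print(err)
```

Because both the sampler and the validator would consult the same table in this sketch, adding a new fabricable type needs a change in only one place.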
All changes in this release are from #11 (Warp sphere overlap + FK Pose optimizations).
- NVIDIA Warp kernel for `sphere_to_sphere_overlap` with fused cost + analytical gradients, replacing the PyTorch pairwise implementation
- Concatenated `robot_to_movables` kernel launch: a single call over all movable spheres instead of per-object launches
- `torch.profiler.record_function` annotations through the optimization loop, cost function, and rollout
- `--torch-profile`, `--torch-profile-output`, `--coll_n_spheres`, `--placement_shrink_dist`, and `--prop_satisfying_break` CLI flags on `cutamp-demo`
- `blocks_5` environment for benchmarking
- `tests/test_sphere_overlap.py` correctness + gradient tests for the Warp kernel
- `docs/profiling-analysis.md` covering how to profile cuTAMP and where time goes today
- Rollout stores `ee_position` and `ee_quaternion` directly from cuRobo FK; `kinematic_costs` no longer round-trips through `Pose.from_matrix`
- Removed per-step `torch.cuda.synchronize()` in the optimization loop that was forcing CPU-GPU pipeline stalls
End-to-end optimization loop wall time, 100 steps, 512 particles, RTX 3090, median of 3 runs. Speedup scales with the size of the movable-sphere pairwise tensor (more objects × more spheres/object → larger win):
| Env | Before (0.0.3) | After (0.0.4) | Speedup |
|---|---|---|---|
| `tetris_3` (3 blocks, ~6 sph/obj) | 1.46s | 1.36s | 1.07x |
| `blocks` (4 blocks, ~50 sph/obj) | 3.43s | 1.68s | 2.04x |
| `blocks_5` (5 blocks, 50 sph/obj) | 4.98s | 1.87s | 2.66x |
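
The speedup column is the ratio of the before and after wall times. The short check below recomputes it from the timings in the table; the dictionary layout is only an illustration.

```python
# Median wall times in seconds, copied from the table: (before 0.0.3, after 0.0.4).
timings = {
    "tetris_3": (1.46, 1.36),
    "blocks": (3.43, 1.68),
    "blocks_5": (4.98, 1.87),
}

# Speedup = before / after, rounded to two decimals as in the table.
speedups = {env: round(before / after, 2) for env, (before, after) in timings.items()}

for env, factor in speedups.items():
    print(f"{env}: {factor:.2f}x")
```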
- `max_motion_refine_attempts` config option to cap the number of satisfying particles tried during motion refinement per skeleton (#9)
- Retry next plan skeleton when motion refinement fails instead of breaking out of the loop (#9)
- Test environment and pytest for movable-to-world initial collision fix (#10)
- Movable-to-world collision now masks initial timesteps per object, matching the movable-to-movable approach, so objects with perception noise at their initial pose are not incorrectly penalized (#10)
- Vectorized movable-to-world collision into a single batched `collision_fn` call with cached mask for better performance (#10)
- `break_on_satisfying` no longer exits the skeleton loop when motion planning is enabled but all particles fail (#9)
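
The skeleton-retry behaviour from #9 can be sketched as a plain control-flow loop. This is an illustration under stated assumptions, not the planner's actual code: `solve`, `find_satisfying`, and `refine_motion` are hypothetical names, and only `max_motion_refine_attempts` comes from the entries above.

```python
def solve(skeletons, find_satisfying, refine_motion, max_motion_refine_attempts=3):
    """Try each plan skeleton in order and return the first motion-refined plan.

    find_satisfying(skeleton) -> list of satisfying particles (hypothetical callback)
    refine_motion(skeleton, particle) -> plan, or None on failure (hypothetical callback)
    """
    for skeleton in skeletons:
        particles = find_satisfying(skeleton)
        # Cap how many satisfying particles are tried for this skeleton.
        for particle in particles[:max_motion_refine_attempts]:
            plan = refine_motion(skeleton, particle)
            if plan is not None:
                return plan
        # Every attempt failed: move on to the next skeleton rather than
        # breaking out of the outer loop.
    return None


# Usage: refinement succeeds only for the second particle of skeleton "b".
attempts = []

def fake_refine(skeleton, particle):
    attempts.append((skeleton, particle))
    return (skeleton, particle) if (skeleton, particle) == ("b", 2) else None

result = solve(["a", "b"], lambda s: [1, 2, 3, 4, 5], fake_refine)
print(result, len(attempts))
```

With a cap of 3, skeleton `"a"` gives up after three particles and the search still reaches skeleton `"b"`.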
- Expose `__version__` in `__init__.py` via `importlib.metadata`
- Disable CPU pose update to avoid mutating object pose in `TAMPEnvironment`
- Initial release of cuTAMP from NVLabs — GPU-parallelized TAMP solver with core algorithm, cost functions, rollout, samplers, task planning search, and environment definitions (book shelf, stick button, tetris)
- Robot support for Franka Panda, FR3, and UR5e with Robotiq 2F-85/140 grippers
- TiPToP integration: extended algorithm and motion solver, added Franka+Robotiq robot config, OBB collision utilities, new environment assets
- Return failure reason from planner for better diagnostics
- Log git status in experiment logger