@zbbsdsb zbbsdsb commented Dec 22, 2025

🚀 Summary

This PR refactors utils/graphics_utils.py to modernize the core geometry utilities. It adds GPU-accelerated operations, an object-oriented API that encapsulates point-cloud and camera data, and full type hints, while keeping every legacy function signature intact. The goal is better performance, clearer code, and a smoother developer experience for 3D Gaussian Splatting and related tasks.

✨ Motivation

The original utility functions, while functional, were procedural and not optimized for batch operations or modern PyTorch workflows. This refactor addresses these limitations by:

  • Providing a more intuitive and safe interface through dedicated classes.
  • Enabling seamless GPU acceleration for large-scale point clouds and camera operations.
  • Adding utility functions commonly needed in 3D vision pipelines (normalization, projection, etc.).
  • Ensuring the existing codebase is not broken by retaining all legacy function signatures.

📋 Key Changes

1. New Data Structures (Core Enhancement)

  • PointCloud class: A dataclass wrapper for (points, colors, normals) that handles device management, data validation, and transformations. It supports easy .to(device) movement and .numpy() conversion.
  • CameraParameters class: Encapsulates intrinsic/extrinsic camera parameters. It automatically precomputes derived matrices (projection, view) and provides properties like camera_center.
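As a rough illustration of the PointCloud design described above, here is a minimal sketch. Only the field names (points, colors, normals) and the .to(), .numpy(), and from_numpy() entry points come from this description; the validation rules, the float32 default, and the return types are assumptions and may differ from the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np
import torch


@dataclass
class PointCloud:
    """Hypothetical sketch of the wrapper; details are assumed, not taken from the PR."""

    points: torch.Tensor                    # (N, 3)
    colors: Optional[torch.Tensor] = None   # (N, 3) or None
    normals: Optional[torch.Tensor] = None  # (N, 3) or None

    def __post_init__(self) -> None:
        # Validation (assumed rule): points are (N, 3), optional fields match that shape.
        if self.points.ndim != 2 or self.points.shape[1] != 3:
            raise ValueError(f"points must be (N, 3), got {tuple(self.points.shape)}")
        for name in ("colors", "normals"):
            value = getattr(self, name)
            if value is not None and value.shape != self.points.shape:
                raise ValueError(f"{name} must have shape {tuple(self.points.shape)}")

    @classmethod
    def from_numpy(cls, points, colors=None, normals=None) -> "PointCloud":
        # Convert array-likes to float32 tensors (dtype choice is an assumption).
        def conv(a):
            return None if a is None else torch.as_tensor(a, dtype=torch.float32)
        return cls(conv(points), conv(colors), conv(normals))

    def to(self, device) -> "PointCloud":
        # Return a new cloud with every present field moved to `device`.
        def move(t):
            return None if t is None else t.to(device)
        return PointCloud(move(self.points), move(self.colors), move(self.normals))

    def numpy(self):
        # Return (points, colors, normals) as NumPy arrays; missing fields stay None.
        def conv(t):
            return None if t is None else t.detach().cpu().numpy()
        return conv(self.points), conv(self.colors), conv(self.normals)


# Usage: build from NumPy, move to the GPU when one is present, convert back.
cloud = PointCloud.from_numpy(np.zeros((4, 3)), colors=np.ones((4, 3)))
device = "cuda" if torch.cuda.is_available() else "cpu"
pts, cols, nrm = cloud.to(device).numpy()
```

Returning a new object from .to() mirrors how torch.Tensor.to behaves, which keeps device moves free of side effects.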

2. Optimized & New Core Functions

  • transform_points: Enhanced to support batched transformation matrices and improved numerical stability.
  • get_world_to_view_matrix / get_view_to_world_matrix: More logically named, torch-native versions of the original functions.
  • get_projection_matrix: Logic unchanged, but it can now create the matrix on a specified device.
  • New Utilities: Added normalize_points, compute_bounding_box, project_points_to_image, and compute_point_density for common 3D tasks.
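To make the batched behavior of transform_points concrete, the following is a minimal sketch, not the code in this PR. The signature, the row-vector convention (homogeneous points multiplied on the left of the matrix), and the small eps added before the perspective divide are all assumptions chosen for illustration.

```python
import torch


def transform_points(points: torch.Tensor, matrix: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Apply one or more 4x4 homogeneous transforms to 3D points.

    points: (N, 3) or (B, N, 3); matrix: (4, 4) or (B, 4, 4).
    Assumed convention: p_hom @ matrix (row vectors).
    """
    ones = torch.ones_like(points[..., :1])
    points_hom = torch.cat([points, ones], dim=-1)  # (..., N, 4)
    out = points_hom @ matrix                       # matmul broadcasts over the batch dim
    # eps keeps the perspective divide finite when w is exactly zero.
    return out[..., :3] / (out[..., 3:] + eps)


# Usage: one shared point set, two transforms (a translation and the identity).
pts = torch.rand(5, 3)
mats = torch.eye(4).repeat(2, 1, 1)                  # (2, 4, 4)
mats[0, 3, :3] = torch.tensor([1.0, 2.0, 3.0])       # translation lives in the last row
moved = transform_points(pts, mats)                  # shape (2, 5, 3)
```

Because torch.matmul broadcasts leading dimensions, the same function handles a single matrix, a batch of matrices, or a batch of point sets without explicit loops.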

3. Code Quality & Developer Experience

  • Full Type Hints: All functions and methods have Python type annotations.
  • Comprehensive Docstrings: All public APIs include detailed documentation following a standard format.
  • Batch Operations: Key functions support batched inputs for better performance.
  • Explicit Naming: New functions use snake_case for consistency with PyTorch conventions.

4. Backward Compatibility

  • All original functions remain: geom_transform_points, getWorld2View, getWorld2View2, getProjectionMatrix, fov2focal, focal2fov, and the BasicPointCloud NamedTuple are preserved with identical signatures. They act as wrappers to the new implementations where applicable.
  • No breaking changes: Existing code that imports and uses the old API continues to work without modification.
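The wrapper pattern can be sketched as follows. The legacy names fov2focal and focal2fov are the ones listed above; the snake_case names fov_to_focal and focal_to_fov are hypothetical stand-ins for whatever the new implementations are called, and the formulas are the standard pinhole relations between field of view (in radians) and focal length (in pixels).

```python
import math


def fov_to_focal(fov: float, pixels: float) -> float:
    """Focal length in pixels for a field of view in radians (hypothetical new name)."""
    return pixels / (2.0 * math.tan(fov / 2.0))


def focal_to_fov(focal: float, pixels: float) -> float:
    """Field of view in radians for a focal length in pixels (hypothetical new name)."""
    return 2.0 * math.atan(pixels / (2.0 * focal))


# Legacy names keep their original signatures and simply delegate.
def fov2focal(fov, pixels):
    return fov_to_focal(fov, pixels)


def focal2fov(focal, pixels):
    return focal_to_fov(focal, pixels)


# Usage: old call sites keep working and round-trip as before.
focal = fov2focal(math.pi / 2, 800)   # 90 degree FoV across 800 pixels
fov = focal2fov(focal, 800)
```

Keeping the legacy functions as thin delegates means there is a single implementation to test, while existing imports stay valid.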

🧪 Testing

To ensure robustness, the following tests were performed:

  1. Numerical Equivalence: Verified that outputs from new functions (e.g., transform_points) match the original numpy-based functions within a tolerance of 1e-6.
  2. Device Consistency: Tested that all classes and functions behave correctly on both cpu and cuda (if available) devices.
  3. Backward Compatibility: Confirmed that calling legacy functions produces identical results to the previous implementation.
  4. Example Script: The if __name__ == "__main__": block in the file runs a basic smoke test.
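A numerical-equivalence check of the kind described in item 1 could look like the sketch below. Both functions here are illustrative stand-ins written for this example, not the functions in the file, and the check runs in float64 so that the 1e-6 tolerance is not dominated by float32 rounding; that dtype choice is an assumption.

```python
import numpy as np
import torch


def transform_points_numpy(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """NumPy reference (stand-in for a legacy implementation)."""
    hom = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    out = hom @ matrix
    return out[:, :3] / out[:, 3:]


def transform_points_torch(points: torch.Tensor, matrix: torch.Tensor) -> torch.Tensor:
    """Torch version (stand-in for a refactored implementation)."""
    hom = torch.cat([points, torch.ones_like(points[:, :1])], dim=1)
    out = hom @ matrix
    return out[:, :3] / out[:, 3:]


def check_equivalence(n: int = 1000, atol: float = 1e-6, seed: int = 0) -> bool:
    """Compare both implementations on random points and a random translation."""
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n, 3))
    matrix = np.eye(4)
    matrix[3, :3] = rng.standard_normal(3)  # translation, row-vector convention
    expected = transform_points_numpy(points, matrix)
    actual = transform_points_torch(torch.from_numpy(points), torch.from_numpy(matrix))
    return bool(np.allclose(actual.numpy(), expected, atol=atol))


# Usage: returns True when the two implementations agree within atol.
ok = check_equivalence()
```

The same comparison can be repeated with the tensors moved to cuda, when available, to cover the device-consistency case in item 2.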

📝 Checklist

  • My code follows the existing code style of the project (predominantly PyTorch style).
  • I have added comprehensive docstrings to all new public functions and classes.
  • I have added type hints to all functions and significant variables.
  • My changes are focused on a single file (graphics_utils.py) and its logical improvements.
  • I have verified that my changes do not break any existing functionality.

🔮 Future Work (Optional Notes)

This refactor lays the groundwork for several potential future improvements:

  • Integration of these utilities into a more formalized rendering or camera module.
  • Further optimization of compute_point_density for very large point clouds.
  • Addition of more camera models (e.g., fisheye) within the CameraParameters framework.

❓ Additional Notes

  • The PointCloud class stores its data as torch.Tensor objects. Users with NumPy arrays can construct one with PointCloud.from_numpy().
  • Performance gains are most noticeable when processing large batches of points or cameras on GPU.
