Conversation


@cdtwigg cdtwigg commented Jan 21, 2026

Summary:
Changes the tJacobian_ and tResidual_ member variables in SolverFunctionT from
MatrixX/VectorX to ResizeableMatrix. This provides more efficient memory management:

  • ResizeableMatrix is backed by a std::vector, which keeps its capacity when the
    logical size decreases and only reallocates when that capacity is exceeded
  • Uses resizeAndSetZero() which combines resize and zeroing in one operation
  • Provides aligned memory access via Eigen::Aligned16 maps

Benchmarks show ~1% improvement for double-precision workloads on the getJtJR()
path. Float performance is unchanged.

Also added ResizeableMatrix benchmarks to jtj_interface_benchmark.cpp for
comparison testing.
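As a rough illustration of the first two points, a ResizeableMatrix-like buffer can be sketched as below. This is a hypothetical, minimal sketch, not the actual Momentum class; the Eigen::Aligned16 map accessors are omitted so the sketch stays self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a ResizeableMatrix-style buffer. The backing
// std::vector keeps its capacity when the logical size shrinks, so
// repeated resizes across solver iterations avoid reallocation. The real
// class additionally exposes Eigen::Map<..., Eigen::Aligned16> views over
// this storage; that part is omitted here.
class ResizeableMatrixSketch {
 public:
  // Resize the logical dimensions and zero the contents in one operation.
  void resizeAndSetZero(std::size_t rows, std::size_t cols) {
    rows_ = rows;
    cols_ = cols;
    const std::size_t n = rows * cols;
    if (n > data_.size()) {
      data_.assign(n, 0.0);  // grows; reallocates only past capacity
    } else {
      std::fill(data_.begin(), data_.begin() + n, 0.0);  // no reallocation
    }
  }
  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }
  std::size_t capacity() const { return data_.capacity(); }
  double* data() { return data_.data(); }

 private:
  std::vector<double> data_;
  std::size_t rows_ = 0;
  std::size_t cols_ = 0;
};
```

Shrinking from 100x10 to 10x10 with this sketch only zeroes the live region; the vector's capacity (and allocation) is untouched.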

Reviewed By: jeongseok-meta

Differential Revision: D90813663

The meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Jan 21, 2026.

meta-codesync bot commented Jan 21, 2026

@cdtwigg has exported this pull request. If you are a Meta employee, you can view the originating Diff in D90813663.

cdtwigg added a commit that referenced this pull request Jan 21, 2026
cdtwigg added a commit that referenced this pull request Jan 29, 2026

Summary:
Introduced a new block-wise interface for computing Jacobians that allows
solvers to access individual Jacobian blocks without requiring the full Jacobian
matrix to be materialized in memory. Previously, getJtJR() computed JtJ block-by-block
internally, which limited optimization opportunities in the solver. The new interface
exposes initializeJacobianComputation(), getJacobianBlockCount(), getJacobianBlockSize(),
and computeJacobianBlock() methods that enable the solver to control how blocks are
processed and combined.

Both getJacobian() and getJtJR() are now implemented in the parent SolverFunctionT
class using the block interface, removing code duplication from SkeletonSolverFunctionT.
The block interface becomes the primitive, with the full-matrix methods being derived
implementations. This provides flexibility for solvers to implement custom optimization
strategies while maintaining backward compatibility.
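Under the simplifying assumption that matrices are flat row-major std::vector<double> arrays (the real code uses Eigen types, and the exact signatures may differ), deriving getJtJR() from the block primitive can be sketched as:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the block-wise Jacobian interface. Method names
// mirror the summary above; the types are simplified so the sketch is
// self-contained.
struct JacobianBlock {
  std::vector<double> J;  // rows x numParams, row-major
  std::vector<double> r;  // residual, length rows
  std::size_t rows = 0;
};

class BlockwiseFunction {
 public:
  BlockwiseFunction(std::vector<JacobianBlock> blocks, std::size_t numParams)
      : blocks_(std::move(blocks)), numParams_(numParams) {}

  std::size_t getJacobianBlockCount() const { return blocks_.size(); }
  std::size_t getJacobianBlockSize(std::size_t i) const { return blocks_[i].rows; }
  const JacobianBlock& computeJacobianBlock(std::size_t i) const { return blocks_[i]; }

  // getJtJR() derived from the block primitive: accumulate JtJ and JtR
  // block by block, never materializing the full Jacobian.
  void getJtJR(std::vector<double>& JtJ, std::vector<double>& JtR) const {
    JtJ.assign(numParams_ * numParams_, 0.0);
    JtR.assign(numParams_, 0.0);
    for (std::size_t b = 0; b < getJacobianBlockCount(); ++b) {
      const JacobianBlock& blk = computeJacobianBlock(b);
      for (std::size_t row = 0; row < blk.rows; ++row) {
        const double* Jrow = &blk.J[row * numParams_];
        for (std::size_t i = 0; i < numParams_; ++i) {
          JtR[i] += Jrow[i] * blk.r[row];
          for (std::size_t j = 0; j < numParams_; ++j)
            JtJ[i * numParams_ + j] += Jrow[i] * Jrow[j];
        }
      }
    }
  }

 private:
  std::vector<JacobianBlock> blocks_;
  std::size_t numParams_;
};
```

With this shape, a solver is free to consume blocks one at a time, regroup them, or assemble the full Jacobian, which is the flexibility the summary describes.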

Differential Revision: D89145782

Summary:
Split the GaussNewtonSolver into two separate classes: GaussNewtonSolverT
(dense) and SparseGaussNewtonSolverT (sparse). Previously, GaussNewtonSolver automatically
switched between dense and sparse implementations based on sparseMatrixThreshold,
creating code complexity and mixing two distinct implementations in one class.

Changes:
- Created SparseGaussNewtonSolverT in gauss_newton_solver_sparse.{h,cpp}
- Moved all sparse-specific code (sparse matrix operations, SimplicialLLT, etc.) to sparse solver
- Removed sparse-specific options from GaussNewtonSolverOptions (useBlockJtJ, directSparseJtJ, sparseMatrixThreshold)
- Updated GaussNewtonSolverT to be dense-only, removed doIterationSparse() method
- Updated all tests to remove references to removed sparse options
- Updated transform_pose.cpp and other users to not set removed options
- Added SparseGaussNewtonSolverOptions with sparse-specific options

Benefits:
- Clearer separation of concerns - each solver focuses on one implementation
- Simpler code - no runtime switching between dense/sparse
- Easier to maintain and test each solver independently
- Users explicitly choose which solver to use based on their needs
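The resulting structure can be sketched roughly as follows. All names below are illustrative stand-ins, not the actual Momentum API: each solver carries only the options that apply to it, and callers pick the dense or sparse implementation explicitly instead of relying on a runtime threshold switch.

```cpp
#include <cstring>

// Hypothetical sketch: dense-only and sparse-only solvers with separate
// option structs, chosen explicitly by the caller.
struct GaussNewtonSolverOptionsSketch {
  double regularization = 1e-4;  // dense-only options live here
};

struct SparseGaussNewtonSolverOptionsSketch {
  double regularization = 1e-4;
  bool directSparseJtJ = true;  // sparse-specific options live here
};

class GaussNewtonSolverSketch {  // dense-only: no doIterationSparse()
 public:
  explicit GaussNewtonSolverSketch(GaussNewtonSolverOptionsSketch o) : opts_(o) {}
  const char* kind() const { return "dense"; }
 private:
  GaussNewtonSolverOptionsSketch opts_;
};

class SparseGaussNewtonSolverSketch {  // sparse-only: SimplicialLLT etc.
 public:
  explicit SparseGaussNewtonSolverSketch(SparseGaussNewtonSolverOptionsSketch o)
      : opts_(o) {}
  const char* kind() const { return "sparse"; }
 private:
  SparseGaussNewtonSolverOptionsSketch opts_;
};
```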

Differential Revision: D89145780

Summary:
The base class SolverT::solve() already initializes lastError_ to max before
calling initializeSolver(), so the derived solvers don't need to do it again.

Differential Revision: D91158885

Summary:
The alpha_ member was intended to allow per-iteration adjustment of the
regularization parameter, but it was never actually modified during iterations.
It was simply copied from regularization_ at initialization and used unchanged.
This removes the unnecessary indirection and uses regularization_ directly.

Differential Revision: D91158884

Summary:
Refactors GaussNewtonSolverQRT to use the generic SolverFunctionT interface with
the new blockwise Jacobian API instead of directly depending on SkeletonSolverFunctionT.

Key changes:
- Constructor now takes SolverFunctionT* instead of SkeletonSolverFunctionT*
- Removed internal SkeletonState and MeshState management (now handled by SolverFunction
  via initializeJacobianComputation)
- Replaced direct iteration over error functions with the blockwise interface:
  getJacobianBlockCount(), getJacobianBlockSize(), computeJacobianBlock()
- Added vec() accessor to ResizeableMatrix for single-column matrix access

This makes the QR solver more generic and decoupled from skeleton-specific code,
allowing it to work with any SolverFunction that implements the blockwise interface.
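The idea behind the vec() accessor can be sketched as below. This is a hypothetical stand-in: the real accessor presumably returns an Eigen vector map over the ResizeableMatrix storage, while here a raw pointer/length pair plays that role so the sketch stays self-contained.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of a vec()-style accessor: for a buffer holding a
// single column, vec() hands back a view of the contiguous storage so
// callers can treat it as a vector without copying.
class SingleColumnBufferSketch {
 public:
  void resizeAndSetZero(std::size_t rows) { data_.assign(rows, 0.0); }

  // View of the single column as a vector (pointer + length).
  std::pair<double*, std::size_t> vec() { return {data_.data(), data_.size()}; }

 private:
  std::vector<double> data_;
};
```

Writes through the returned view land directly in the underlying storage, which is what lets the QR solver consume residual columns without an intermediate copy.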

Differential Revision: D90812388

…port

Summary:
Two related optimizations to GaussNewtonSolver:

**Chunked Jacobian Accumulation:**
- Uses a single large JᵀJ matrix multiplication per chunk instead of many small rankUpdate() calls
- Adds targetRowsPerChunk option to control memory vs speed trade-off:
  - SIZE_MAX (default): Process all rows in one chunk (fastest, most memory)
  - 0: Process each error function block separately (slowest, minimum memory)
  - Intermediate values: Group blocks via bin-packing up to target row count
- Uses ResizeableMatrix for Jacobian/residual storage to avoid reallocations

**BlockJtJ Support:**
- Adds useBlockJtJ option with proper handling of non-contiguous parameter subsets
- Simplified the parameter mapping to a plain vector<int> enabledParameters_
- Added computeJtJFromBlockJtJ() with in-place row/column compaction, avoiding temporary allocation
- In-place approach shows 6-17% speedup over temporary-allocation approach

Performance results (ManyBlocks benchmark with 26 error functions, 10 iterations):
| Solver              | float (μs) | double (μs) |
|---------------------|------------|-------------|
| GaussNewton         | 244        | 257         |
| SubsetGaussNewton   | 243        | 256         |

GaussNewton now matches SubsetGaussNewton performance (within 1%).
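The targetRowsPerChunk grouping can be sketched as a greedy pass over consecutive blocks. This is an illustration of the three regimes only; the solver's actual bin-packing policy may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of targetRowsPerChunk grouping: consecutive
// error-function blocks are packed into chunks until adding the next
// block would exceed the target row count. SIZE_MAX yields one big chunk
// (fastest, most memory); 0 yields one chunk per block (slowest, least
// memory); intermediate values trade between the two.
std::vector<std::vector<std::size_t>> packBlocksIntoChunks(
    const std::vector<std::size_t>& blockRows, std::size_t targetRowsPerChunk) {
  std::vector<std::vector<std::size_t>> chunks;
  std::vector<std::size_t> current;
  std::size_t currentRows = 0;
  for (std::size_t b = 0; b < blockRows.size(); ++b) {
    // Close the current chunk if adding this block would exceed the target.
    if (!current.empty() && currentRows + blockRows[b] > targetRowsPerChunk) {
      chunks.push_back(current);
      current.clear();
      currentRows = 0;
    }
    current.push_back(b);
    currentRows += blockRows[b];
  }
  if (!current.empty()) chunks.push_back(current);
  return chunks;
}
```

Each resulting chunk can then be accumulated with a single JᵀJ multiplication over its stacked rows instead of one rankUpdate() per block.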

Differential Revision: D89145781