Commit ea2efaf

perf: Enable TBB parallelization unconditionally in HessianFactor
Remove the conditional check for GTSAM_TBB_BOUNDED_MEMORY_GROWTH_FLAG so that TBB parallelization is always enabled when GTSAM_USE_TBB is defined.

On my test case with 12 cores this improves optimizer run time by ~27% (117.2s -> 85.7s). Successful iterations drop from ~5-7s to ~3-4s each.

Before:

```
./build_RelWithDebInfo/simple_gtsam_deserialize simple_gtsam_deserialize2.cpp
Reading values file took 0.0108 seconds
Reading graph file took 0.1702 seconds
Processing factor lines took 2.7537 seconds
Processing values lines took 0.1056 seconds
Setting up optimizer took 9.1390 seconds
Initial error: 83467.7055, values: 65796
iter cost cost_change lambda success iter_time
0 inf 0.00 0.00 0 1.47
iter cost cost_change lambda success iter_time
0 80421.88 3045.82 0.00 1 6.98
1 inf 0.00 0.00 0 0.89
1 80163.07 258.81 0.00 1 6.23
2 inf 0.00 0.00 0 1.07
2 80113.89 49.18 0.00 1 6.51
3 inf 0.00 0.00 0 1.09
3 80096.39 17.50 0.00 1 6.75
4 inf 0.00 0.00 0 1.10
4 80089.79 6.60 0.00 1 5.43
5 inf 0.00 0.00 0 0.86
5 80086.41 3.38 0.00 1 5.08
6 inf 0.00 0.00 0 0.89
6 80084.44 1.98 0.00 1 5.02
7 inf 0.00 0.00 0 0.88
7 80083.05 1.39 0.00 1 5.06
8 inf 0.00 0.00 0 0.88
8 80081.84 1.21 0.00 1 5.31
9 inf 0.00 0.00 0 0.89
9 80080.52 1.32 0.00 1 5.14
10 inf 0.00 0.00 0 0.90
10 80078.92 1.60 0.00 1 5.20
11 inf 0.00 0.00 0 0.87
11 80076.99 1.93 0.00 1 5.27
12 inf 0.00 0.00 0 0.90
12 80074.94 2.05 0.00 1 5.25
13 inf 0.00 0.00 0 0.88
13 80073.22 1.72 0.00 1 5.23
14 inf 0.00 0.00 0 0.88
14 80071.87 1.34 0.00 1 5.08
15 inf 0.00 0.00 0 0.87
15 80071.04 0.83 0.00 1 5.09
16 inf 0.00 0.00 0 0.90
16 80070.76 0.28 0.00 1 5.22
Running gtsam optimizer took 117.2392 seconds
```

After:

```
./build_RelWithDebInfo/simple_gtsam_deserialize simple_gtsam_deserialize2.cpp
Reading values file took 0.0124 seconds
Reading graph file took 0.1912 seconds
Processing factor lines took 3.0148 seconds
Processing values lines took 0.1227 seconds
Setting up optimizer took 10.6574 seconds
Initial error: 83467.7055, values: 65796
iter cost cost_change lambda success iter_time
0 inf 0.00 0.00 0 1.96
iter cost cost_change lambda success iter_time
0 80421.88 3045.82 0.00 1 5.44
1 inf 0.00 0.00 0 0.97
1 80163.07 258.81 0.00 1 3.14
2 inf 0.00 0.00 0 0.92
2 80113.89 49.18 0.00 1 2.98
3 inf 0.00 0.00 0 0.92
3 80096.39 17.50 0.00 1 3.09
4 inf 0.00 0.00 0 0.91
4 80089.79 6.60 0.00 1 2.98
5 inf 0.00 0.00 0 0.91
5 80086.41 3.38 0.00 1 2.94
6 inf 0.00 0.00 0 0.90
6 80084.44 1.98 0.00 1 3.00
7 inf 0.00 0.00 0 0.90
7 80083.05 1.39 0.00 1 2.99
8 inf 0.00 0.00 0 0.91
8 80081.84 1.21 0.00 1 3.09
9 inf 0.00 0.00 0 0.90
9 80080.52 1.32 0.00 1 3.06
10 inf 0.00 0.00 0 1.12
10 80078.92 1.60 0.00 1 3.69
11 inf 0.00 0.00 0 1.27
11 80076.99 1.93 0.00 1 3.72
12 inf 0.00 0.00 0 1.19
12 80074.94 2.05 0.00 1 3.94
13 inf 0.00 0.00 0 1.19
13 80073.22 1.72 0.00 1 3.81
14 inf 0.00 0.00 0 1.15
14 80071.87 1.34 0.00 1 3.60
15 inf 0.00 0.00 0 1.15
15 80071.04 0.83 0.00 1 3.86
16 inf 0.00 0.00 0 1.15
16 80070.76 0.28 0.00 1 3.85
Running gtsam optimizer took 85.6852 seconds
```
1 parent 1e72eb1 commit ea2efaf

1 file changed: +1 -1 lines changed

gtsam/linear/HessianFactor.cpp

Lines changed: 1 addition & 1 deletion
@@ -258,7 +258,7 @@ HessianFactor::HessianFactor(const GaussianFactorGraph& factors,
   // already running in parallel (e.g., when constructing multiple factors
   // concurrently), so there's no need to parallelize the inner
   // updateHessian loop here as well.
-#if defined(GTSAM_USE_TBB) && defined(GTSAM_TBB_BOUNDED_MEMORY_GROWTH_FLAG)
+#if defined(GTSAM_USE_TBB)
   constexpr DenseIndex kParallelThresholdHeuristic = 50;
   if (info_.rows() > kParallelThresholdHeuristic) {
     gttic(updateHessian_TBB);
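
For readers unfamiliar with the pattern this hunk touches, here is a minimal standalone sketch of a compile-time guard combined with a runtime size threshold. It is not the GTSAM implementation: `SKETCH_USE_TBB`, `kSketchParallelThreshold`, `accumulateContributions`, and `wouldRunInParallel` are hypothetical names made up for illustration, and a plain sum stands in for the Hessian update. Only the threshold value 50 and the shape of the guard are taken from the diff.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

#if defined(SKETCH_USE_TBB)  // hypothetical macro standing in for GTSAM_USE_TBB
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
#endif

// Hypothetical threshold mirroring kParallelThresholdHeuristic = 50 in the diff.
constexpr std::size_t kSketchParallelThreshold = 50;

// Sums the per-term contributions. Both paths compute the same sum (up to
// floating-point rounding order); only the execution strategy differs.
inline double accumulateContributions(const std::vector<double>& terms) {
#if defined(SKETCH_USE_TBB)
  if (terms.size() > kSketchParallelThreshold) {
    // Large input: split the index range across TBB worker threads and
    // combine the partial sums.
    return tbb::parallel_reduce(
        tbb::blocked_range<std::size_t>(0, terms.size()), 0.0,
        [&](const tbb::blocked_range<std::size_t>& r, double partial) {
          for (std::size_t i = r.begin(); i != r.end(); ++i) partial += terms[i];
          return partial;
        },
        [](double a, double b) { return a + b; });
  }
#endif
  // Small input, or TBB not compiled in: plain serial loop.
  return std::accumulate(terms.begin(), terms.end(), 0.0);
}

// Reports whether an input of the given size would take the parallel path.
inline bool wouldRunInParallel(std::size_t size) {
#if defined(SKETCH_USE_TBB)
  return size > kSketchParallelThreshold;
#else
  (void)size;  // the parallel path is compiled out entirely
  return false;
#endif
}
```

With a single macro in the guard, as after this commit, the parallel path is available in every TBB build; the runtime threshold alone decides whether a given input is large enough to use it.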
