
Update Betweenness Centrality normalization #4974

Merged

Conversation

ChuckHastings (Collaborator) commented Mar 14, 2025

Betweenness Centrality normalization is incorrect when endpoints are excluded and approximate betweenness (sampled sources) is used.

This PR temporarily disables some of the python tests that compare results with networkx, since the networkx PR that updates the normalization scores is not yet merged. Once networkx/networkx#7908 is merged we should be able to create another PR to re-enable those tests. Each disabled test is skipped with a link to that PR as the reason.

Closes #4941

copy-pr-bot commented Mar 14, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

ChuckHastings self-assigned this Mar 14, 2025
ChuckHastings added the bug (Something isn't working) and non-breaking (Non-breaking change) labels Mar 14, 2025
ChuckHastings marked this pull request as ready for review Mar 14, 2025 22:35
ChuckHastings requested a review from a team as a code owner Mar 14, 2025 22:35
ChuckHastings requested a review from a team as a code owner Mar 18, 2025 20:00
seunghwak (Contributor) left a comment

Looks good to me, but I have some questions about the logic that sets scale_factor.

}
} else if (graph_view.number_of_vertices() > 2) {
scale_factor = static_cast<weight_t>(
std::min(static_cast<vertex_t>(num_sources), graph_view.number_of_vertices() - 1) *
seunghwak (Contributor):

No need to subtract 1 from num_sources as well (i.e., static_cast<vertex_t>(num_sources - 1))?

I assume num_sources == graph_view.number_of_vertices() for full BC. It looks a bit odd to subtract 1 only from graph_view.number_of_vertices().

ChuckHastings (Collaborator, Author):

We went through some complex gyrations to arrive at these formulas.

There are a couple of things being accounted for in the scaling factor. In the normalization path, we divide by the maximum number of times a vertex could appear on a shortest path. For the full graph, since we're not including endpoints, this is (n - 1) * (n - 2), where n is the number of vertices in the graph. The maximum occurs for a vertex v that has an incoming edge from every other vertex in the graph. The (n - 1) factor counts every source vertex other than v (when we start at v we won't travel back to v, and we're not counting the endpoint), and the (n - 2) factor is the maximum number of paths that could travel through v.

For approximate betweenness, we only traverse from num_sources sampled sources, so the maximum value is num_sources * (n - 2). This occurs in any variant of the graph described above where the randomly selected sources do not include the vertex v.

I agree it looks odd.

scale_factor = (graph_view.is_symmetric() ? weight_t{2} : weight_t{1}) *
static_cast<weight_t>(num_sources) /
(include_endpoints ? static_cast<weight_t>(graph_view.number_of_vertices())
: static_cast<weight_t>(graph_view.number_of_vertices() - 1));
seunghwak (Contributor):

We don't check vertices.size() (or the sum of vertices.size() across GPUs in multi-GPU) > 0. So it is technically possible to pass empty seed vertices, in which case num_sources == 0 and graph_view.number_of_vertices() == 1 is possible; scale_factor can then become 0, leading to a divide by zero.

ChuckHastings (Collaborator, Author):

Good catch. That check exists in the earlier if branches but not in this one. I will add it.

ChuckHastings (Collaborator, Author):

Just pushed an update.

ChuckHastings (Collaborator, Author):

/merge

rapids-bot merged commit 6ef7d0b into rapidsai:branch-25.04 Mar 20, 2025
82 checks passed
Labels
bug Something isn't working cuGraph non-breaking Non-breaking change python
Development

Successfully merging this pull request may close these issues.

[BUG]: normalization issue in betweenness_centrality
4 participants