Skip to content

[Simplex_tree] Fast stars access #877

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 61 commits into from
Jul 7, 2023

Conversation

VincentRouvreau
Copy link
Contributor

These development was taken from @ClementMaria on #401 and bugs were fixed.
Tested with sanitizer and there is no memory leaks.
As it is a development in progress I did not perform benchmarks yet.

Copy link
Member

@mglisse mglisse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, for some reason github shows all these old comments as "pending", maybe I forgot to "finish" them...

@VincentRouvreau VincentRouvreau marked this pull request as ready for review June 30, 2023 15:09
&Hooks_simplex_base_link_nodes::list_max_vertex_hook_>
List_member_hook_t;
// auto_unlink in Member_hook_t is incompatible with constant time size
typedef boost::intrusive::list<Hooks_simplex_base_link_nodes, List_member_hook_t,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to make this a boost::intrusive::list<Dit_value_t, ...>? Then we wouldn't even need simplex_handle_from_node.

Comment on lines +94 to +95
typedef boost::filter_iterator<is_coface, typename SimplexTree::List_max_vertex::iterator>
Filtered_cofaces_simplex_iterator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this whole class basically a filtered_range? Wouldn't using filtered_range save a lot of code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a boost::filtered_range<decltype(is_coface()), typename SimplexTree::Hooks_simplex_base> ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filtered_range<is_coface, List_max_vertex>?

return (&(sh_->second) == &(other.sh_->second));
}

Simplex_handle const& dereference() const { return sh_; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't much point returning a const reference to a pointer. I guess it will always be inlined so it doesn't matter.

Siblings* sib_; //
// use a bfs search to avoid calling sib_->members().find(.)
// A static_vector is quite faster than a queue
// 100 seems a conservative bound on the number of children a Sibling can coutain for now.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? The root Siblings can have as many children as there are vertices (minus 1). And even if that one somehow cannot appear in coface computation, the Siblings of the first vertex can have n-2 (you can just add all possible triangles for instance).
If the size of this container is indeed related to Siblings (that would make sense for a BFS), we cannot use a static_vector. If instead it is bounded by the depth of the Simplex_tree (which would probably be the case for a DFS?), we could use 40 like we do elsewhere.

I don't know if there was a good reason for Clément to choose a BFS or if he picked this order arbitrarily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... You are right. From the test I performed, the size was not going up to 20 (with my benchmark), this is why I put 100. I don't know what was Clément intention, but let's go with a std::vector as timings are still good and it will be dynamic.
I don't think it will grow that much as the main code that is used is:

      if (!children_stack_.empty()) {
        sib_ = children_stack_.back();
        children_stack_.pop_back();
        sh_ = sib_->members().begin();  // don't track dimensions
        if (st_->has_children(sh_)) {
          children_stack_.push_back(st_->children(sh_));
        }
      }

I am using a std::vector from ca218a9

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say the "main" code is ++sh_ on the first line (iterate on all the nodes of this Siblings) and then the last else branch that adds the corresponding children. So it adds all the children to the stack before it starts looking at them (normal for BFS, although it could be done more implicitly) and the structure can contain roughly n_vertices * dimension objects. Of course, this structure will only grow large in cases where one vertex has many neighbors, which is a bad case for this optimization. But at least we want to avoid crashing in that case, even if it will be slow. So yes, vector (or stack<vector>).

Siblings* sib_; //
// use a bfs search to avoid calling sib_->members().find(.)
// A static_vector is quite faster than a queue
// 100 seems a conservative bound on the number of children a Sibling can coutain for now.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say the "main" code is ++sh_ on the first line (iterate on all the nodes of this Siblings) and then the last else branch that adds the corresponding children. So it adds all the children to the stack before it starts looking at them (normal for BFS, although it could be done more implicitly) and the structure can contain roughly n_vertices * dimension objects. Of course, this structure will only grow large in cases where one vertex has many neighbors, which is a bad case for this optimization. But at least we want to avoid crashing in that case, even if it will be slow. So yes, vector (or stack<vector>).

@@ -2001,6 +2175,7 @@ struct Simplex_tree_options_full_featured {
static const bool store_key = true;
static const bool store_filtration = true;
static const bool contiguous_vertices = false;
static const bool link_nodes_by_label = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new option needs to be documented in concept/SimplexTreeOptions.h. Maybe we should mention its existence in the doc of star_simplex_range and cofaces_simplex_range.
From its name, it sounds like Simplex_tree_options_full_featured should have this option (otherwise it gets as confusing as intmax_t being shorter than int128_t). On the other hand, a number of current uses of Simplex_tree_options_full_featured don't need the new option. That's something that requires some thought, so if we merge this branch, I want a blocking issue to force us to settle that before the next release.

Copy link
Contributor Author

@VincentRouvreau VincentRouvreau Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some documentation on dd26bff

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9.0 GUDHI version 3.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants