Skip to content

Epeck thread safety#5402

Merged
lrineau merged 61 commits intoCGAL:masterfrom
mglisse:mt-glisse
Jul 27, 2021
Merged

Epeck thread safety#5402
lrineau merged 61 commits intoCGAL:masterfrom
mglisse:mt-glisse

Conversation

@mglisse
Copy link
Member

@mglisse mglisse commented Jan 27, 2021

Summary of Changes

A preview of a prototype of some changes to make Epeck safer in multi-threaded applications, in case some people want to play with it, look at the code, or discuss the topic. Alternative to #648, it makes more sense as a separate PR than a comment at the end of that PR.
It is surprising that the hardest part is cleaning up. If we were ok with not pruning the Lazy trees after update_exact (or if we had garbage collection?), other strategies would become possible. Anyway, call_once seems rather efficient and once_flag is small, at least on linux (it is bigger on windows), and it avoids reinventing the (spinning) wheel.
I think the worst overhead is for operations on Lazy_exact_nt, but people shouldn't do a lot of computations in that type.
One thing I hope is that the overhead will prove small enough that we don't need a thread-unsafe version for use in multi-threaded applications that do not share objects.
This change means that copying a Handle(_for) is now expensive, so work to reduce copies (for instance for mpq_rational) would also benefit reference-counted types (Gmpq, Lazy).

Some possible future steps:

  • benchmarks (single-thread), also check memory use. Careful that glibc-2.32+ may have very different perf than the rest.
  • write a test (for validity) and a benchmark with parallelism and a significant amount of contention
  • benchmark also on windows (and mac?), things like call_once could have wildly different behavior
  • test on non-x86 hardware (memory ordering on x86 is special, and thread sanitizer only goes so far)
  • discuss configuration: how should this be enabled? (CMake has the imported target Threads::Threads defined in FindThreads, this is already used in CGAL_TBB and CGAL_Eigen3 support files, see also THREADS_PREFER_PTHREAD_FLAG) Should there be a CMake option defined before find_package(CGAL) to enable it? Or a CGAL::Threads component that users need to request?
  • fix the FIXMEs, do the TODOs, etc

Release Management

That's cheaper than the current t=T() which includes several atomic ops
For the cases I changed, it was clear that T was Lazy_*. In Lazy.h,
there are other cases where one would need to check what the valid types
are, and maybe create a reset utility which does t=T() by default and is
overloaded for types that derive from Handle.
The exact choice of memory_order_* is hard, and even for versions that
seem to work on x86_64 TSAN complains like crazy.

TODO: a version that, instead of blocking if another thread is updating,
also does the computation, with suitable protection to ensure that
pruning the tree cannot happen while anyone is computing on it.

TODO: try having AT in Lazy instead of Lazy_rep (i.e. not shared) to
avoid the indirection. We can possibly choose one or the other depending
on the type.
This doesn't seem to provide anything useful with the current
implementations. It probably helped avoid an atomic operation in some
earlier version.
@lrineau
Copy link
Member

lrineau commented Jul 27, 2021

Benchs of this PR seems good. I will merge it.

@lrineau lrineau added the rm only: ready for master For the release team only: that indicates that a PR is about to be merged in 'master' label Jul 27, 2021
@lrineau lrineau merged commit b1ccf3f into CGAL:master Jul 27, 2021
@lrineau lrineau deleted the mt-glisse branch July 27, 2021 14:11
@lrineau lrineau removed the rm only: ready for master For the release team only: that indicates that a PR is about to be merged in 'master' label Jul 27, 2021
aabtop added a commit to aabtop/reify that referenced this pull request Sep 20, 2021
The issue is fixed by CGAL/cgal#5402.

Unfortunately it's not officially released yet, so this is
an internal build.
@Supranaturaler
Copy link

@mglisse Thank you for the update. Is it possible to apply this change to CGAL5.2.3?

@sloriot
Copy link
Member

sloriot commented Oct 29, 2021

Considering how invasive is the patch, I wonder what could prevent you from switching to the master branch (and 5.4 when released).

@lrineau
Copy link
Member

lrineau commented Oct 29, 2021

By the way, I think the change of this PR should be announced in CHANGES.md.

@Supranaturaler
Copy link

Considering how invasive is the patch, I wonder what could prevent you from switching to the master branch (and 5.4 when released).

I have tried 5.3 and find that boolean set operations become slower. I don't tend to upgrade right now. Is there any solution
to use exact_predicates_exact_constructions_kernel in multithread?

@sloriot
Copy link
Member

sloriot commented Oct 29, 2021

Could you share a case that shows the regression? We have been working recently to improve the runtime of boolean set operations so it might show a bug. Thanks.

@mglisse
Copy link
Member Author

mglisse commented Oct 29, 2021

Could you share a case that shows the regression?

Indeed, please open a new issue about this.

By the way, I think the change of this PR should be announced in CHANGES.md.

Uh, I thought we had written something, but apparently not...

@Supranaturaler
Copy link

Could you share a case that shows the regression? We have been working recently to improve the runtime of boolean set operations so it might show a bug. Thanks.

OK. I will test a simple case and make sure this is a problem. Later I will open a new issue about this.

@mglisse
Copy link
Member Author

mglisse commented Oct 29, 2021

CHANGES.md: in the Kernel_23 section:

  • Most operations on Exact_predicates_exact_constructions_kernel objects are now thread-safe if CGAL::Exact_rational is mpq_class (from GMPXX), boost::multiprecision::mpq_rational or CGAL::Quotient<CGAL::MP_Float>. The objects are not atomic though, so the usual restrictions on avoiding race conditions apply. For users who do not use threads, this can be disabled with CGAL_HAS_NO_THREADS.

and in the section "dD Geometry Kernel":

  • Most operations on Epeck_d objects are now thread-safe, see 2D and 3D Linear Geometry Kernel for details.

?

("most" because for instance Aff_transformation_3 still has an unsafe reference count, and I am not sure if there may be other hidden problems)

@lrineau
Copy link
Member

lrineau commented Oct 29, 2021

CHANGES.md: in the Kernel_23 section:

  • Most operations on Exact_predicates_exact_constructions_kernel objects are now thread-safe if CGAL::Exact_rational is mpq_class (from GMPXX), boost::multiprecision::mpq_rational or CGAL::Quotient<CGAL::MP_Float>. The objects are not atomic though, so the usual restrictions on avoiding race conditions apply. For users who do not use threads, this can be disabled with CGAL_HAS_NO_THREADS.

and in the section "dD Geometry Kernel":

  • Most operations on Epeck_d objects are now thread-safe, see 2D and 3D Linear Geometry Kernel for details.

?

("most" because for instance Aff_transformation_3 still has an unsafe reference count, and I am not sure if there may be other hidden problems)

Where do you see that? It seems it was never committed, or never pushed.

@mglisse
Copy link
Member Author

mglisse commented Oct 29, 2021

Where do you see that?

Ah no, sorry, that wasn't clear enough, I was suggesting that we could add that, not saying that it was already there.

@Supranaturaler
Copy link

Supranaturaler commented Oct 29, 2021

Could you share a case that shows the regression? We have been working recently to improve the runtime of boolean set operations so it might show a bug. Thanks.
@sloriot @mglisse

Please refer to #6092 . As I can't upgrade to higher version, I wonder if I could merge some of the code in #5402 to CGAL 5.2.3 to solve the multithread problem? If so, which code should I merge?

@sloriot
Copy link
Member

sloriot commented Oct 29, 2021

I already did that merge somewhere but I'd need some time to get a non-official patched release.

@Supranaturaler
Copy link

I already did that merge somewhere but I'd need some time to get a non-official patched release.

Thank you for your hard work. The patched release of CGAL 5.2.3 is sorely needed for me.

@sloriot
Copy link
Member

sloriot commented Oct 29, 2021

@Supranaturaler You can try https://cgal.geometryfactory.com/~sloriot/CGAL-5.2.2-mt.tgz. I hope I did not mess up while setting it up. Note that it is not official and patched master should be the best choice.

@Supranaturaler
Copy link

Supranaturaler commented Oct 29, 2021

@Supranaturaler You can try https://cgal.geometryfactory.com/~sloriot/CGAL-5.2.2-mt.tgz. I hope I did not mess up while setting it up. Note that it is not official and patched master should be the best choice.

Sorry, the url is unreacheable.
image

@Supranaturaler
Copy link

@sloriot plz offer the right url to me, thanks!

@sloriot
Copy link
Member

sloriot commented Nov 2, 2021

Should be fine now.

@Supranaturaler
Copy link

Should be fine now.

Compile error. You should remove line 217 in Lazy_kernel.h
image

Anyway, it worked in multithread test. Thank you for all this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Concurrently readable lazy-evaluated number type in CGAL Read-safe multithreading with Lazy_exact_nt

9 participants