Commit

update the quickstart/faq
segasai committed Sep 8, 2022
1 parent b948909 commit 2236388
Showing 2 changed files with 8 additions and 4 deletions.
5 changes: 5 additions & 0 deletions docs/source/faq.rst
@@ -464,3 +464,8 @@ If those quick fixes don't work, feel free to raise an issue.
However, as multi-threading and multi-processing are notoriously
difficult to debug, especially on a problem I'm not familiar with,
it's likely that I might not be able to help all that much.


**How to decide on the number of processes in a pool and how to set queue_size**

Assuming that you have decided on the number of live points K that you want to use and that the likelihood evaluation is not very quick, you should use as many processes as you can, up to around K. The queue_size should be equal to the number of processes. If the number of processes M is smaller than K, you may want to use :math:`M=K//2` or :math:`M=K//3`, i.e. integer fractions of K. So if you are using 1024 live points, all powers of two up to 1024 would be good choices for the number of processes.
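
As a rough illustration of this rule of thumb (not part of dynesty itself), the sketch below lists candidate pool sizes of the form K//j that fit within a hardware limit. The function name `candidate_pool_sizes` and its parameters `max_procs` and `max_divisor` are hypothetical helpers introduced only for this example.

```python
def candidate_pool_sizes(nlive, max_procs, max_divisor=8):
    """List pool sizes M = nlive // j (j = 1, 2, ...) that do not exceed max_procs.

    nlive       : number of live points K
    max_procs   : largest number of processes available (assumed known)
    max_divisor : largest j to try (hypothetical cut-off for this sketch)
    """
    sizes = []
    for j in range(1, max_divisor + 1):
        m = nlive // j  # integer fraction of the number of live points
        if 1 <= m <= max_procs and m not in sizes:
            sizes.append(m)
    return sizes


if __name__ == "__main__":
    # 1024 live points, at most 600 processes available
    print(candidate_pool_sizes(1024, 600, max_divisor=4))
```

With 1024 live points and at most 600 usable processes this yields 512, 341 and 256. Whichever value is chosen would then also be used as queue_size.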
7 changes: 3 additions & 4 deletions docs/source/quickstart.rst
@@ -412,6 +412,7 @@ argument::
# initialize sampler with pool with pre-defined queue
sampler = NestedSampler(loglike, ptform, ndim, pool=pool, queue_size=8)

There is *no* reason to set queue_size to anything other than the number of parallel processes in the pool.
Parallel operations in `dynesty` are done by simply swapping in the
`pool.map` function over the default `map` function when making likelihood
calls. Note that this is a *synchronous* function call, which requires that
@@ -423,16 +424,14 @@ The reason why "parallel" is written in quotes above is that while function
evaluations can be made in parallel, live point proposals must be done serially
in order to avoid breaking the statistical properties of Nested Sampling.
Assuming we are using :math:`M` processes with :math:`K` live points, this
leads to sub-linear scaling :math:`S` of the form
leads to sub-linear speed improvements :math:`S` of the form
(`Handley et al. 2015 <https://arxiv.org/pdf/1506.00171.pdf>`_):

.. math::

    S(M, K) = K \ln \left(1 + \frac{M}{K}\right)

This scales pretty linearly as long as the number of processes is much smaller
than the number of live points, but falls off as the pool becomes relatively
larger.
This scales pretty linearly with the number of processes until the number of parallel processes becomes equal to or larger than the number of live points.
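
To get a feel for the formula :math:`S(M, K) = K \ln(1 + M/K)`, here is a minimal sketch that evaluates it for a few pool sizes. The function name `expected_speedup` is an assumption made for illustration and is not a dynesty function. The values are the idealized speed-ups from the formula, not measured timings.

```python
import math


def expected_speedup(nprocs, nlive):
    """Idealized speed-up S(M, K) = K * ln(1 + M / K).

    nprocs : number of parallel processes M
    nlive  : number of live points K
    """
    # log1p(x) computes ln(1 + x) accurately for small x
    return nlive * math.log1p(nprocs / nlive)


if __name__ == "__main__":
    nlive = 1024
    for nprocs in (1, 16, 128, 512, 1024, 2048):
        s = expected_speedup(nprocs, nlive)
        # Efficiency is the speed-up per process; it drops as M approaches K
        print(f"M={nprocs:5d}  S={s:8.1f}  efficiency={s / nprocs:.2f}")
```

For :math:`M \ll K` the logarithm is approximately :math:`M/K`, so :math:`S \approx M`. At :math:`M = K` the formula gives :math:`S = K \ln 2`, i.e. roughly 69% of the ideal linear speed-up.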

Depending on where the bottleneck of the computation lies, the provided
`pool` can be disabled during certain function evaluations (e.g., when