Committing TBB 2019 Update 2 source code
tbbdev committed Nov 7, 2018
1 parent 4cebdd9 commit 8ff3697
Showing 31 changed files with 750 additions and 96 deletions.
27 changes: 27 additions & 0 deletions CHANGES
@@ -2,6 +2,30 @@
The list of most significant changes made over time in
Intel(R) Threading Building Blocks (Intel(R) TBB).

Intel TBB 2019 Update 2
TBB_INTERFACE_VERSION == 11002

Changes (w.r.t. Intel TBB 2019 Update 1):

- Added constructors with HashCompare argument to concurrent_hash_map
(https://github.com/01org/tbb/pull/63).
- Added overloads for parallel_reduce with default partitioner and
user-supplied context.
- Added deduction guides for tbb containers: concurrent_vector,
concurrent_queue, concurrent_bounded_queue,
concurrent_priority_queue.
- Reallocation of memory objects >1MB now copies and frees memory if
the size is decreased twice or more, trading performance off for
reduced memory usage.
- After a period of sleep, TBB worker threads now prefer returning to
their last used task arena.

Bugs fixed:

- Fixed compilation of task_group.h when targeting macOS* 10.11 or
earlier (https://github.com/conda-forge/tbb-feedstock/issues/42).

------------------------------------------------------------------------
Intel TBB 2019 Update 1
TBB_INTERFACE_VERSION == 11001

@@ -27,6 +51,9 @@ Bugs fixed:
observer.
- Fixed compilation of task_group.h by Visual C++* 15.7 with
/permissive- option (https://github.com/01org/tbb/issues/53).
- Fixed tbb4py to avoid dependency on Intel(R) C++ Compiler shared
libraries.
- Fixed compilation for Anaconda environment with GCC 7.3 and higher.

------------------------------------------------------------------------
Intel TBB 2019
4 changes: 2 additions & 2 deletions README.md
@@ -1,5 +1,5 @@
-# Threading Building Blocks 2019 Update 1
-[![Stable release](https://img.shields.io/badge/version-2019_U1-green.svg)](https://github.com/01org/tbb/releases/tag/2019_U1)
+# Threading Building Blocks 2019 Update 2
+[![Stable release](https://img.shields.io/badge/version-2019_U2-green.svg)](https://github.com/01org/tbb/releases/tag/2019_U2)
[![Apache License Version 2.0](https://img.shields.io/badge/license-Apache_2.0-green.svg)](LICENSE)

Threading Building Blocks (TBB) lets you easily write parallel C++ programs that take
2 changes: 1 addition & 1 deletion doc/Release_Notes.txt
@@ -92,7 +92,7 @@ Software - Supported Compilers
GNU Compilers (gcc) 4.1 - 7.1
GNU C Library (glibc) version 2.4 - 2.19
Xcode* 7.0 - 9.1
-Android* NDK r10e - r16
+Android* NDK r10e - r17b

Software - Supported Performance Analysis Tools

33 changes: 30 additions & 3 deletions include/tbb/concurrent_hash_map.h
@@ -759,9 +759,19 @@ class concurrent_hash_map : protected internal::hash_map_base {
: internal::hash_map_base(), my_allocator(a)
{}

explicit concurrent_hash_map( const HashCompare& compare, const allocator_type& a = allocator_type() )
: internal::hash_map_base(), my_allocator(a), my_hash_compare(compare)
{}

//! Construct empty table with n preallocated buckets. This number serves also as initial concurrency level.
concurrent_hash_map( size_type n, const allocator_type &a = allocator_type() )
-: my_allocator(a)
+: internal::hash_map_base(), my_allocator(a)
{
reserve( n );
}

concurrent_hash_map( size_type n, const HashCompare& compare, const allocator_type& a = allocator_type() )
: internal::hash_map_base(), my_allocator(a), my_hash_compare(compare)
{
reserve( n );
}
@@ -800,7 +810,16 @@ class concurrent_hash_map : protected internal::hash_map_base {
//! Construction with copying iteration range and given allocator instance
template<typename I>
concurrent_hash_map( I first, I last, const allocator_type &a = allocator_type() )
-: my_allocator(a)
+: internal::hash_map_base(), my_allocator(a)
{
call_clear_on_leave scope_guard(this);
internal_copy(first, last, std::distance(first, last));
scope_guard.dismiss();
}

template<typename I>
concurrent_hash_map( I first, I last, const HashCompare& compare, const allocator_type& a = allocator_type() )
: internal::hash_map_base(), my_allocator(a), my_hash_compare(compare)
{
call_clear_on_leave scope_guard(this);
internal_copy(first, last, std::distance(first, last));
@@ -810,7 +829,15 @@ class concurrent_hash_map : protected internal::hash_map_base {
#if __TBB_INITIALIZER_LISTS_PRESENT
//! Construct empty table with n preallocated buckets. This number serves also as initial concurrency level.
concurrent_hash_map( std::initializer_list<value_type> il, const allocator_type &a = allocator_type() )
-: my_allocator(a)
+: internal::hash_map_base(), my_allocator(a)
{
call_clear_on_leave scope_guard(this);
internal_copy(il.begin(), il.end(), il.size());
scope_guard.dismiss();
}

concurrent_hash_map( std::initializer_list<value_type> il, const HashCompare& compare, const allocator_type& a = allocator_type() )
: internal::hash_map_base(), my_allocator(a), my_hash_compare(compare)
{
call_clear_on_leave scope_guard(this);
internal_copy(il.begin(), il.end(), il.size());
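
The overloads added above take a `HashCompare` instance, which matters when the comparator carries run-time state and a default-constructed copy would not do. The following is a minimal sketch of such a comparator in standard C++. The type `string_hash_compare` and its `ignore_case` flag are hypothetical names used for illustration and are not part of TBB; the commented-out lines at the end only indicate how an instance might be handed to the constructor shown in this diff.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stateful HashCompare (illustration only, not part of TBB).
// It provides hash() and equal(), the two members concurrent_hash_map
// expects from its HashCompare template argument.
struct string_hash_compare {
    bool ignore_case;  // run-time state chosen by the caller

    explicit string_hash_compare(bool ic) : ignore_case(ic) {}

    // Fold a character to lower case only when case is ignored.
    char fold(char ch) const {
        return ignore_case
            ? static_cast<char>(std::tolower(static_cast<unsigned char>(ch)))
            : ch;
    }

    // Simple multiplicative hash over the folded characters.
    std::size_t hash(const std::string& s) const {
        std::size_t h = 0;
        for (char ch : s) h = h * 31 + static_cast<unsigned char>(fold(ch));
        return h;
    }

    // Keys are equal when their folded characters match position by position.
    bool equal(const std::string& a, const std::string& b) const {
        if (a.size() != b.size()) return false;
        for (std::size_t i = 0; i < a.size(); ++i)
            if (fold(a[i]) != fold(b[i])) return false;
        return true;
    }
};

// With TBB available, an instance could be passed at construction, e.g.:
//   tbb::concurrent_hash_map<std::string, int, string_hash_compare>
//       table(string_hash_compare(true));
```

Because this comparator has no default constructor, it could not be used with the constructors that leave `my_hash_compare` default-initialized; the compare-taking overloads are what make a type like this usable.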
10 changes: 9 additions & 1 deletion include/tbb/concurrent_priority_queue.h
@@ -121,7 +121,7 @@ class concurrent_priority_queue {

//! Copy constructor
/** This operation is unsafe if there are pending concurrent operations on the src queue. */
-explicit concurrent_priority_queue(const concurrent_priority_queue& src) : mark(src.mark),
+concurrent_priority_queue(const concurrent_priority_queue& src) : mark(src.mark),
my_size(src.my_size), data(src.data.begin(), src.data.end(), src.data.get_allocator())
{
my_aggregator.initialize_handler(my_functor_t(this));
@@ -481,6 +481,14 @@ class concurrent_priority_queue {
}
};

#if __TBB_CPP17_DEDUCTION_GUIDES_PRESENT
// Deduction guide for the constructor from two iterators
template<typename InputIterator,
typename T = typename std::iterator_traits<InputIterator>::value_type,
typename A = cache_aligned_allocator<T>
> concurrent_priority_queue(InputIterator, InputIterator, const A& = A())
-> concurrent_priority_queue<T, std::less<T>, A>;
#endif /* __TBB_CPP17_DEDUCTION_GUIDES_PRESENT */
} // namespace interface5

using interface5::concurrent_priority_queue;
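
The first hunk of this file drops `explicit` from the copy constructor. One observable effect, sketched below with two hypothetical stand-in types (not TBB classes), is that copy-initialization such as `T b = a;`, passing by value, and returning an lvalue by value are rejected while the copy constructor is `explicit`, even though direct-initialization `T b(a);` works either way. The commit does not state its motivation, so this only illustrates the language rule.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in with an explicit copy constructor (the old declaration style).
struct explicit_copy {
    explicit_copy() = default;
    explicit explicit_copy(const explicit_copy&) = default;
};

// Stand-in with an ordinary copy constructor (the new declaration style).
struct implicit_copy {
    implicit_copy() = default;
    implicit_copy(const implicit_copy&) = default;
};

// Direct-initialization, T b(a), is available for both types.
static_assert(std::is_copy_constructible<explicit_copy>::value,
              "direct-initialization still works with an explicit copy ctor");
static_assert(std::is_copy_constructible<implicit_copy>::value,
              "direct-initialization works with an ordinary copy ctor");

// Copy-initialization, T b = a, only considers non-explicit constructors;
// std::is_convertible models exactly that form of initialization.
template <typename T>
constexpr bool copy_init_allowed() {
    return std::is_convertible<const T&, T>::value;
}
```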
18 changes: 18 additions & 0 deletions include/tbb/concurrent_queue.h
@@ -177,6 +177,15 @@ class concurrent_queue: public internal::concurrent_queue_base_v3<T> {
const_iterator unsafe_end() const {return const_iterator();}
} ;

#if __TBB_CPP17_DEDUCTION_GUIDES_PRESENT
// Deduction guide for the constructor from two iterators
template<typename InputIterator,
typename T = typename std::iterator_traits<InputIterator>::value_type,
typename A = cache_aligned_allocator<T>
> concurrent_queue(InputIterator, InputIterator, const A& = A())
-> concurrent_queue<T, A>;
#endif /* __TBB_CPP17_DEDUCTION_GUIDES_PRESENT */

template<typename T, class A>
concurrent_queue<T,A>::~concurrent_queue() {
clear();
@@ -439,6 +448,15 @@ class concurrent_bounded_queue: public internal::concurrent_queue_base_v8 {

};

#if __TBB_CPP17_DEDUCTION_GUIDES_PRESENT
// guide for concurrent_bounded_queue(InputIterator, InputIterator, ...)
template<typename InputIterator,
typename T = typename std::iterator_traits<InputIterator>::value_type,
typename A = cache_aligned_allocator<T>
> concurrent_bounded_queue(InputIterator, InputIterator, const A& = A())
-> concurrent_bounded_queue<T, A>;
#endif /* __TBB_CPP17_DEDUCTION_GUIDES_PRESENT */

template<typename T, class A>
concurrent_bounded_queue<T,A>::~concurrent_bounded_queue() {
clear();
19 changes: 19 additions & 0 deletions include/tbb/concurrent_vector.h
@@ -1156,6 +1156,25 @@ class concurrent_vector: protected internal::allocator_base<T, A>,
};
};

#if __TBB_CPP17_DEDUCTION_GUIDES_PRESENT
// Deduction guide for the constructor from two iterators
template<typename I,
typename T = typename std::iterator_traits<I>::value_type,
typename A = cache_aligned_allocator<T>
> concurrent_vector(I, I, const A& = A())
-> concurrent_vector<T, A>;

// Deduction guide for the constructor from a vector and allocator
template<typename T, typename A1, typename A2>
concurrent_vector(const concurrent_vector<T, A1> &, const A2 &)
-> concurrent_vector<T, A2>;

// Deduction guide for the constructor from an initializer_list
template<typename T, typename A = cache_aligned_allocator<T>
> concurrent_vector(std::initializer_list<T>, const A& = A())
-> concurrent_vector<T, A>;
#endif /* __TBB_CPP17_DEDUCTION_GUIDES_PRESENT */

#if defined(_MSC_VER) && !defined(__INTEL_COMPILER)
#pragma warning (push)
#pragma warning (disable: 4701) // potentially uninitialized local variable "old"
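
The iterator-pair deduction guides added in this commit all follow one pattern: the element type is taken from `std::iterator_traits` and the allocator falls back to a default. The sketch below reproduces that pattern on a hypothetical `toy_vector` that wraps `std::vector`; it is not a TBB class, and `std::allocator` stands in for `cache_aligned_allocator`. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a TBB container (illustration only).
template <typename T, typename A = std::allocator<T>>
class toy_vector {
public:
    // T cannot be deduced from this constructor alone, because I is an
    // unrelated template parameter; hence the explicit guide below.
    template <typename I>
    toy_vector(I first, I last, const A& a = A()) : data_(first, last, a) {}

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T, A> data_;
};

// Deduction guide in the style of the ones added by this commit:
// element type from the iterator, allocator defaulted.
template <typename I,
          typename T = typename std::iterator_traits<I>::value_type,
          typename A = std::allocator<T>>
toy_vector(I, I, const A& = A()) -> toy_vector<T, A>;

// Builds a toy_vector without spelling out its template arguments.
inline std::size_t build_from_list() {
    std::list<int> src{1, 2, 3};
    toy_vector v(src.begin(), src.end());  // deduced via the guide
    static_assert(std::is_same<decltype(v), toy_vector<int>>::value,
                  "guide should deduce toy_vector<int, std::allocator<int>>");
    return v.size();
}
```

With the guards in `tbb_config.h` satisfied, the same spelling would presumably apply to the real containers, for example `tbb::concurrent_vector v(src.begin(), src.end());`.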
18 changes: 18 additions & 0 deletions include/tbb/parallel_reduce.h
@@ -393,6 +393,13 @@ void parallel_reduce( const Range& range, Body& body, affinity_partitioner& part
}

#if __TBB_TASK_GROUP_CONTEXT
//! Parallel iteration with reduction, default partitioner and user-supplied context.
/** @ingroup algorithms **/
template<typename Range, typename Body>
void parallel_reduce( const Range& range, Body& body, task_group_context& context ) {
internal::start_reduce<Range,Body,const __TBB_DEFAULT_PARTITIONER>::run( range, body, __TBB_DEFAULT_PARTITIONER(), context );
}

//! Parallel iteration with reduction, simple partitioner and user-supplied context.
/** @ingroup algorithms **/
template<typename Range, typename Body>
@@ -480,6 +487,17 @@ Value parallel_reduce( const Range& range, const Value& identity, const RealBody
}

#if __TBB_TASK_GROUP_CONTEXT
//! Parallel iteration with reduction, default partitioner and user-supplied context.
/** @ingroup algorithms **/
template<typename Range, typename Value, typename RealBody, typename Reduction>
Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction,
task_group_context& context ) {
internal::lambda_reduce_body<Range,Value,RealBody,Reduction> body(identity, real_body, reduction);
internal::start_reduce<Range,internal::lambda_reduce_body<Range,Value,RealBody,Reduction>,const __TBB_DEFAULT_PARTITIONER>
::run( range, body, __TBB_DEFAULT_PARTITIONER(), context );
return body.result();
}

//! Parallel iteration with reduction, simple partitioner and user-supplied context.
/** @ingroup algorithms **/
template<typename Range, typename Value, typename RealBody, typename Reduction>
20 changes: 18 additions & 2 deletions include/tbb/tbb_config.h
@@ -56,7 +56,7 @@
#endif

#if __clang__
-/** according to clang documentation, version can be vendor specific **/
+// according to clang documentation, version can be vendor specific
#define __TBB_CLANG_VERSION (__clang_major__ * 10000 + __clang_minor__ * 100 + __clang_patchlevel__)
#endif

@@ -65,6 +65,16 @@
#define __TBB_IOS 1
#endif

#if __APPLE__
#if __INTEL_COMPILER && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ > 1099 \
&& __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ < 101000
// ICC does not correctly set the macro if -mmacosx-min-version is not specified
#define __TBB_MACOS_TARGET_VERSION (100000 + 10*(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ - 1000))
#else
#define __TBB_MACOS_TARGET_VERSION __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__
#endif
#endif

/** Preprocessor symbols to determine HW architecture **/

#if _WIN32||_WIN64
@@ -208,6 +218,7 @@
#define __TBB_ALIGNAS_PRESENT (__INTEL_CXX11_MODE__ && __INTEL_COMPILER >= 1500)
#define __TBB_CPP11_TEMPLATE_ALIASES_PRESENT (__INTEL_CXX11_MODE__ && __INTEL_COMPILER >= 1210)
#define __TBB_CPP14_INTEGER_SEQUENCE_PRESENT (__cplusplus >= 201402L)
#define __TBB_CPP17_DEDUCTION_GUIDES_PRESENT __INTEL_COMPILER > 1900
#elif __clang__
/** TODO: these options need to be rechecked **/
#define __TBB_CPP11_VARIADIC_TEMPLATES_PRESENT __has_feature(__cxx_variadic_templates__)
@@ -237,6 +248,7 @@
#define __TBB_ALIGNAS_PRESENT __has_feature(cxx_alignas)
#define __TBB_CPP11_TEMPLATE_ALIASES_PRESENT __has_feature(cxx_alias_templates)
#define __TBB_CPP14_INTEGER_SEQUENCE_PRESENT (__cplusplus >= 201402L)
#define __TBB_CPP17_DEDUCTION_GUIDES_PRESENT (__has_feature(__cpp_deduction_guides))
#elif __GNUC__
#define __TBB_CPP11_VARIADIC_TEMPLATES_PRESENT __GXX_EXPERIMENTAL_CXX0X__
#define __TBB_CPP11_VARIADIC_FIXED_LENGTH_EXP_PRESENT (__GXX_EXPERIMENTAL_CXX0X__ && __TBB_GCC_VERSION >= 40700)
@@ -262,6 +274,7 @@
#define __TBB_ALIGNAS_PRESENT (__GXX_EXPERIMENTAL_CXX0X__ && __TBB_GCC_VERSION >= 40800)
#define __TBB_CPP11_TEMPLATE_ALIASES_PRESENT (__GXX_EXPERIMENTAL_CXX0X__ && __TBB_GCC_VERSION >= 40700)
#define __TBB_CPP14_INTEGER_SEQUENCE_PRESENT (__cplusplus >= 201402L && __TBB_GCC_VERSION >= 50000)
#define __TBB_CPP17_DEDUCTION_GUIDES_PRESENT (__cpp_deduction_guides >= 201606)
#elif _MSC_VER
// These definitions are also used with Intel C++ Compiler in "default" mode (__INTEL_CXX11_MODE__ == 0);
// see a comment in "__INTEL_COMPILER" section above.
@@ -286,6 +299,7 @@
#define __TBB_ALIGNAS_PRESENT (_MSC_VER >= 1900)
#define __TBB_CPP11_TEMPLATE_ALIASES_PRESENT (_MSC_VER >= 1800)
#define __TBB_CPP14_INTEGER_SEQUENCE_PRESENT (_MSC_VER >= 1900)
#define __TBB_CPP17_DEDUCTION_GUIDES_PRESENT (_MSVC_LANG >= 201703L)
#else
#define __TBB_CPP11_VARIADIC_TEMPLATES_PRESENT 0
#define __TBB_CPP11_RVALUE_REF_PRESENT 0
@@ -306,6 +320,7 @@
#define __TBB_ALIGNAS_PRESENT 0
#define __TBB_CPP11_TEMPLATE_ALIASES_PRESENT 0
#define __TBB_CPP14_INTEGER_SEQUENCE_PRESENT (__cplusplus >= 201402L)
#define __TBB_CPP17_DEDUCTION_GUIDES_PRESENT 0
#endif

// C++11 standard library features
@@ -337,7 +352,8 @@

#define __TBB_CPP11_GET_NEW_HANDLER_PRESENT (_MSC_VER >= 1900 || __TBB_GLIBCXX_VERSION >= 40900 && __GXX_EXPERIMENTAL_CXX0X__ || _LIBCPP_VERSION)

-#define __TBB_CPP17_UNCAUGHT_EXCEPTIONS_PRESENT (_MSC_VER >= 1900 || __GLIBCXX__ && __cpp_lib_uncaught_exceptions || _LIBCPP_VERSION >= 3700)
+#define __TBB_CPP17_UNCAUGHT_EXCEPTIONS_PRESENT (_MSC_VER >= 1900 || __GLIBCXX__ && __cpp_lib_uncaught_exceptions \
+|| _LIBCPP_VERSION >= 3700 && (!__TBB_MACOS_TARGET_VERSION || __TBB_MACOS_TARGET_VERSION >= 101200))

// std::swap is in <utility> only since C++11, though MSVC had it at least since VS2005
#if _MSC_VER>=1400 || _LIBCPP_VERSION || __GXX_EXPERIMENTAL_CXX0X__
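
The arithmetic behind `__TBB_MACOS_TARGET_VERSION` and the new `__TBB_CPP17_UNCAUGHT_EXCEPTIONS_PRESENT` gate can be checked in isolation. The sketch below mirrors the two preprocessor expressions as `constexpr` functions; the function names are hypothetical, and the sample input `1120` is only an assumed value chosen to exercise the formula, since the diff does not say which values ICC actually reports.

```cpp
#include <cassert>

// Mirrors the #if/#else that defines __TBB_MACOS_TARGET_VERSION:
// values strictly between 1099 and 101000 seen under the Intel compiler
// are rescaled, everything else is passed through unchanged.
constexpr long normalize_macos_target(long reported, bool intel_compiler) {
    return (intel_compiler && reported > 1099 && reported < 101000)
        ? 100000 + 10 * (reported - 1000)
        : reported;
}

// Mirrors the macOS part of the new gate: the feature is allowed when no
// macOS target is defined (0) or when the target is at least 101200.
constexpr bool macos_target_allows_uncaught_exceptions(long target) {
    return !target || target >= 101200;
}

// Assumed sample input: 1120 is rescaled to 100000 + 10 * 120 = 101200.
static_assert(normalize_macos_target(1120, true) == 101200, "rescaled");
// Values already in six-digit form are left alone.
static_assert(normalize_macos_target(101100, true) == 101100, "unchanged");
// Without the Intel compiler nothing is rescaled.
static_assert(normalize_macos_target(1120, false) == 1120, "unchanged");
```

Under this reading, a target of `101100` fails the gate and `101200` passes it, which is consistent with the CHANGES entry about targeting macOS 10.11 or earlier.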
2 changes: 1 addition & 1 deletion include/tbb/tbb_stddef.h
@@ -26,7 +26,7 @@
#define TBB_VERSION_MINOR 0

// Engineering-focused interface version
-#define TBB_INTERFACE_VERSION 11001
+#define TBB_INTERFACE_VERSION 11002
#define TBB_INTERFACE_VERSION_MAJOR TBB_INTERFACE_VERSION/1000

// The oldest major interface version still supported
25 changes: 13 additions & 12 deletions src/rml/server/thread_monitor.h
@@ -78,7 +78,7 @@ class thread_monitor {
friend class thread_monitor;
tbb::atomic<size_t> my_epoch;
};
-thread_monitor() : spurious(false), my_sema() {
+thread_monitor() : skipped_wakeup(false), my_sema() {
my_cookie.my_epoch = 0;
ITT_SYNC_CREATE(&my_sema, SyncType_RML, SyncObj_ThreadMonitor);
in_wait = false;
@@ -129,9 +129,9 @@ class thread_monitor {
//! Detach thread
static void detach_thread(handle_type handle);
private:
-cookie my_cookie;
-tbb::atomic<bool> in_wait;
-bool spurious;
+cookie my_cookie; // epoch counter
+tbb::atomic<bool> in_wait;
+bool skipped_wakeup;
tbb::internal::binary_semaphore my_sema;
#if USE_PTHREAD
static void check( int error_code, const char* routine );
@@ -244,24 +244,25 @@ inline void thread_monitor::notify() {
}

inline void thread_monitor::prepare_wait( cookie& c ) {
-if( spurious ) {
-spurious = false;
-// consumes a spurious posted signal. don't wait on my_sema.
-my_sema.P();
+if( skipped_wakeup ) {
+// Lazily consume a signal that was skipped due to cancel_wait
+skipped_wakeup = false;
+my_sema.P(); // does not really wait on the semaphore
}
c = my_cookie;
-in_wait = true;
-__TBB_full_memory_fence();
+in_wait.store<tbb::full_fence>( true );
}

inline void thread_monitor::commit_wait( cookie& c ) {
-bool do_it = ( c.my_epoch == my_cookie.my_epoch);
+bool do_it = ( c.my_epoch == my_cookie.my_epoch );
if( do_it ) my_sema.P();
else cancel_wait();
}

inline void thread_monitor::cancel_wait() {
-spurious = ! in_wait.fetch_and_store( false );
+// if not in_wait, then some thread has sent us a signal;
+// it will be consumed by the next prepare_wait call
+skipped_wakeup = ! in_wait.fetch_and_store( false );
}

} // namespace internal
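
The `prepare_wait` / `commit_wait` / `cancel_wait` sequence above can be walked through with a small standard-library model. This is a sketch under stated assumptions, not the RML implementation: `binary_semaphore_sketch` and `monitor_sketch` are hypothetical names, sequentially consistent `std::atomic` operations stand in for TBB's full fences, and the body of `notify()` is an assumption because it is not shown in this diff.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Stand-in for tbb::internal::binary_semaphore (illustration only).
class binary_semaphore_sketch {
public:
    void P() {  // wait until signaled, then consume the signal
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;
    }
    void V() {  // post a signal
        { std::lock_guard<std::mutex> lock(m_); signaled_ = true; }
        cv_.notify_one();
    }
    bool signaled() { std::lock_guard<std::mutex> lock(m_); return signaled_; }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

class monitor_sketch {
public:
    using cookie = std::size_t;

    // ASSUMPTION: notify() is not part of this diff; this version bumps the
    // epoch and signals only if the owner had announced that it is waiting.
    void notify() {
        epoch_.fetch_add(1);
        if (in_wait_.exchange(false)) sema_.V();
    }
    void prepare_wait(cookie& c) {
        if (skipped_wakeup_) {   // a signal was posted but never consumed
            skipped_wakeup_ = false;
            sema_.P();           // returns at once: the signal is pending
        }
        c = epoch_.load();
        in_wait_.store(true);
    }
    void commit_wait(cookie& c) {
        if (c == epoch_.load()) sema_.P();  // nothing changed: really wait
        else cancel_wait();
    }
    void cancel_wait() {
        // If in_wait was already cleared, a notifier has posted a signal.
        skipped_wakeup_ = !in_wait_.exchange(false);
    }
    bool skipped_wakeup() const { return skipped_wakeup_; }
    bool pending_signal() { return sema_.signaled(); }
private:
    std::atomic<std::size_t> epoch_{0};
    std::atomic<bool> in_wait_{false};
    bool skipped_wakeup_ = false;  // touched only by the owning thread
    binary_semaphore_sketch sema_;
};

// Single-threaded replay of the interleaving "notify arrives between
// prepare_wait and commit_wait"; returns true if the model behaves as the
// comments in the diff describe.
inline bool replay_skipped_wakeup() {
    monitor_sketch m;
    monitor_sketch::cookie c = 0;
    m.prepare_wait(c);  // records epoch 0, sets in_wait
    m.notify();         // epoch becomes 1, in_wait cleared, signal posted
    m.commit_wait(c);   // epoch differs, so cancel_wait() runs
    bool recorded = m.skipped_wakeup() && m.pending_signal();
    m.prepare_wait(c);  // lazily consumes the leftover signal
    return recorded && !m.skipped_wakeup() && !m.pending_signal();
}
```

In this model the rename from `spurious` to `skipped_wakeup` reads naturally: the flag records a real signal whose consumption was skipped, and the next `prepare_wait` drains it so a later `commit_wait` does not return early.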