-
Notifications
You must be signed in to change notification settings - Fork 1
(See description!) Pull request for repo [flow] branch [ipc-143_perf-dive-nov] to base-branch [main]. #111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
… in *all* cases relevant to) by that Flow-IPC perf deep-dive. Detailed notes forthcoming in PR.
… so setting the manual-install flag for that compiler/version.
…ign-compare (part deux).
…lated problem with `optional<>` and internally used `boost::thread_specific_ptr`. / (cont) Eliminate unit-test compile warning in clang: self-assignment of `auto` object.
…nonsensical array-bounds warning.
…t, swap; found in code review by echan; for some reason my brain glitched and told me I need not lock a const thing... which is embarrassing).
…uple minor opportunistic things.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
fixes #84
fixes #66
fixes major part of #76
fixes good chunk of #82
flow::util::Thread_local_state<T>,Polled_shared_state<S>for advanced thread-local-state patterns.flow::util::Basic_blob<Allocator>support whenAllocatoris stateful.flow::util::Basic_blobet al can optionally (via new overloads) alloc-and-zero as opposed to just-alloc, bringing additional performance (also convenience).noexceptto certain APIs forflow::util::Basic_blobet al, resolving the stealth-flaw wherein a container thereof (e.g.,vector<Blob>) would invoke blob copying instead of moving. Also slightly decreasesizeof(Basic_blob)et al. flow: Addnoexceptto move ctors (at least); including to avoid copying in vector (at least). #76flow::util::Linked_hash_mapandLinked_hash_sethave new API overloads to support move-semantics at insertion. Also internally increased perf and memory use. Lastly removed APIs[pop_]front(),[pop_]back().flow::utiladded niceties including a couple new or improved arithmetic operations.flow::log::Log_context::set_logger()added (not thread-safe).flow::log::Log_context_mtwhich is thread-safe w/r/tset_logger()et al.flow::utilitems (Basic_blob,Blob_with_log_context,Linked_hash_map,Linked_hash_set, and miscellany includingostream_op_string(),Scoped_setter) and forflow::log::Log_context. flow: Unit tests (more of them) needed. #82nullptrover0,{}over()for constructor invocation. flow: Style: 0 -> nullptr. #66API notes
flow::util:Thread_local_state_registry<T>maintains a thread-local user object of typeT, created vianew Ton-demand anddeleted either when its thread is joined (exits), or the~Thread_local_state_registry()destructor executes -- whichever happens first. You may also, from any thread, useThread_local_state_registry::state_per_thread()to access all extant threads' existing thread-localTs.boost::thread_specific_ptr, which many users took advantage of before built-inthread_localstorage existed in C++.thread_specific_ptr<T>is intentionally simple; it resolutely refuses to provide any registry-of-extant-thread-local-data functionality, including the specific would-be feature wherein~thread_specific_ptr()dtor would clean up other threads' extantTs.Thread_local_state_registrythus works on top ofthread_specific_ptrto provide such advanced functionality. Use that functionality with care. The doc headers for the utility give advice on this.Polled_shared_state<Shared_state>, an optional companion toThread_local_state_registrythat enables a particular high-performance pattern, wherein one may "arm" an atomic flag per extant TLS-owning thread, and in each such thread quickly check this "armed-or-not" flag.Shared_statetemplate parameter can be used to communicate progress/state about whichever cross-thread operation the "armed-or-not" flags are meant to facilitate. One can use an emptystruct {}, if there is no need for such state, and the flag-set is sufficient.Basic_bloband sub-classBlob_with_log_contexttemplates:Basic_blob<Allocator>now supports statefulAllocators (such asboost::interprocess::allocator), whereas before this would not compile.Basic_blobandBlob_with_log_contextoverloads that do zero-initialize memory during buffer allocation:reserve(),resize(), andresize()ing ctor. These overloads take tag-typed valueCLEAR_ON_ALLOC.calloc()), versus allocating (a-lamalloc()) followed by zeroing (a-lamemset(0)).Linked_hash_mapandLinked_hash_setcontainer templates:Linked_hash_map::insert()andoperator[]()overloads with move-semantics for keys and, where applicable, mapped-values.Linked_hash_set::insert()overload with move-semantics for keys.KeyandHash.constexprfunction templateround_to_multiple(a, b)(e.g., ifb = 1024, thena = 0=> 1024,1=> 1024,1024=> 1024,1025=> 2048).ceil_div()is nowconstexprand slightly more flexible.std::span-like ops:Span,DYNAMIC_EXTENT.flow::log:Log_context::set_logger(): ability to (non-thread-safely) change the deriver'sLogger*.Log_context_mt: variant ofLog_contextwith the ability to thread-safely (via mutex) change the deriver'sLogger*. The mutex also protects assignment andswap().get_logger()function, this is intended for use when performance is irrelevant, or else if the perf-critical fast-paths zealously avoid log calls. Keep in mind this applies even to log-call-sites that do not pass the verbosity check, including everyFLOW_LOG_TRACE()call for example. Use with caution.flow::util::Linked_hash_{map|set}::front()andback()as well asLinked_hash_set::pop_{front|back}().Impl notes
flow::util::Linked_hash_mapandLinked_hash_setperf and memory-use upgrade: When inserting a new key, it is no longer stored in two locations internally and thus requires no copying. When using the new move-semantics overloads for insert-ops, no key copyability is required or used.detail/items:Linked_hash_key[_{hash|pred|set}]. The associative container used internally, for both public templates, is anunordered_setinstead ofunordered_map.flow::util::Linked_hash_mapandLinked_hash_setconstructors, whenn_buckets = -1is specified, use a less dubious and faster technique to wrangleunordered_*to use a default bucket count.flow::util::Basic_blobandBlob_with_log_contextsignificant stealth-flaw fixed: The lack ofnoexcepton certain ops caused containers of blobs (e.g.,vector<Blob>) to silently useBasic_blob/Blob_...copying instead of moving when needing to change their memory locations (e.g., whenvector<Blob>must realloc upon exceeding thevector's.capacity()).flow::util::Basic_blobandBlob_with_log_contextuse less space by internally employing the Empty Base Optimization (EBO). flow.util: Basic_blob - Empty base class optimization & members order optimization. #84flow::util::ostream_op_[to_]string()andfeed_args_to_ostream()impls are much shorter (use fold-expressions instead of compile-time recursion).Basic_blobinternal comments contained falsehoods, while others were overly verbose and confusing; they have been corrected or removed.Code review notes
@echan-dev reviewed this exact change elsewhere.