Binary cache: async push_success #908

autoantwort · 2023-02-15T13:52:30Z

This results in ~10-20% faster build times on my machine.

For example, building Boost on my M1 Mac went down from 2.948 min to 2.375 min (about 19% less time).

autoantwort · 2023-02-15T21:19:32Z

How or when should "upload messages" (like Uploaded binaries to {count} {vendor}.) be printed?

Thomas1664 · 2023-02-16T14:24:53Z

Doesn't this have the same problem as #694 that the working thread might exit due to calls to check_exit or value_or_exit?

autoantwort · 2023-02-16T14:27:43Z

> Doesn't this have the same problem as #694 that the working thread might exit due to calls to check_exit or value_or_exit?

Kind of. In general we need an option to decide if a binary cache failure should be a hard error or only a warning

Thomas1664 · 2023-02-16T14:44:30Z

> Kind of. In general we need an option to decide if a binary cache failure should be a hard error or only a warning

The problem is that we can almost never be sure that there isn't some nested API call that exits on failure. But it seems like #909 at least partially addresses this issue.

autoantwort · 2023-02-16T14:46:12Z

Yeah, but there are nearly no hard exits in the binary cache. It currently also only prints warnings.

ras0219-msft

I like this direction; unblocking I/O work has great potential for making vcpkg much faster.

However we need to be very careful about the impacts of concurrency -- deadlocks suck :(

ras0219-msft · 2023-03-01T01:29:11Z

src/vcpkg.cpp

@@ -156,6 +157,7 @@ namespace vcpkg::Checks
    // Implements link seam from basic_checks.h
    void on_final_cleanup_and_exit()
    {
+        BinaryCache::wait_for_async_complete();


I do not think we can do this here. This is on the critical path for ctrl-c handling and should only be used for extremely fast, emergency tear-down behavior (like restoring the console).

If there happens to be an exit anywhere in any BinaryCache implementation, this would deadlock. Importantly, this includes any sort of assertion we might want to do, like checking pointers for null.

Unfortunately, the only path forward I see is to call this (or appropriately scope the BinaryCache itself) at the relevant callers. The consequence of possibly not uploading some set of binary caches in the case of some unhandled program error (such as a permissions issue on a directory expected to be writable) is vastly preferable to deadlocks.

autoantwort · 2023-03-02

I have changed the BinaryCache::wait_for_async_complete() implementation so it does not deadlock anymore.

I also moved the call to Checks::exit_with_code, which is not called when ctrl+c is handled. (I personally would like to have a way to terminate vcpkg but wait until the binary cache is done so that I don't lose progress.)

And I prefer it when built packages are uploaded to the binary caches before vcpkg exits because of an error. Otherwise I have to rebuild the already built packages at a later point when there is no cache entry.

ras0219-msft · 2023-03-03

Agreed that it is desirable to finish uploading on "understood" errors, for example if a package failed to build or failed to be installed.

I was also wrong in my original assessment of a deadlock. My concern was the call path of the binary upload thread calling Checks::unreachable() or .value_or_exit(), but it seems that std::thread::join() does have a carve-out to handle this specific case: it will throw a std::system_error with the error condition resource_deadlock_would_occur if you try to join yourself.
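To make the carve-out concrete, here is a minimal sketch (not code from this PR) of a thread attempting to join itself. The function name `self_join_is_detected` and the promise-based handshake are illustrative assumptions. The handshake only ensures that the thread object is never touched by two threads at once.

```cpp
#include <future>
#include <system_error>
#include <thread>

// Hypothetical helper: reports whether a self-join was rejected with
// resource_deadlock_would_occur instead of hanging.
bool self_join_is_detected()
{
    std::thread worker;
    std::promise<void> handle_ready; // fulfilled once `worker` has been assigned
    std::future<void> handle_ready_future = handle_ready.get_future();
    std::promise<bool> outcome; // carries the result back to the caller
    std::future<bool> outcome_future = outcome.get_future();

    worker = std::thread([&] {
        handle_ready_future.wait(); // do not read `worker` before it is assigned
        try
        {
            worker.join(); // the thread tries to join itself
            outcome.set_value(false);
        }
        catch (const std::system_error& e)
        {
            outcome.set_value(e.code() == std::errc::resource_deadlock_would_occur);
        }
    });
    handle_ready.set_value();

    // Wait for the worker's attempt to finish before this thread touches
    // `worker` again; two concurrent join() calls would be a data race.
    const bool detected = outcome_future.get();
    worker.join();
    return detected;
}
```

The caller only joins after the worker has reported its result, which is the same ordering concern the rest of this thread is about.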

I've put some other concerns below, but I don't want those to distract from my main point: We must make it as trivial / correct-by-construction as possible to guarantee that the binary cache thread NEVER attempts to wait on itself. I think the best approach for vcpkg right now is to add calls from Install::perform() etc to BinaryCache::wait_for_async_complete() before any "user-facing" error, such as the exit guarded by result.code != BuildResult::SUCCEEDED && keep_going == KeepGoing::NO. This is motivated by the perspective that it's always safer to terminate than to join and possibly deadlock / race condition / etc.
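As a rough illustration of the "explicit wait at the caller" shape described above, the following sketch shows a background upload queue that the foreground thread drains before exiting. `AsyncPusher` and all of its members are hypothetical names for this example; this is not vcpkg's BinaryCache implementation.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a cache that uploads in the background.
class AsyncPusher
{
public:
    AsyncPusher() : m_thread([this] { run(); }) { }
    ~AsyncPusher() { wait_for_async_complete(); }
    AsyncPusher(const AsyncPusher&) = delete;
    AsyncPusher& operator=(const AsyncPusher&) = delete;

    // Queue an upload; returns immediately so the next build can start.
    void push(std::function<void()> upload)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(upload));
        }
        m_cv.notify_one();
    }

    // Must only be called from the owning (foreground) thread, never from
    // inside an upload job. Drains the queue, then joins the worker.
    void wait_for_async_complete()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_done = true;
        }
        m_cv.notify_one();
        if (m_thread.joinable()) m_thread.join();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        for (;;)
        {
            m_cv.wait(lock, [this] { return m_done || !m_queue.empty(); });
            if (m_queue.empty()) return; // shutdown requested and nothing left
            std::function<void()> job = std::move(m_queue.front());
            m_queue.pop();
            lock.unlock(); // run the upload without holding the lock
            job();
            lock.lock();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_queue;
    bool m_done = false;
    std::thread m_thread; // declared last so the other members exist first
};
```

In this sketch the foreground code would call `wait_for_async_complete()` right before a user-facing exit, such as a failed build without keep-going, and would skip it on emergency paths like ctrl+c handling.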

There's still a UB data race if the main thread and binary upload thread attempt to exit at the same time:

Concurrently calling join() on the same thread object from multiple threads constitutes a data race that results in undefined behavior.
-- https://en.cppreference.com/w/cpp/thread/thread/join

There's also a serious "scalability" problem if we ever want a second background thread for whatever reason, because BGThread A would join on BGThread B, while BGThread B tries to join on BGThread A. This might be solvable with ever more complex structures, such as a thread ownership DAG that gets threads to join only on their direct children, but I don't think the benefit is worth the cost.

autoantwort · 2023-03-04

The UB and the joining itself could simply be prevented by checking if (std::this_thread::get_id() == instance->push_thread.get_id()) before joining. My concern with the explicit approach is that it is easy to forget to call the BinaryCache's waiting function: every time you want to exit, you have to remember to call it. This seems to me to be very prone to human error.

autoantwort · 2023-03-05

I have now implemented your request

autoantwort · 2023-03-18

@ras0219-msft Is there anything left that is preventing this PR from being merged?

@JavierMatosD

…utput (#1565)

Extensive overhaul of our downloads handling and console output; @JavierMatosD and I have gone back and forth several times and yet kept introducing unintended bugs in other places, which led me to believe targeted fixes would no longer cut it.

Fixes many longstanding bugs and hopefully makes our console output for this more understandable:

* We no longer print 'error' when an asset cache misses but the authoritative download succeeds. This partially undoes #1541. It is good to print errors immediately when they happen, but if a subsequent authoritative download succeeds we need to not print those errors.
* We now always and consistently print output from x-scripts at the time that actually happens. Resolves https://devdiv.visualstudio.com/DevDiv/_workitems/edit/2300063
* We don't tell the user that proxy settings might fix a hash mismatch problem.
* We do tell the user that proxy settings might fix a download from asset cache problem.
* We now always tell the user the full command line we tried when invoking an x-script that fails.
* We don't crash if an x-script doesn't create the file we expect, or creates a file with the wrong hash.
* We now always print what we are doing *before* touching the network, so if we hang the user knows which server is being problematic. Note that this includes storing back to asset caches which we were previously entirely silent about except in case of failure.

Other changes:

* Removed debug output about asset cache configuration. The output was misleading / wrong depending on readwrite settings, and echoing to the user exactly what they said before we've interpreted it is not useful debug output. (Contrast with other `VcpkgPaths` debug output which tend to be paths we have likely changed from something a user said)

Other notes:

* This makes all dependencies of #908 speak `DiagnosticContext` so it will be easy to audit that the foreground/background thread behavior is correct after this.
* I did test the curl status parsing on old Ubuntu again.

Special thanks to @JavierMatosD for his help in review of the first console output attempts and for blowing the dust off this area in the first place.

BillyONeal · 2025-02-03T18:35:35Z

@autoantwort I pushed some changes here, can you let me know if you are happy with them? Thanks!

BillyONeal · 2025-02-03T18:38:30Z

src/vcpkg/binarycaching.cpp

-    static ExpectedL<BinaryProviders> make_binary_providers(const VcpkgCmdArguments& args, const VcpkgPaths& paths)
+    void ReadOnlyBinaryCache::fetch(View<InstallPlanAction> actions)
+    {
+        std::vector<const InstallPlanAction*> action_ptrs;


This block is just moved up from 2325 as these things became members of ReadOnlyBinaryCache or BinaryCache rather than being local to this file now.

BillyONeal · 2025-02-03T18:38:57Z

src/vcpkg/binarycaching.cpp

+        });
+    }
+
+    void BinaryCacheSynchronizer::add_submitted() noexcept


This starts meaningfully new code.

autoantwort

LGTM

autoantwort · 2025-02-05T11:09:27Z

include/vcpkg/binarycaching.h

+        using backing_uint_t = std::conditional_t<sizeof(size_t) == 4, uint32_t, uint64_t>;
+        using counter_uint_t = std::conditional_t<sizeof(size_t) == 4, uint16_t, uint32_t>;


Why does this depend on size_t?

BillyONeal · 2025-02-05

There are a lot of 64-bit machines without 32-bit atomics, and a lot of 32-bit machines without 64-bit atomics, and I wanted to choose something least likely to put us into the world of lock-based atomics.
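For readers wondering how two counters can share one atomic of the pointer-sized type, here is a minimal sketch. The two aliases are copied from the diff above; the class `PackedCounters`, its members, and the bit layout are assumptions made for illustration and are not the PR's BinaryCacheSynchronizer.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <type_traits>

using backing_uint_t = std::conditional_t<sizeof(std::size_t) == 4, std::uint32_t, std::uint64_t>;
using counter_uint_t = std::conditional_t<sizeof(std::size_t) == 4, std::uint16_t, std::uint32_t>;

// Hypothetical illustration: "submitted" lives in the high half and
// "completed" in the low half of a single atomic word.
class PackedCounters
{
public:
    struct Snapshot
    {
        counter_uint_t submitted;
        counter_uint_t completed;
    };

    void add_submitted() noexcept { m_state.fetch_add(one_submitted, std::memory_order_acq_rel); }

    // Note: this sketch does not guard against `completed` overflowing
    // into the `submitted` half.
    void add_completed() noexcept { m_state.fetch_add(1, std::memory_order_acq_rel); }

    // Both counters are read in one atomic load, so they are always
    // consistent with each other.
    Snapshot load() const noexcept
    {
        const backing_uint_t raw = m_state.load(std::memory_order_acquire);
        return {static_cast<counter_uint_t>(raw >> shift), static_cast<counter_uint_t>(raw & mask)};
    }

private:
    static constexpr unsigned shift = sizeof(counter_uint_t) * 8;
    static constexpr backing_uint_t mask = (backing_uint_t{1} << shift) - 1;
    static constexpr backing_uint_t one_submitted = backing_uint_t{1} << shift;

    std::atomic<backing_uint_t> m_state{0};
};
```

Keeping both counts in one word means a reader never sees a completed count from one moment paired with a submitted count from another. Sizing the word like `size_t` is a heuristic for staying on lock-free atomics, as the reply above explains.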

BillyONeal · 2025-02-05T22:21:30Z

Thanks for the contribution!

BillyONeal · 2025-02-14T02:23:25Z

Neumann-A · 2025-02-14T10:43:14Z

Why is this so inconsistent? I would have expected less variance in the result. Especially for stuff taking 1d and longer.

dg0yt · 2025-02-14T12:30:59Z

Does the artifact size with static linkage explain most of the inconsistency? In particular when ports install executables.

Neumann-A · 2025-02-14T12:42:15Z

> Does the artifact size with static linkage explain most of the inconsistency? In particular when ports install executables.

Hmm, maybe. The android triplets are more or less consistent, and the -static and -static-md are also more or less consistent. @BillyONeal do you have storage data for the different triplets?

BillyONeal · 2025-02-14T16:25:50Z

My supposition as well is that the difference is mostly a function of how big the binary cache is. I don't have those stats, though. For instance, the triplets with an LLVM have more improvement. The improvement for macOS seems bigger, which might be explained by not being in the same data center as the caches.

Binary cache: async push_success

95f0438

autoantwort marked this pull request as draft February 15, 2023 20:04

Thomas1664 reviewed Feb 16, 2023

src/vcpkg/binarycaching.cpp (outdated)

autoantwort mentioned this pull request Feb 16, 2023

Unified object provider backend #911

Draft

Merge branch 'main' into feature/async-binary-cache-push-success

9d999d8

# Conflicts: # src/vcpkg.cpp

ras0219-msft requested changes Mar 1, 2023

autoantwort and others added 14 commits March 2, 2023 21:34

Merge branch 'main' into feature/async-binary-cache-push-success

163d9cd

Apply suggestions from code review

2a54205

Co-authored-by: Robert Schumacher <[email protected]>

Adapt code review

0912655

Update src/vcpkg/binarycaching.cpp

5d7288c

Co-authored-by: Robert Schumacher <[email protected]>

Adapt code review

10189ac

Remove unnecessary actions_to_push_notifier.notify_all()

2567607

Prevent deadlock and don't be on the crtl+c path

ecdd000

Add and use BGMessageSink to print IBinaryProvider::push_success mess…

8e7ae61

…ages between package installs Co-authored-by: Robert Schumacher <[email protected]>

Restore old upload message

850d7c9

Don't join yourself

548be38

Print messages about remaining packages to upload

6dbbf06

Localization

74b86fd

Improve messages

5171d3e

No singleton and explicit calls to wait_for_async_complete()

d69ed8f

See microsoft#908 (comment)

autoantwort marked this pull request as ready for review March 5, 2023 20:28

autoantwort added 4 commits March 8, 2023 19:22

Merge branch 'main' into feature/async-binary-cache-push-success

2df42d5

# Conflicts: # src/vcpkg/build.cpp

Merge branch 'main' into feature/async-binary-cache-push-success

5f1786e

# Conflicts: # src/vcpkg/base/messages.cpp

Merge branch 'main' into feature/async-binary-cache-push-success

93303c3

Merge branch 'main' into feature/async-binary-cache-push-success

8a26c8b

# Conflicts: # include/vcpkg/base/messages.h # src/vcpkg/base/messages.cpp

Merge branch 'main' into feature/async-binary-cache-push-success

061e6e8

BillyONeal added 6 commits January 14, 2025 17:56

Merge remote-tracking branch 'origin/main' into feature/async-binary-…

050c51f

…cache-push-success # Conflicts: # include/vcpkg/base/fwd/message_sinks.h # include/vcpkg/base/message_sinks.h # src/vcpkg/base/message_sinks.cpp

Merge remote-tracking branch 'origin/main' into feature/async-binary-…

8182732

…cache-push-success # Conflicts: # src/vcpkg/commands.install.cpp # src/vcpkg/commands.set-installed.cpp

Change find_last test to something that find (forward) won't pass.

139c7da

Collapse the background work queue system to handle completion of the…

4f410f1

… background thread.

Change BinaryCache and ZipTool's interface to avoid needing to copy o…

73b693a

…r move.

Merge remote-tracking branch 'BillyONeal/contextize-ziptool' into fea…

54fe17f

…ture/async-binary-cache-push-success # Conflicts: # include/vcpkg/binarycaching.h # src/vcpkg/binarycaching.cpp

BillyONeal reviewed Jan 31, 2025

src/vcpkg/binarycaching.cpp (outdated)

BillyONeal added 4 commits January 31, 2025 14:54

Fixed upload status being printed to the terminal without synchroniza…

27780f9

…tion from the background thread.

Use any_of, put a member FileSystem& back into BinaryCache, make sure…

8e15cf4

… the work queue is drained before returning that no work is left.

* Combine submitted/completed counts into one atomic.

21de05a

* Restore autoantwort's only printing counts when done. * Note which specs we are submitting in messages from the background.

Change submission count message slightly and avoid printing from BG t…

8f372b2

…hread

BillyONeal approved these changes Feb 3, 2025

BillyONeal marked this pull request as ready for review February 3, 2025 18:34

BillyONeal reviewed Feb 3, 2025

Make the message count pettier.

8ecfdaa

autoantwort commented Feb 5, 2025

BillyONeal merged commit a6289e8 into microsoft:main Feb 5, 2025
6 checks passed

autoantwort deleted the feature/async-binary-cache-push-success branch February 15, 2025 15:14

BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Mar 6, 2025

Fix --clean-after-build + async push_success

a1ca94a

Probably stealth merge conflict in microsoft#908 and/or probably my fault.

BillyONeal mentioned this pull request Mar 6, 2025

Fix --clean-after-build + async push_success #1605

Closed
