Skip to content

Conversation

@0xkarmacoma
Copy link
Collaborator

ideally we would solve this after parsing the command line and before entering tests, but unfortunately this approach does not work because solvers can be overridden at the test level. So we could analyze the config when it's created at the different levels (command line, contract, function...) but this seems overly complex, and we may also run contracts and tests concurrently in the future so this may break.

Instead let's use a simple lock on ensure_solver_available, captured in the Java-style synchronized decorator. This time the flow is that ensure_solver_available can only be entered by one thread at a time, therefore:

  • the first time we find a missing solver, we start the download
  • other concurrent queries are waiting
  • once the download is finished and the command is resolved, the lock is released
  • the concurrent queries get the resolved command, without having to trigger the download again

fixes #568

ideally we would solve this after parsing the command line and before entering tests, but unfortunately this approach does not work because solvers can be overridden at the test level. So we could analyze the config when it's created at the different levels (command line, contract, function...) but this seems overly complex, and we may also run contracts and tests concurrently in the future so this may break.

Instead let's use a simple lock on `ensure_solver_available`, captured in the Java-style `synchronized` decorator. This time the flow is that `ensure_solver_available` can only be entered by one thread at a time, therefore:
- the first time we find a missing solver, we start the download
- other concurrent queries are waiting
- once the download is finished and the command is resolved, the lock is released
- the concurrent queries get the resolved command, without having to trigger the download again

fixes #568
@0xkarmacoma 0xkarmacoma requested a review from Copilot July 28, 2025 23:59
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses race conditions in solver availability checking by introducing thread synchronization to prevent multiple concurrent downloads of the same solver binary. The fix uses a Java-style synchronized decorator to ensure only one thread can execute the solver availability check at a time.

  • Adds a synchronized decorator using threading locks to prevent race conditions
  • Applies synchronization to the ensure_solver_available function to prevent concurrent solver downloads

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/halmos/utils.py Adds a synchronized decorator implementation using threading.RLock()
src/halmos/solvers.py Applies the @synchronized() decorator to ensure_solver_available function

@0xkarmacoma 0xkarmacoma requested a review from daejunpark July 29, 2025 00:04
return None


@synchronized()
Copy link
Collaborator

@daejunpark daejunpark Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess the performance impact of introducing a lock here shouldn't be a concern, because find_solver_binary should be quick, so the lock wait time for other processes would be negligible, right? (i also ran performance tests with 9 solver threads, and didn't see any meaningful regression.)

that said, i'm wondering if we could move the synchronized decorator to install_solver() to minimize any potential performance overhead.

while this function is always executed, install_solver is typically not. so if we add a check at the beginning of install_solver (possibly by calling find_solver_binary again) to see whether the solver is already installed, we could safely apply the synchronized decorator there. this way, locking would be avoided in most cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this but it's pretty negligible:

python -m timeit --setup 'import threading; lock = threading.RLock()' 'with lock: pass'         
2000000 loops, best of 5: 199 nsec per loop

200ns to acquire and release the lock, so close to no overhead in most cases. I don't expect it to register when profiling, but we can adjust if it does

@0xkarmacoma 0xkarmacoma merged commit 7516619 into main Jul 29, 2025
138 of 147 checks passed
@0xkarmacoma 0xkarmacoma deleted the fix/concurrent-downloads branch July 29, 2025 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

race condition with solver download

3 participants