feat(coprocessor): add a non-blocking, distributed locking mechanism in tfhe-worker #1550
Merged
antoniupop merged 33 commits into release/0.10.x on Dec 29, 2025
Contributor
values.yaml new parameters are missing
Contributor
Author
See also: #1506 (comment)
Collaborator
Please could you update the charts with these (or any new params added) - I think we've mostly converged on the arch, so it would be good to start planning for deployment.
rudy-6-4
previously approved these changes
Dec 24, 2025
antoniupop
previously approved these changes
Dec 28, 2025
…iple workers
It provides a non-blocking, distributed locking mechanism that coordinates dependence-chain processing across multiple tfhe-workers. A worker can acquire ownership of the next available dependence-chain entry for processing ordered by last_updated_at (FIFO queue-like approach). Ownership expires after a timeout, enabling work-stealing by other workers.
New CLI param --worker_id
…leted computations
…ertext_digest tables
* fix(coprocessor): host-listener, dependency chain
* fix(coprocessor): fix to squash, duplicated trivial encrypt
* fix(coprocessor): fix to squash, duplicated trivial encrypt, test
* fix(coprocessor): fix to squash, scalars are not handles
* fix(coprocessor): fix to squash, cargo fmt
* feat(coprocessor): topologic timestamp
* fix(coprocessor): host-listener, reject cycle and describe out of order dependencies
* fix(coprocessor): do not update dependence chain timestamp on row update
* fix(coprocessor): host-listener, bad condition for need to sort tx
* feat(coprocessor): host-listener, dependency_count for dependency_chain
* feat(coprocessor): host-listener, dependents for dependency_chain
* fix(coprocessor): restrict dependence counter to block scope
* fix(coprocessor): do not update dependence chain last_updated_at on release
* fix(coprocessor): emit warning only when dependence chain is missing dependences
* feat(coprocessor): host-listener, dependency_chain as connected component
* fix(coprocessor): host-listener, update last_updated_at de chain when already processed
* fix(coprocessor): deprecate schedule order in TFHE worker
* chore(coprocessor): fix CI
* fix(coprocessor): host-listener, params for dependency chain policy
* fix(coprocessor): hist-listener, dependency_chain, cycle detection
* chore(coprocessor): update charts for new params
* chore(coprocessor): fix TFHE worker CI test test_extend_or_release_lock_2
---------
Co-authored-by: rudy <rudy.sicard@zama.ai>
antoniupop
added a commit
that referenced
this pull request
Jan 6, 2026
…oprocessor (#1550)
* feat(coprocessor): create dependence_chain table
* feat(coprocessor): coordinate dependence-chain processing across multiple workers
  It provides a non-blocking, distributed locking mechanism that coordinates dependence-chain processing across multiple tfhe-workers. A worker can acquire ownership of the next available dependence-chain entry for processing ordered by last_updated_at (FIFO queue-like approach). Ownership expires after a timeout, enabling work-stealing by other workers.
  New CLI param --worker_id
* test(coprocessor): ensure both acquire_next_lock and work-stealing features
* fix(coprocessor): fix work-stealing when a lock has expired
  - Added LockingReason for logging
  - Make expiry configurable
* fix(coprocessor): update the flow of acquire/extend/release chain_id lock
* chore(coprocessor): update sqlx cache
* chore(coprocessor): improve logging for dcid locking
* chore(coprocessor): disable fallback for dependence_chain_id locking
* fix(coprocessor): update in-memory lock info on extend_current_lock
* fix(coprocessor): lock another dcid and continue processing
* chore(coprocessor): update sqlx cache
* chore(coprocessor): add idx_dependence_chain_processing_by_worker
* chore(coprocessor): observe query timings in tfhe-worker
  - add --dcid_ttl_sec config
  - add otel traces for dcid
* chore(coprocessor): implement both max_lock_ttl_sec and disable_dcid_locking options
* chore(coprocessor): update sqlx cache
* chore(coprocessor): update last_updated_at when releasing a lock
* chore(coprocessor): support --dcid-timeslice-sec CLI param, tfhe-worker
* chore(coprocessor): solve dcid unit-tests issue
* chore(coprocessor): enable lock re-acquisition once the timeslice has been exceeded
* chore(coprocessor): enable default timeslice
* chore(coprocessor): run cleanup procedure to delete old processed dcids
* chore(coprocessor): acquire locks only on DCIDs that are ready for computation
* chore(coprocessor): update charts with new tfhe-worker args
* chore(coprocessor): handle case no-dcid-available
* chore(coprocessor): bump chart version
* chore(coprocessor): notify work available if dependency count reaches zero
* fix(coprocessor): add dependence chain index on last_updated when processed
* fix(coprocessor): update is_completed only where a CT is inserted in DB
* fix(coprocessor): restrict update of computation completion to uncompleted computations
* fix(coprocessor): prevent dependence cycle overestimation on trivial encrypt handles
* fix(coprocessor): update test for completion of processing of dcid
* fix(coprocessor): add missing partial indexes on ciphertexts and ciphertext_digest tables
* feat(coprocessor): add transaction dependence chains in HL (#1651)
* fix(coprocessor): host-listener, dependency chain
* fix(coprocessor): fix to squash, duplicated trivial encrypt
* fix(coprocessor): fix to squash, duplicated trivial encrypt, test
* fix(coprocessor): fix to squash, scalars are not handles
* fix(coprocessor): fix to squash, cargo fmt
* feat(coprocessor): topologic timestamp
* fix(coprocessor): host-listener, reject cycle and describe out of order dependencies
* fix(coprocessor): do not update dependence chain timestamp on row update
* fix(coprocessor): host-listener, bad condition for need to sort tx
* feat(coprocessor): host-listener, dependency_count for dependency_chain
* feat(coprocessor): host-listener, dependents for dependency_chain
* fix(coprocessor): restrict dependence counter to block scope
* fix(coprocessor): do not update dependence chain last_updated_at on release
* fix(coprocessor): emit warning only when dependence chain is missing dependences
* feat(coprocessor): host-listener, dependency_chain as connected component
* fix(coprocessor): host-listener, update last_updated_at de chain when already processed
* fix(coprocessor): deprecate schedule order in TFHE worker
* chore(coprocessor): fix CI
* fix(coprocessor): host-listener, params for dependency chain policy
* fix(coprocessor): hist-listener, dependency_chain, cycle detection
* chore(coprocessor): update charts for new params
* chore(coprocessor): fix TFHE worker CI test test_extend_or_release_lock_2
---------
Co-authored-by: Antoniu Pop <antoniu.pop@zama.ai>
Co-authored-by: Antoniu Pop <90181190+antoniupop@users.noreply.github.com>
Co-authored-by: rudy <rudy.sicard@zama.ai>
antoniupop
added a commit
that referenced
this pull request
Jan 6, 2026
antoniupop
added a commit
that referenced
this pull request
Jan 6, 2026
antoniupop
added a commit
that referenced
this pull request
Jan 7, 2026
antoniupop
added a commit
that referenced
this pull request
Jan 8, 2026
mergify Bot
pushed a commit
that referenced
this pull request
Jan 9, 2026
* feat(coprocessor): schedule computations along dependence chains in coprocessor (#1550)
* feat(coprocessor): create dependence_chain table
* feat(coprocessor): coordinate dependence-chain processing across multiple workers
  It provides a non-blocking, distributed locking mechanism that coordinates dependence-chain processing across multiple tfhe-workers. A worker can acquire ownership of the next available dependence-chain entry for processing ordered by last_updated_at (FIFO queue-like approach). Ownership expires after a timeout, enabling work-stealing by other workers.
  New CLI param --worker_id
* test(coprocessor): ensure both acquire_next_lock and work-stealing features
* fix(coprocessor): fix work-stealing when a lock has expired
  - Added LockingReason for logging
  - Make expiry configurable
* fix(coprocessor): update the flow of acquire/extend/release chain_id lock
* chore(coprocessor): update sqlx cache
* chore(coprocessor): improve logging for dcid locking
* chore(coprocessor): disable fallback for dependence_chain_id locking
* fix(coprocessor): update in-memory lock info on extend_current_lock
* fix(coprocessor): lock another dcid and continue processing
* chore(coprocessor): update sqlx cache
* chore(coprocessor): add idx_dependence_chain_processing_by_worker
* chore(coprocessor): observe query timings in tfhe-worker
  - add --dcid_ttl_sec config
  - add otel traces for dcid
* chore(coprocessor): implement both max_lock_ttl_sec and disable_dcid_locking options
* chore(coprocessor): update sqlx cache
* chore(coprocessor): update last_updated_at when releasing a lock
* chore(coprocessor): support --dcid-timeslice-sec CLI param, tfhe-worker
* chore(coprocessor): solve dcid unit-tests issue
* chore(coprocessor): enable lock re-acquisition once the timeslice has been exceeded
* chore(coprocessor): enable default timeslice
* chore(coprocessor): run cleanup procedure to delete old processed dcids
* chore(coprocessor): acquire locks only on DCIDs that are ready for computation
* chore(coprocessor): update charts with new tfhe-worker args
* chore(coprocessor): handle case no-dcid-available
* chore(coprocessor): bump chart version
* chore(coprocessor): notify work available if dependency count reaches zero
* fix(coprocessor): add dependence chain index on last_updated when processed
* fix(coprocessor): update is_completed only where a CT is inserted in DB
* fix(coprocessor): restrict update of computation completion to uncompleted computations
* fix(coprocessor): prevent dependence cycle overestimation on trivial encrypt handles
* fix(coprocessor): update test for completion of processing of dcid
* fix(coprocessor): add missing partial indexes on ciphertexts and ciphertext_digest tables
* feat(coprocessor): add transaction dependence chains in HL (#1651)
* fix(coprocessor): host-listener, dependency chain
* fix(coprocessor): fix to squash, duplicated trivial encrypt
* fix(coprocessor): fix to squash, duplicated trivial encrypt, test
* fix(coprocessor): fix to squash, scalars are not handles
* fix(coprocessor): fix to squash, cargo fmt
* feat(coprocessor): topologic timestamp
* fix(coprocessor): host-listener, reject cycle and describe out of order dependencies
* fix(coprocessor): do not update dependence chain timestamp on row update
* fix(coprocessor): host-listener, bad condition for need to sort tx
* feat(coprocessor): host-listener, dependency_count for dependency_chain
* feat(coprocessor): host-listener, dependents for dependency_chain
* fix(coprocessor): restrict dependence counter to block scope
* fix(coprocessor): do not update dependence chain last_updated_at on release
* fix(coprocessor): emit warning only when dependence chain is missing dependences
* feat(coprocessor): host-listener, dependency_chain as connected component
* fix(coprocessor): host-listener, update last_updated_at de chain when already processed
* fix(coprocessor): deprecate schedule order in TFHE worker
* chore(coprocessor): fix CI
* fix(coprocessor): host-listener, params for dependency chain policy
* fix(coprocessor): hist-listener, dependency_chain, cycle detection
* chore(coprocessor): update charts for new params
* chore(coprocessor): fix TFHE worker CI test test_extend_or_release_lock_2
---------
Co-authored-by: Antoniu Pop <antoniu.pop@zama.ai>
Co-authored-by: Antoniu Pop <90181190+antoniupop@users.noreply.github.com>
Co-authored-by: rudy <rudy.sicard@zama.ai>
* fix(coprocessor): db migration, improve indexing for sns worker fetching work (#1692)
* fix(coprocessor): db migration, improve indexing for sns worker fetching work
* fix(coprocessor): add missing indexes for selecting allowed handles when tx unsent
---------
Co-authored-by: Antoniu Pop <antoniu.pop@zama.ai>
* feat(coprocessor): add mechanism to release dependence chains when no progress (#1696)
* fix(coprocessor): do not update is_completed on unallowed handles
* feat(coprocessor): add mechanism to release dependence chains when no progress
* fix(coprocessor): remove obsolete row lock on computations
* feat(coprocessor): set created_at as topological order within block
* fix(coprocessor): chain release and update
* chore(coprocessor): update charts
* fix(coprocessor): fix top timestamp for tx
* fix(coprocessor): update earliest schedule order
* fix(coprocessor): remove adding epsilon to timestamp when releasing chain
* fix(coprocessor): split dependence chains after forks instead of before
---------
Co-authored-by: rudy <rudy.sicard@zama.ai>
* fix(coprocessor): add missing indexes on verify_proofs and dependence_chain tables (#1715)
* fix(coprocessor): db-migration, first clean on more obvious unused index (#1722)
* fix(coprocessor): align host listener and poller dependence params (#1728)
---------
Co-authored-by: goshawk-3 <76947196+goshawk-3@users.noreply.github.com>
Co-authored-by: rudy <rudy.sicard@zama.ai>
This PR adds a non-blocking, distributed locking mechanism that coordinates dependence-chain processing across multiple tfhe-worker replicas.
A worker acquires a lock on the next available dependence-chain entry, with entries ordered by last_updated_at (a FIFO, queue-like approach).
A worker is permitted to acquire a DCID (dependence-chain ID) when either:
- dependency_count is 0 and the DCID is not locked, or
- dependency_count is 0 and the DCID is locked but the lock has expired.
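The two conditions above, combined with the last_updated_at ordering, can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the tfhe-worker code, which works against a database table: `ChainEntry`, `Lock`, `can_acquire`, `acquire_next`, and the `expires_at` field are hypothetical names, and timestamps are plain integer seconds.

```rust
/// Hypothetical lock record; `expires_at` is an assumed field name.
#[derive(Debug, Clone, PartialEq)]
pub struct Lock {
    pub worker_id: String,
    pub expires_at: u64, // seconds (assumed unit)
}

/// Hypothetical in-memory stand-in for one dependence-chain row.
#[derive(Debug, Clone, PartialEq)]
pub struct ChainEntry {
    pub dcid: u64,
    pub dependency_count: u32,
    pub last_updated_at: u64, // seconds (assumed unit)
    pub lock: Option<Lock>,
}

/// A DCID may be acquired when it has no pending dependencies and is
/// either unlocked or held under a lock that has already expired.
pub fn can_acquire(entry: &ChainEntry, now: u64) -> bool {
    let lock_free = match &entry.lock {
        None => true,
        Some(lock) => lock.expires_at <= now,
    };
    entry.dependency_count == 0 && lock_free
}

/// Picks the eligible entry with the oldest `last_updated_at` (FIFO-like)
/// and locks it for `worker_id`. Returns immediately with `None` when
/// nothing is eligible, so the caller never waits on another worker.
pub fn acquire_next(
    entries: &mut [ChainEntry],
    worker_id: &str,
    now: u64,
    ttl: u64,
) -> Option<u64> {
    let entry = entries
        .iter_mut()
        .filter(|e| can_acquire(e, now))
        .min_by_key(|e| e.last_updated_at)?;
    entry.lock = Some(Lock {
        worker_id: worker_id.to_string(),
        expires_at: now + ttl,
    });
    Some(entry.dcid)
}
```

In a real deployment the check and the lock update would need to happen atomically in the database; the sketch only shows the selection rule.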
Ownership expires after a timeout, enabling work-stealing by other workers for resilience.
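Expiring ownership behaves like a lease: the owner keeps the lock only while it keeps extending it, and any other worker can take over once the deadline passes. The following is a minimal sketch of that idea, assuming a single in-process `Lease` value; the type and its `try_acquire` and `extend` methods are hypothetical and do not mirror the actual tfhe-worker functions.

```rust
use std::time::{Duration, Instant};

/// Hypothetical lease on a single dependence chain.
#[derive(Debug, Default)]
pub struct Lease {
    /// Current owner and the instant at which its ownership expires.
    owner: Option<(String, Instant)>,
}

impl Lease {
    /// Non-blocking: returns `false` at once if another lease is still
    /// valid; otherwise takes (or steals) ownership for `ttl`.
    pub fn try_acquire(&mut self, worker: &str, now: Instant, ttl: Duration) -> bool {
        let held = matches!(&self.owner, Some((_, expires_at)) if *expires_at > now);
        if held {
            return false;
        }
        self.owner = Some((worker.to_string(), now + ttl));
        true
    }

    /// The current owner pushes the deadline forward while it is still
    /// making progress. Fails if the lease expired or belongs to someone else.
    pub fn extend(&mut self, worker: &str, now: Instant, ttl: Duration) -> bool {
        if let Some((owner, expires_at)) = self.owner.as_mut() {
            if owner.as_str() == worker && *expires_at > now {
                *expires_at = now + ttl;
                return true;
            }
        }
        false
    }
}
```

A worker that crashes simply stops extending, so its chain becomes available again after at most one TTL without any explicit cleanup.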
A GC procedure runs regularly to clean up processed DCIDs.
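As a rough sketch of such a cleanup pass, assuming processed chains are dropped once they are older than some retention window: the `ProcessedChain` struct, the `processed` flag, the `retention_secs` parameter, and `gc_processed` are all hypothetical, since the PR text does not spell out the exact criteria.

```rust
/// Hypothetical in-memory record of a dependence chain.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessedChain {
    pub dcid: u64,
    pub processed: bool,
    pub last_updated_at: u64, // seconds (assumed unit)
}

/// Removes chains that are processed and whose last update is older than
/// `retention_secs`. Unprocessed chains are always kept.
/// Returns the number of removed entries.
pub fn gc_processed(chains: &mut Vec<ProcessedChain>, now: u64, retention_secs: u64) -> usize {
    let before = chains.len();
    chains.retain(|c| {
        let age = now.saturating_sub(c.last_updated_at);
        !(c.processed && age > retention_secs)
    });
    before - chains.len()
}
```

Keeping a retention window, rather than deleting as soon as a chain is processed, is an assumption made here so that recently finished chains stay visible for a while.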