HB-relationship involving thread creations while mutexes are held#1913

Open
dabund24 wants to merge 58 commits into
goblint:master from
dabund24:descendant-locksets

Member

@dabund24 dabund24 commented Jan 17, 2026

Second part of #1805. The first half was implemented in #1865.
Closes #1805.

Summary

Simplest case: after $t_0$ creates $t_1$ while holding mutex $l$, the statements that follow in $t_0$, up to the first possible unlock of $l$, must happen before everything that follows a definite lock of $l$ in $t_1$.

Generalizations:

  • $t_1$ can be any descendant of $t_0$ as long as $t_0$ is a must-ancestor.
  • It doesn't matter whether the locking happens in $t_1$ or in a must-ancestor, as long as that thread is also a must-ancestor of the thread created in $t_0$.

Examples

In the following examples, A must happen before B.

Simple example

graph TB;
subgraph t1;
    E["lock(l);"]-->F;
    F["unlock(l);"]-->G;
    G((B))
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C;
    C((A))-->D;
    D["unlock(l);"];
end;
B-.->E

B in a descendant of $t_1$

graph TB;
subgraph t2;
    H((B))
end;
subgraph t1;
    E["lock(l);"]-->F;
    F["create(t2);"]
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C;
    C((A))-->D;
    D["unlock(l);"];
end;
B-.->E
F-.->H

A in a descendant of $t_0$

graph TB;
subgraph t1;
    E["lock(l);"]-->I;
    I["unlock(l);"]-->F;
    F((B));
end;
subgraph t2;
    H((A))
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C;
    C["create(t2);"]-->D;
    D["join(t2);"]-->G;
    G["unlock(l);"]
end;
B-.->E
C-.->H
H-.->D

Here, it is important that no unlock happens in $t_0$ before $t_2$ is joined into $t_0$, which was computed in #1865.

Dependency Analyses

  • $t_{\mathrm{ego}}$: Ego Thread Id at program point
  • $\mathcal L$: Must-Lockset at program point
  • $\mathcal C$: May-Creates of ego thread before program point
  • $\mathcal J$: Transitive Must-Joins of ego thread before program point
  • $\mathcal{DES}\ t$: Descendant threads of $t$ (implemented in this PR)
  • $\mathcal{ANC}\ t$: Must-ancestors of $t$

From these analyses, we compute:

  • Given a statement create(t) all threads transitively created, for which $t_{\mathrm{ego}}$ is a must-ancestor:
    $$c^* \ t:= \set{t _ d\mid t _ d\in \set{t} \cup \mathcal{DES}\ t, t_{\mathrm{ego}}\in\mathcal{ANC}\ t_d}$$
  • All possibly running descendants, for which $t_{\mathrm{ego}}$ is a must-ancestor:
    $$\mathcal{R}:= \set{t_ r\mid t_ r\in \left(\mathcal{C}\cup \bigcup{\set{\mathcal{DES}\ c\mid c\in\mathcal{C}}}\right)\setminus \mathcal J, t_{\mathrm{ego}}\in\mathcal{ANC}\ t_r}$$
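
The two derived sets can be sketched with plain OCaml sets. This is only an illustration, not the code in this PR: thread ids are modelled as strings, and `des`, `anc`, `c_star` and `running` are hypothetical names standing in for $\mathcal{DES}$, $\mathcal{ANC}$, $c^*$ and $\mathcal{R}$.

```ocaml
module TS = Set.Make (String)

(* Thread ids are plain strings here; [des] and [anc] are hypothetical
   stand-ins for the DES and ANC analyses. *)

(* c* t: t and its descendants, restricted to threads that have
   [ego] as a must-ancestor. *)
let c_star ~des ~anc ~ego t =
  TS.filter (fun td -> TS.mem ego (anc td)) (TS.add t (des t))

(* R: may-created threads and their descendants, minus must-joined
   threads, again restricted to threads with [ego] as must-ancestor. *)
let running ~des ~anc ~ego ~creates ~joins =
  let candidates =
    TS.fold (fun c acc -> TS.union (des c) acc) creates creates
  in
  TS.filter (fun tr -> TS.mem ego (anc tr)) (TS.diff candidates joins)

(* Example hierarchy: t0 creates t1, t1 creates t2. *)
let des = function
  | "t0" -> TS.of_list [ "t1"; "t2" ]
  | "t1" -> TS.singleton "t2"
  | _ -> TS.empty

let anc = function
  | "t1" -> TS.singleton "t0"
  | "t2" -> TS.of_list [ "t0"; "t1" ]
  | _ -> TS.empty
```

In this example hierarchy, `c_star ~des ~anc ~ego:"t0" "t1"` contains both `t1` and `t2`, and once `t2` has been must-joined, `running` with `~creates:{t1}` only contains `t1`.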

Analyses

Descendant Locksets $\mathcal{DL}$

  • flow-sensitive
  • Domain: $T\to 2^L$
  • $T\to 2^L$ is MapBot
  • $2^L$ is Must-Set
  • $\mathcal{DL}=\set{t_1\mapsto \set{l}}$ means:
      • There must have existed at least one create($t_c$) statement in $t_{\mathrm{ego}}$ with $t_1\in c^*\ t_c$.
      • For all of those create($t_c$) statements, $l\in\mathcal{L}$.
      • We must not have encountered an unlock($l$) statement after having detected the thread creation.

Transfer functions

  • $\mathsf{init}^\sharp = \emptyset$
  • $\mathsf{new}^\sharp\ X = \emptyset$
  • $[[\mathsf{create}(t)]]^\sharp\ X=X\sqcup\set{t_d\mapsto \mathcal{L}\mid t_d\in c^*\ t}$
  • $[[\mathsf{unlock}(l)]]^\sharp\ X=\set{t\mapsto L\setminus\set{l}\mid t\mapsto L\in X}$
  • $[[\mathsf{unlock}(?)]]^\sharp\ X=\set{t\mapsto \emptyset\mid t\mapsto L\in X}$
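
A minimal sketch of these transfer functions, assuming MapBot is modelled by a map in which an absent thread is $\bot$ and that the must-set join is intersection. The names `join`, `create`, `unlock`, `unlock_unknown` and the helper `c_star` are illustrative and not taken from the PR.

```ocaml
module LS = Set.Make (String) (* mutexes *)
module TM = Map.Make (String) (* thread ids *)

(* MapBot is modelled by absence: a thread without a binding is bottom. *)
type dl = LS.t TM.t

(* init and new both start from the empty map. *)
let init : dl = TM.empty

(* Join: pointwise; must-sets join by intersection, bottom is neutral. *)
let join (x : dl) (y : dl) : dl =
  TM.union (fun _ l1 l2 -> Some (LS.inter l1 l2)) x y

(* create(t): every thread in c* t is joined with the current must-lockset.
   [c_star] is a hypothetical helper returning c* t as a list. *)
let create ~c_star ~lockset t (x : dl) : dl =
  let contribution =
    List.fold_left (fun m td -> TM.add td lockset m) TM.empty (c_star t)
  in
  join x contribution

(* unlock(l): l no longer protects any descendant. *)
let unlock l (x : dl) : dl = TM.map (LS.remove l) x

(* unlock(?): an unknown mutex is unlocked, so nothing is protected. *)
let unlock_unknown (x : dl) : dl = TM.map (fun _ -> LS.empty) x
```

Creating the same thread twice under different locksets keeps only the mutexes held at both creations, which matches the "for all of those create statements" reading of $\mathcal{DL}$.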

Mustlock History $\mathcal{LH}$

  • flow-sensitive
  • Domain: $L\to 2^T$
  • $L\to 2^T$ is MapTop
  • $2^T$ is Must-Set
  • $\mathcal{LH}=\set{l\mapsto \set{t}}$ means "before the next operation, mutex $l$ must have been locked in $t$"

Transfer functions

  • $\mathsf{init}^\sharp=\emptyset$
  • $\mathsf{new}^\sharp\ X=X$
  • $[[\mathsf{lock}(l)]]^\sharp\ X=X\oplus\set{l\mapsto (X\ l)\cup \set{t_{\mathrm{ego}}}}$
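
These transfer functions can be sketched as follows. This assumes that an absent mutex stands for the top element of the map, which for a must-set of threads is the empty set; `locked_in`, `new_thread` and `lock` are hypothetical names.

```ocaml
module TS = Set.Make (String) (* thread ids *)
module LM = Map.Make (String) (* mutexes *)

(* Assumption: an absent mutex is top, which for a must-set of threads
   is the empty set (no thread is known to have locked it). *)
let locked_in l x = Option.value (LM.find_opt l x) ~default:TS.empty

let init : TS.t LM.t = LM.empty

(* new: the created thread inherits the history of its creator. *)
let new_thread x = x

(* lock(l): record that l has been locked in the ego thread. *)
let lock ~ego l x = LM.add l (TS.add ego (locked_in l x)) x
```

For instance, locking `l` in `t0`, creating `t1`, and locking `l` again in `t1` yields a history in which `l` maps to both `t0` and `t1`.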

Global descendant lockset $\mathcal{DL}_g\ t$

  • flow-insensitive with $V=T$
  • Domain: $T\to T\to 2^L$
  • $T\to T\to 2^L$ and $T\to 2^L$ are MapBot
  • $\mathcal{DL}_g\ t=\set{t_{\mathrm{anc}}\mapsto DL}$ means "throughout the entire execution of $t$, the descendant lockset $DL$ is valid in $t_{\mathrm{anc}}$".

Contributions

We only contribute at create($t_c$) statements for all $t_d\in c^*\ t_c$, where $\mathcal{CL}$ denotes the creation lockset:
$$DL_{t_d}:=\set{t\mapsto (\mathcal{DL}\ t)\cap (\mathcal{CL}\ t_d\ t_{\mathrm{ego}})\mid t\in T}$$
$$\mathcal{DL}_ g\ t_ d\sqsupseteq \set{t_ {\mathrm{ego}} \mapsto DL_ {t_d}}$$
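
As a rough sketch of the first formula, the per-descendant value $DL_{t_d}$ can be computed by intersecting each entry of the local descendant lockset with the creation lockset. Here `cl` is a hypothetical stand-in for the creation lockset query and is assumed to return a plain set, and threads that are $\bot$ in $\mathcal{DL}$ are modelled as absent keys.

```ocaml
module LS = Set.Make (String) (* mutexes *)
module TM = Map.Make (String) (* thread ids *)

(* DL_{t_d}: keep only mutexes that are also in CL t_d t_ego.
   Threads without a binding (bottom) stay unbound.
   [cl] is a hypothetical stand-in for the creation lockset query and is
   assumed to return a plain set here. *)
let contribution ~cl ~ego td dl =
  TM.map (fun ls -> LS.inter ls (cl td ego)) dl
```

The result would then be contributed to $\mathcal{DL}_g\ t_d$ under the key $t_{\mathrm{ego}}$.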

Happened-Before rules

Statement s2 with $\mathcal{LH}_ 2, t_2$ must happen after s1 with $\mathcal{DL}_ 1, \mathcal{LH}_ 1, t_1$, if:

  • $\exists t_2\mapsto L_a\in\mathcal{DL}_ 1, l_ {LH}\mapsto T_ {LH}\in \mathcal{LH}_ 2, t_ {LH}\in T_ {LH}:$
    $l_{LH}\in L_a\land t_1\in \mathcal{ANC}\ t_{LH}\land(t_{LH}\in\mathcal{ANC}\ t_2\lor t_{LH}=t_2)$ or
  • $\exists (t_X\mapsto DL_{t_1})\in\mathcal{DL}_ g\ t_1$ such that the rule above holds replacing $t_ 1$ by $t_ X$ and $\mathcal{DL}_ 1$ by $DL_ {t_1}$.
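
The first rule can be read as a predicate over the analysis results. The following is a sketch under simplifying assumptions: thread ids and mutexes are strings, `anc` is a hypothetical stand-in for $\mathcal{ANC}$, `lh` is the mustlock history map consulted by the rule, and the second rule (via $\mathcal{DL}_g$) is not covered.

```ocaml
module S = Set.Make (String)
module M = Map.Make (String)

(* First rule only. [t1] and [dl1] belong to s1, [t2] to s2; [lh] is the
   mustlock history consulted by the rule and [anc] a hypothetical
   stand-in for the must-ancestor analysis. *)
let must_happen_before ~anc ~t1 ~dl1 ~lh ~t2 =
  match M.find_opt t2 dl1 with
  | None -> false (* t2 is bottom in the descendant lockset of s1 *)
  | Some la ->
    M.exists
      (fun l_lh t_lhs ->
        S.mem l_lh la
        && S.exists
             (fun t_lh ->
               S.mem t1 (anc t_lh)
               && (S.mem t_lh (anc t2) || String.equal t_lh t2))
             t_lhs)
      lh
```

For the simple example, with `dl1` mapping `t1` to `{l}`, `lh` mapping `l` to `{t1}`, and `t0` being a must-ancestor of `t1`, the predicate holds for `~t1:"t0" ~t2:"t1"`.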

Contributor

Copilot AI left a comment

Pull request overview

This PR implements the second part of happens-before (HB) relationship analysis for thread creations while mutexes are held. It introduces two new analyses (MustlockHistory and DescendantLockset) that work together to detect race conditions by establishing happens-before relationships between thread operations based on mutex locking patterns.

Changes:

  • Added MustlockHistory analysis to track which threads have locked specific mutexes
  • Added DescendantLockset analysis to compute descendant locksets and determine happens-before relationships
  • Extended CreationLockset analysis with query support for integration with the new analyses
  • Added 20 comprehensive test cases covering both race-free and racing scenarios

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| src/analyses/mustlockHistory.ml | New analysis tracking mutex lock history per thread |
| src/analyses/descendantLockset.ml | New analysis computing descendant locksets and HB relationships |
| src/analyses/creationLockset.ml | Added CreationLockset query support |
| src/domains/queries.ml | Added CreationLockset and MustlockHistory query types with supporting domains |
| src/goblint_lib.ml | Exported the two new analysis modules |
| tests/regression/53-races-mhp/40-45-*.c | Race-free test cases validating correct HB detection |
| tests/regression/53-races-mhp/50-59-*.c | Racing test cases validating race detection |

@michael-schwarz
Member

Random thought: What do your analyses do for recursive mutexes? Are they sound in these cases?

@dabund24
Member Author

Thanks for bringing this up, those would never have crossed my mind. I think the analyses remain sound, but get less precise.

The only relevant change that comes to mind is that, after an unlock, we can no longer assume that the mutex is actually unlocked. Since unlock statements are only places in our analyses where we assume things stop holding, never where they start or keep holding, this wouldn't be an issue.

@michael-schwarz
Member

> I think the analyses remain sound, but get less precise.

👍 Could you add tests here and maybe also include some tests for your first analysis (potentially in a separate PR)?

@dabund24
Member Author

I re-added the $\mathcal{DL}_g$ analysis, as using MapBot as the domain turned out to work fine when writing the thesis

@dabund24 dabund24 marked this pull request as ready for review February 23, 2026 12:23
Member

@michael-schwarz michael-schwarz left a comment

Sorry for the stall here. I think we should now try to get this merged so it evolves with the rest of the system. Can you merge master into this (or rebase, whatever you prefer) and address the comments?

Then it should be good to merge!

Comment thread src/analyses/descendantLockset.ml Outdated
Comment on lines +42 to +45
(* intersect locksets, but return bot if any arg is bot *)
let lockset_inter_sticky_bot = function
| `Top, _ | _, `Top -> Lockset.bot ()
| ls1, ls2 -> Lockset.inter ls1 ls2
Member

Is Lockset.inter somehow incorrect or why is this necessary?

Member Author

This is a really interesting point. What I wanted to achieve here is explicitly checking which mutexes are included in both locksets (where besides $\emptyset$, $\bot$ also means that no mutex is included). So in theory, this expresses what we actually want here.

In practice, though, this will never happen (at least with how the analyses work right now), as in both cases, $\bot$ means that a thread has never been created at all from some other thread (meaning $\mathcal{CL}\ t_1\ t_{\mathrm{ego}} = \bot\iff \mathcal{DL}\ t_1 =\bot$). So Lockset.inter should also work here, even if a little by chance.

Not sure how to proceed here. If you strongly lean towards using Lockset.inter, I can change this.

Member

Those are must mutexes, right? So `Top (which is the same as bot ()) is actually the full set of mutexes, not $\emptyset$.

Member Author

I phrased this in a somewhat sloppy way. In my analyses, $\bot$ actually does not represent "all mutexes", but "no mutexes" is not 100% spot on either. Rather, it means that some "may/ $\exists$ -condition" is not satisfied, which is the bare minimum for any mutex to be included in the lockset.
For instance, in the case of the creation lockset, a mutex cannot possibly protect a thread $t_1$ from another thread $t_0$ if $t_0$ never creates $t_1$ (transitively). However, we cannot encode this with $\emptyset$, because we want to be able to "eliminate" this state by joining it with something. In must-locksets this is not possible. Using $\bot$ for this works perfectly, though
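
To illustrate the distinction between $\bot$ and $\emptyset$ described here, a minimal sketch with an explicit bottom element may help. It is not the Goblint `Lockset` module (where bottom is represented as `` `Top``); the type `must_lockset` and both functions are hypothetical.

```ocaml
module LS = Set.Make (String)

(* Hypothetical must-lockset with an explicit bottom element. *)
type must_lockset =
  | Bot (* the may-condition is not satisfied at all *)
  | Locks of LS.t (* mutexes that are definitely held *)

(* Join: bottom is eliminated by joining, sets are intersected. *)
let join a b =
  match a, b with
  | Bot, x | x, Bot -> x
  | Locks l1, Locks l2 -> Locks (LS.inter l1 l2)

(* Intersection that stays bottom as soon as one argument is bottom. *)
let inter_sticky_bot a b =
  match a, b with
  | Bot, _ | _, Bot -> Bot
  | Locks l1, Locks l2 -> Locks (LS.inter l1 l2)
```

Joining `Bot` with `Locks {l}` gives `Locks {l}`, whereas joining `Locks {}` with `Locks {l}` stays at `Locks {}`, which is why $\emptyset$ cannot play the role of "never created".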

Member Author

The only place where I see these two representations possibly clashing is when reading out the must-lockset query. Fortunately, this would only be a precision problem; however, I admit that I have not given this a lot of thought.

Member

I think for the implementation this is fine for now, for the writeup we may have to think how to separate those things cleanly.

@sim642 sim642 added this to the v2.8.0 Clumsy Clurichaun milestone May 6, 2026
Development

Successfully merging this pull request may close these issues.

Consider some more interactions between thread creation, joins, and mutexes

5 participants