Skip to content

Commit 08fdb65

Browse files
authored
Merge pull request #720 from goblint/library-dsl
OCaml DSL for specifying `LibraryFunctions`
2 parents 2d10d66 + 9ec8913 commit 08fdb65

29 files changed

+826
-421
lines changed
Lines changed: 79 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,79 @@
1+
# Extending library
2+
3+
Analyzed programs often call functions for which the implementation is not available to Goblint.
4+
These can belong to the standard libraries of C, POSIX, Linux kernel, etc or to project dependencies whose implementation is not available.
5+
6+
Goblint outputs "Function definition missing" errors in such cases and by default assumes the worst about what the function does (invalidating everything).
7+
Such invalidation is a major source of imprecision and should be avoided at all cost (while still being sound).
8+
The semantics of unknown functions are controlled by options under `sem.unknown_function`, which can be used to avoid the invalidation, but disabling them if any unknown function does such invalidation will lead to unsoundness.
9+
10+
To strike a better balance between soundness and precision, more fine-grained specifications can be provided in the `LibraryFunctions` module.
11+
12+
13+
## Library function specifications
14+
15+
In the `LibraryFunctions` module (implementation), specifications of library functions are organized into a number of association lists.
16+
Functions are grouped based on their origin and context.
17+
This (in the future[^spec-list-selection]) allows groups of library function specifications to be toggled based on the program under analysis.
18+
19+
There are also older (less granular) specifications implemented using `classify` and `invalidate_actions`.
20+
No new specifications should be added there and all the existing ones should be migrated to the new DSL-based format.
21+
22+
[^spec-list-selection]: Specification list toggling will only be possible when all the old specifications are migrated.
23+
24+
25+
## Library function descriptor DSL
26+
27+
A library function's specification is a pair of its name and descriptor (of type `LibraryDesc.t`).
28+
The descriptors are easiest written using the OCaml DSL from `LibraryDsl`.
29+
30+
31+
### Function descriptor
32+
33+
A function descriptor can be one of the following:
34+
35+
1. `special` consists of an arguments descriptor and a continuation function for converting the captured arguments to a `LibraryDesc.special` variant.
36+
This is used when the library function should (or at least could) be specially handled by analyses.
37+
38+
2. `unknown` consists of just an arguments descriptor (which must `drop` all arguments) and defaults to `Unknown` variant.
39+
This is used when the library function isn't specially handled by analyses.
40+
41+
42+
### Arguments descriptor
43+
44+
An arguments descriptor is a list (with standard OCaml list syntax) of individual argument descriptors.
45+
Additionally for library functions with variadic arguments, the list may be terminated with `VarArg` constructor instead of `[]`.
46+
47+
48+
### Argument descriptor
49+
50+
An argument descriptor can be one of the following:
51+
52+
1. `__` captures the argument expression for use in `special`'s continuation function.
53+
2. `drop` ignores (drops) the argument expression to not be used by the continuation.
54+
55+
Both functions consist of an argument name string (for readability purposes) and a list of accesses the function does to the _pointer_ argument.
56+
57+
Unnamed variants of the functions `__'` and `drop'` also exist, but should be avoided.
58+
59+
60+
### Access
61+
62+
An access consists of an access kind and depth.
63+
Access kinds (`AccessKind.t`) are: read, write, free and spawn.
64+
65+
Accesses specify what the library function may do with the corresponding _pointer_ argument.
66+
Since function calls unconditionally read the immediate expressions given as arguments, there would be no point specifying that all the arguments (including of integer) type are read.
67+
Also, non-pointer arguments are copied (passed-by-value) to the function, so write accesses would make no sense.
68+
Therefore only pointer arguments should have accesses specified.
69+
70+
Two depths of accessing are distinguished:
71+
72+
1. Shallow (non-transitive, may-point-to) access means the function dereferences the pointer argument, but doesn't dereference any pointers within (e.g. if a pointer to struct is given as argument, but the struct contains further pointers).
73+
This is is almost always the case for standard library functions, since they take `void*` or `char*` as generic memory arguments and don't know about any pointers within the pointed block of memory.
74+
75+
2. Deep (transitive, reachable) access means the function follows (dereferences) pointers within.
76+
77+
In the DSL, shallow accesses can be specified by `r`, `w`, `f` or `s`, respectively. Deep accesses can be specified using the previous values with an additional `_deep` suffix.
78+
79+

mkdocs.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ theme:
1111
markdown_extensions:
1212
- toc:
1313
permalink: True
14+
- footnotes
1415

1516
dev_addr: '127.0.0.1:8010' # different port from default python http.server for g2html
1617

@@ -24,6 +25,7 @@ nav:
2425
- 'Developer guide':
2526
- developer-guide/developing.md
2627
- developer-guide/firstanalysis.md
28+
- developer-guide/extending-library.md
2729
- developer-guide/messaging.md
2830
- developer-guide/testing.md
2931
- developer-guide/debugging.md

src/analyses/accessAnalysis.ml

Lines changed: 31 additions & 45 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,7 @@ struct
5050
ctx.sideg (lv_opt, ty) d
5151

5252
let do_access (ctx: (D.t, G.t, C.t, V.t) ctx) (kind:AccessKind.t) (reach:bool) (conf:int) (e:exp) =
53+
if M.tracing then M.trace "access" "do_access %a %a %B\n" d_exp e AccessKind.pretty kind reach;
5354
let open Queries in
5455
let part_access ctx (e:exp) (vo:varinfo option) (kind: AccessKind.t): MCPAccess.A.t =
5556
ctx.emit (Access {var_opt=vo; kind});
@@ -106,17 +107,18 @@ struct
106107
| _ ->
107108
add_access (conf - 60) None None
108109

109-
let access_one_top ?(force=false) ctx kind reach exp =
110-
(* ignore (Pretty.printf "access_one_top %b %b %a:\n" write reach d_exp exp); *)
110+
(** Three access levels:
111+
+ [deref=false], [reach=false] - Access [exp] without dereferencing, used for all normal reads and all function call arguments.
112+
+ [deref=true], [reach=false] - Access [exp] by dereferencing once (may-point-to), used for lval writes and shallow special accesses.
113+
+ [deref=true], [reach=true] - Access [exp] by dereferencing transitively (reachable), used for deep special accesses. *)
114+
let access_one_top ?(force=false) ?(deref=false) ctx (kind: AccessKind.t) reach exp =
115+
if M.tracing then M.traceli "access" "access_one_top %a %b %a:\n" AccessKind.pretty kind reach d_exp exp;
111116
if force || ThreadFlag.is_multi (Analyses.ask_of_ctx ctx) then (
112117
let conf = 110 in
113-
let write = match kind with
114-
| `Write | `Free -> true
115-
| `Read -> false
116-
in
117-
if reach || write then do_access ctx kind reach conf exp;
118-
Access.distribute_access_exp (do_access ctx) `Read false conf exp;
119-
)
118+
if deref then do_access ctx kind reach conf exp;
119+
Access.distribute_access_exp (do_access ctx Read false) conf exp;
120+
);
121+
if M.tracing then M.traceu "access" "access_one_top %a %b %a\n" AccessKind.pretty kind reach d_exp exp
120122

121123
(** We just lift start state, global and dependency functions: *)
122124
let startstate v = ()
@@ -129,18 +131,18 @@ struct
129131
let assign ctx lval rval : D.t =
130132
(* ignore global inits *)
131133
if !GU.global_initialization then ctx.local else begin
132-
access_one_top ctx `Write false (AddrOf lval);
133-
access_one_top ctx `Read false rval;
134+
access_one_top ~deref:true ctx Write false (AddrOf lval);
135+
access_one_top ctx Read false rval;
134136
ctx.local
135137
end
136138

137139
let branch ctx exp tv : D.t =
138-
access_one_top ctx `Read false exp;
140+
access_one_top ctx Read false exp;
139141
ctx.local
140142

141143
let return ctx exp fundec : D.t =
142144
begin match exp with
143-
| Some exp -> access_one_top ctx `Read false exp
145+
| Some exp -> access_one_top ctx Read false exp
144146
| None -> ()
145147
end;
146148
ctx.local
@@ -149,18 +151,19 @@ struct
149151
ctx.local
150152

151153
let special ctx lv f arglist : D.t =
152-
match (LF.classify f.vname arglist, f.vname) with
154+
let desc = LF.find f in
155+
match desc.special arglist, f.vname with
153156
(* TODO: remove cases *)
154157
| _, "_lock_kernel" ->
155158
ctx.local
156159
| _, "_unlock_kernel" ->
157160
ctx.local
158-
| `Lock (failing, rw, nonzero_return_when_aquired), _
161+
| Lock { try_ = failing; write = rw; return_on_success = nonzero_return_when_aquired; _ }, _
159162
-> ctx.local
160-
| `Unlock, "__raw_read_unlock"
161-
| `Unlock, "__raw_write_unlock" ->
163+
| Unlock _, "__raw_read_unlock"
164+
| Unlock _, "__raw_write_unlock" ->
162165
ctx.local
163-
| `Unlock, _ ->
166+
| Unlock _, _ ->
164167
ctx.local
165168
| _, "spinlock_check" -> ctx.local
166169
| _, "acquire_console_sem" when get_bool "kernel" ->
@@ -176,51 +179,34 @@ struct
176179
| _, "pthread_cond_wait"
177180
| _, "pthread_cond_timedwait" ->
178181
ctx.local
179-
| _, x ->
180-
let arg_acc act =
181-
match act, LF.get_threadsafe_inv_ac x with
182-
| _, Some fnc -> (fnc act arglist)
183-
| `Read, None -> arglist
184-
| (`Write | `Free), None ->
185-
if get_bool "sem.unknown_function.invalidate.args" then
186-
arglist
187-
else
188-
[]
189-
in
190-
(* TODO: per-argument reach *)
191-
let reach =
192-
match f.vname with
193-
| "memset" | "__builtin_memset" | "__builtin___memset_chk" -> false
194-
| "bzero" | "__builtin_bzero" | "explicit_bzero" | "__explicit_bzero_chk" -> false
195-
| "__builtin_object_size" -> false
196-
| _ -> true
197-
in
198-
List.iter (access_one_top ctx `Read reach) (arg_acc `Read);
199-
List.iter (access_one_top ctx `Write reach) (arg_acc `Write);
200-
List.iter (access_one_top ctx `Free reach) (arg_acc `Free);
182+
| _, _ ->
183+
LibraryDesc.Accesses.iter desc.accs (fun {kind; deep = reach} exp ->
184+
access_one_top ~deref:true ctx kind reach exp (* access dereferenced using special accesses *)
185+
) arglist;
201186
(match lv with
202-
| Some x -> access_one_top ctx `Write false (AddrOf x)
187+
| Some x -> access_one_top ~deref:true ctx Write false (AddrOf x)
203188
| None -> ());
189+
List.iter (access_one_top ctx Read false) arglist; (* always read all argument expressions without dereferencing *)
204190
ctx.local
205191

206192
let enter ctx lv f args : (D.t * D.t) list =
207193
[(ctx.local,ctx.local)]
208194

209195
let combine ctx lv fexp f args fc al =
210-
access_one_top ctx `Read false fexp;
196+
access_one_top ctx Read false fexp;
211197
begin match lv with
212198
| None -> ()
213-
| Some lval -> access_one_top ctx `Write false (AddrOf lval)
199+
| Some lval -> access_one_top ~deref:true ctx Write false (AddrOf lval)
214200
end;
215-
List.iter (access_one_top ctx `Read false) args;
201+
List.iter (access_one_top ctx Read false) args;
216202
al
217203

218204

219205
let threadspawn ctx lval f args fctx =
220206
(* must explicitly access thread ID lval because special to pthread_create doesn't if singlethreaded before *)
221207
begin match lval with
222208
| None -> ()
223-
| Some lval -> access_one_top ~force:true ctx `Write false (AddrOf lval) (* must force because otherwise doesn't if singlethreaded before *)
209+
| Some lval -> access_one_top ~force:true ~deref:true ctx Write false (AddrOf lval) (* must force because otherwise doesn't if singlethreaded before *)
224210
end;
225211
ctx.local
226212

src/analyses/apron/apronAnalysis.apron.ml

Lines changed: 23 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -309,45 +309,44 @@ struct
309309
)
310310
in
311311
let st = ctx.local in
312-
match LibraryFunctions.classify f.vname args with
312+
let desc = LibraryFunctions.find f in
313+
match desc.special args, f.vname with
313314
(* TODO: assert handling from https://github.com/goblint/analyzer/pull/278 *)
314-
| `Assert expression -> st
315-
| `Unknown "__goblint_check" -> st
316-
| `Unknown "__goblint_commit" -> st
317-
| `Unknown "__goblint_assert" -> st
318-
| `ThreadJoin (id,retvar) ->
315+
| Assert expression, _ -> st
316+
| Unknown, "__goblint_check" -> st
317+
| Unknown, "__goblint_commit" -> st
318+
| Unknown, "__goblint_assert" -> st
319+
| ThreadJoin { thread = id; ret_var = retvar }, _ ->
319320
(* nothing to invalidate as only arguments that have their AddrOf taken may be invalidated *)
320321
(
321322
let st' = Priv.thread_join ask ctx.global id st in
322323
match r with
323324
| Some lv -> invalidate_one st' lv
324325
| None -> st'
325326
)
326-
| _ ->
327+
| _, _ ->
327328
let ask = Analyses.ask_of_ctx ctx in
328329
let invalidate_one st lv =
329330
assign_to_global_wrapper ask ctx.global ctx.sideg st lv (fun st v ->
330331
let apr' = AD.forget_vars st.apr [V.local v] in
331332
assert_type_bounds apr' v (* re-establish type bounds after forget *)
332333
)
333334
in
334-
let st' = match LibraryFunctions.get_invalidate_action f.vname with
335-
| Some fnc -> st (* nothing to do because only AddrOf arguments may be invalidated *)
336-
| None ->
337-
(* nothing to do for args because only AddrOf arguments may be invalidated *)
338-
if GobConfig.get_bool "sem.unknown_function.invalidate.globals" then (
339-
let globals = foldGlobals !Cilfacade.current_file (fun acc global ->
340-
match global with
341-
| GVar (vi, _, _) when not (BaseUtil.is_static vi) ->
342-
(Var vi, NoOffset) :: acc
343-
(* TODO: what about GVarDecl? *)
344-
| _ -> acc
345-
) []
346-
in
347-
List.fold_left invalidate_one st globals
348-
)
349-
else
350-
st
335+
(* nothing to do for args because only AddrOf arguments may be invalidated *)
336+
let st' =
337+
if List.mem LibraryDesc.InvalidateGlobals desc.attrs then (
338+
let globals = foldGlobals !Cilfacade.current_file (fun acc global ->
339+
match global with
340+
| GVar (vi, _, _) when not (BaseUtil.is_static vi) ->
341+
(Var vi, NoOffset) :: acc
342+
(* TODO: what about GVarDecl? *)
343+
| _ -> acc
344+
) []
345+
in
346+
List.fold_left invalidate_one st globals
347+
)
348+
else
349+
st
351350
in
352351
(* invalidate lval if present *)
353352
match r with

0 commit comments

Comments
 (0)