# go-libddwaf

The goal of this project is to provide a higher-level API for the Go bindings to [libddwaf](https://github.com/DataDog/libddwaf), DataDog's in-app WAF.
It consists of two separate parts: the bindings for the calls to libddwaf, and the encoder, whose job is to convert _any_ Go value to its libddwaf object representation.

Example usage:

```go
package main

import (
	_ "embed" // required by the go:embed directive below
	"encoding/json"
	"log"
	"time"

	waf "github.com/DataDog/go-libddwaf"
)

// ruleset.json is a placeholder name: point the directive at your own ruleset file.
//
//go:embed ruleset.json
var ruleset []byte

func main() {
	var parsedRuleset any

	if err := json.Unmarshal(ruleset, &parsedRuleset); err != nil {
		log.Fatalf("could not parse the ruleset: %v", err)
	}

	wafHandle, err := waf.NewHandle(parsedRuleset, "", "")
	if err != nil {
		log.Fatalf("could not create the WAF handle: %v", err)
	}
	defer wafHandle.Close()

	wafCtx := wafHandle.NewContext()
	defer wafCtx.Close()

	matches, actions := wafCtx.Run(map[string]any{
		"server.request.path_params": "/rfiinc.txt",
	}, time.Minute)
	log.Printf("matches: %v, actions: %v", matches, actions)
}
```

The full API documentation is available on [pkg.go.dev](https://pkg.go.dev/github.com/DataDog/go-libddwaf).

Originally, this project only provided CGO wrappers around the calls to libddwaf.
With the introduction of the tree-like `ddwaf_object` structure, and with the intention to build CGO-less bindings, it has grown into a fully integrated building block of the DataDog tracer.
This, in turn, made it necessary to document the project so that it can be maintained in an orderly fashion.

## Design

The WAF bindings have multiple moving parts that are necessary to understand:

- Handle: an object wrapper over the pointer to the C WAF Handle
- Context: an object wrapper over a pointer to the C WAF Context
- Encoder: constructs a tree of WAF objects to send to the WAF
- Allocator: performs all the write and allocation operations needed to construct WAF objects
- Decoder: transforms WAF objects returned from the WAF into regular Go values (e.g. maps, arrays, ...)
- Library: wraps all calls to C code

```mermaid
flowchart LR

    START:::hidden -->|NewHandle| Handle -->|NewContext| Context

    Context -->|Encode Inputs| Encoder

    Handle -->|Encode Ruleset| Encoder
    Handle -->|Init WAF| Library
    Context -->|Decode Result| Decoder

    Handle -->|Decode Init Errors| Decoder

    Context -->|Run| Library
    Context -->|Store Go References| ContextAllocator

    Encoder -->|Allocate Waf Objects| EncoderAllocator

    EncoderAllocator -->|Copy after each encoding| ContextAllocator

    Library -->|Call C code| libddwaf

    classDef hidden display: none;
```

### Allocator

The `cgoRefPool` is a pure Go pool of `ddwaf_object` C values allocated on the Go memory heap.
The `cgoRefPool` Go type is a way to make sure we can safely send Go-allocated data to the C side of the WAF.
The main issue is the following: the `wafObject` uses a C union to store the tree structure of the full object.
The closest Go equivalent of a union is an interface, and interfaces are not compatible with C unions. The only way to be 100% sure
that the Go `wafObject` struct has the same layout as the C one is to use only primitive types, so the only way to
store a raw pointer is to use the `uintptr` type. But since a `uintptr` does not have pointer semantics (it is
basically just an integer), the GC does not treat it as a reference, and we need another structure that stores the value as a real Go pointer. That is
where the `cgoRefPool` object comes into play: all new `wafObject` elements are created through this API, which is specifically
built to make sure there is no gap for the garbage collector to exploit. From there, since the underlying values of the
`wafObject` are either arrays (for maps, structs and arrays) or strings (for all ints, booleans and strings),
we can store two slices of arrays and use `runtime.KeepAlive` in each code path to protect them from the GC.

### Typical call to Run()

Here is the flow of operations for a simple call to `Run()`:

- Encode the input data into WAF objects
- Lock the context mutex until the end of the call
- Call `ddwaf_run`
- Decode the matches and actions

### CGO-less C Bindings

The main component used to build C bindings without using CGO is [purego](https://github.com/ebitengine/purego). On our side, the flow of execution is to embed the C shared library using `go:embed`, dump it into a file, load that file using `dlopen`, look up the symbols using `dlsym`, and finally call them.

⚠️ Keep in mind that **purego only works on linux/darwin for amd64/arm64 and so does go-libddwaf.**

Another requirement of `libddwaf` is an FHS-compliant filesystem on your machine and, on Linux, the availability of `libc.so.6`, `libpthread.so.0`, `libm.so.6` and `libdl.so.2` as dynamic libraries.

## Contributing usual pitfalls

- It is not possible to `dlopen` twice in the app lifetime on macOS
- `runtime.KeepAlive()` calls are here to prevent the GC from destroying objects too early
- Since there is a stack switch between the Go code and the C code, usually the only C stacktrace you will ever get is from gdb
- If a segfault happens during a call to the C code, the goroutine that made the call is the one whose stacktrace is annotated with `[syscall]`
- [GoLand](https://www.jetbrains.com/go/) does not support `CGO_ENABLED=0` (as of June 2023)
- Keep in mind that we fully escape the type system. If you send the wrong data, it will segfault in the best case, but not always!
- The structs in `ctypes.go` are here to reproduce the memory layout of the structs in `include/ddwaf.h`, because pointers to these structs are passed directly
- Do not use `uintptr` values obtained from `unsafe.Pointer` casts of Go values as function arguments or result types: they escape the pointer analysis, which can produce wrongly optimized code and crashes. Pointer arithmetic is of course necessary in such a library, but it must be kept within the same function scope.