Description
gosec now supports SSA-based analyzers in addition to AST-based rules. The SSA code representation can be leveraged to build a taint analysis engine that can uncover more complex security issues.
More details about the SSA representation of Go code and the list of available instructions can be found in the Go docs.
The ssadump tool can be used to print the SSA representation of a program as follows:
go install golang.org/x/tools/cmd/ssadump@latest
cat > main.go << EOF
package main

import "fmt"

func main() {
	fmt.Println("Hello SSA!")
}
EOF
ssadump -build F main.go
# Name: command-line-arguments.init
# Package: command-line-arguments
# Synthetic: package initializer
func init():
0: entry P:0 S:2
t0 = *init$guard bool
if t0 goto 2 else 1
1: init.start P:1 S:1
*init$guard = true:bool
t1 = fmt.init() ()
jump 2
2: init.done P:2 S:0
return
# Name: command-line-arguments.main
# Package: command-line-arguments
# Location: /usr/local/google/home/ccojocar/go/src/github.com/securego/samples/ssa/main.go:5:6
func main():
0: entry P:0 S:0
t0 = new [1]any (varargs) *[1]any
t1 = &t0[0:int] *any
t2 = make any <- string ("Hello SSA!":string) any
*t1 = t2
t3 = slice t0[:] []any
t4 = fmt.Println(t3...) (n int, err error)
return
There are different options available for -build (consult ssadump -h).
The SSA representation is built per function, so each function forms its own unit.
The taint analysis engine should be able to track data (input arguments) through the call graph from a predefined sink function back to a predefined list of source functions, variables, or types where the data initially enters the program, and judge whether the data was sanitized along the way. In this manner, security issues such as command injection, SSRF, path traversal, and SQL injection can be detected more reliably.
A source can originate in different places such as global variables, functions, or types. These are the places where data initially enters the program. The following is a list of potential source candidates:
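As a rough illustration of the tracking described above, here is a minimal sketch that works on a hand-built value-flow graph rather than on real SSA. The `node` type, the `kind` constants, and `reachesSource` are hypothetical names invented for this example; an actual analyzer would follow the operands of SSA values instead.

```go
package main

import "fmt"

// kind classifies a node in the toy value-flow graph.
type kind int

const (
	plain     kind = iota // ordinary intermediate value
	source                // user-controlled input, e.g. os.Args
	sanitizer             // value produced by a validation or escaping step
)

// node is a hypothetical stand-in for an SSA value: it records what the
// value is and which other values it was computed from.
type node struct {
	name string
	kind kind
	from []*node
}

// reachesSource walks backwards from a sink argument and reports whether an
// unsanitized path to a source exists. Paths that pass through a sanitizer
// are cut off. The visited set guards against cycles in the graph.
func reachesSource(n *node, visited map[*node]bool) bool {
	if n == nil || visited[n] {
		return false
	}
	visited[n] = true
	switch n.kind {
	case source:
		return true
	case sanitizer:
		return false
	}
	for _, p := range n.from {
		if reachesSource(p, visited) {
			return true
		}
	}
	return false
}

func main() {
	args := &node{name: "os.Args", kind: source}
	raw := &node{name: "url", from: []*node{args}}
	clean := &node{name: "validate(url)", kind: sanitizer, from: []*node{raw}}

	fmt.Println(reachesSource(raw, map[*node]bool{}))   // unsanitized flow: true
	fmt.Println(reachesSource(clean, map[*node]bool{})) // sanitized flow: false
}
```

Treating a sanitizer as a hard stop is a simplification; whether a given function really neutralizes the data for a given sink is a separate judgement the engine would have to encode.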
sources:
  # Sources that are defined in Go documentation as a "variable" (note: these variables will have an SSA type of "Global").
  variables:
    "os":
      - "Args"
  # Sources that are defined in Go documentation as a "function"
  functions:
    "flag":
      - "Arg"
      - "Args"
    "os":
      - "Environ"
      - "File"
    "crypto/tls":
      - "LoadX509KeyPair"
      - "X509KeyPair"
    "os/user":
      - "Lookup"
      - "LookupId"
      - "Current"
    "crypto/x509":
      - "Subjects"
    "io":
      - "ReadAtLeast"
      - "ReadFull"
    "database/sql":
      - "Query"
      - "QueryRow"
    "bytes":
      - "String"
      - "ReadBytes"
      - "ReadByte"
    "bufio":
      - "Text"
      - "Bytes"
      - "ReadString"
      - "ReadSlice"
      - "ReadRune"
      - "ReadLine"
      - "ReadBytes"
      - "ReadByte"
    "archive/tar":
      - "Next"
      - "FileInfo"
      - "Header"
    "net/url":
      - "ParseQuery"
      - "ParseUriRequest"
      - "Parse"
      - "Query"
  # Sources that are defined in Go documentation as a "type" (note: adding types will consider all functions that use that type to be tainted).
  types:
    "net/http":
      - "Request"
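One way such a list might be consumed is as a lookup table keyed by package path. The sketch below hard-codes a small subset of the entries above instead of parsing the configuration; the `sourceSet` type and the `isSource` helper are hypothetical and not part of gosec.

```go
package main

import "fmt"

// sourceSet maps a package path to the names in that package that are
// treated as taint sources. It is a hypothetical structure for this sketch.
type sourceSet map[string]map[string]bool

// Only a few entries from the list above are reproduced here.
var (
	variables = sourceSet{"os": {"Args": true}}
	functions = sourceSet{
		"flag":    {"Arg": true, "Args": true},
		"os":      {"Environ": true},
		"net/url": {"Parse": true, "ParseQuery": true},
	}
	types = sourceSet{"net/http": {"Request": true}}
)

// isSource reports whether pkg.name is listed in the given set.
// Indexing a missing package yields a nil map, which safely returns false.
func isSource(set sourceSet, pkg, name string) bool {
	return set[pkg][name]
}

func main() {
	fmt.Println(isSource(variables, "os", "Args"))       // listed variable
	fmt.Println(isSource(functions, "flag", "Arg"))      // listed function
	fmt.Println(isSource(types, "net/http", "Request"))  // listed type
	fmt.Println(isSource(functions, "strings", "Split")) // not listed
}
```

Keeping variables, functions, and types in separate sets mirrors the three sections of the list, since each kind would be matched against a different kind of SSA value.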
For example, to detect an SSRF issue, the taint analysis engine will track the data (e.g. the URL value) from a sink such as net/http.Get/Head/Post/PostForm or net/http.Client.Do/Get/Head/Post/PostForm through the call graph back to one of the predefined sources from the list above (e.g. os.Args). If it discovers that the data comes from one of these predefined sources and was not sanitized, it can report an SSRF issue with more confidence.
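For illustration, the kind of flow the engine would need to recognize might look like the sketch below. `fetchUnsafe` passes its argument straight to `http.Get`, while `fetchChecked` validates it first. The allow-list, the host name `example.com`, and all function names are assumptions made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// allowedHosts is a hypothetical allow-list used by the sanitizer.
var allowedHosts = map[string]bool{"example.com": true}

// validateURL plays the role of the sanitizer: it accepts only https URLs
// whose host is on the allow-list.
func validateURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme != "https" || !allowedHosts[u.Hostname()] {
		return "", errors.New("URL not allowed: " + raw)
	}
	return u.String(), nil
}

// fetchUnsafe is the pattern to flag when called with os.Args data:
// the value reaches the sink (http.Get) with no sanitizer on the path.
func fetchUnsafe(raw string) (*http.Response, error) {
	return http.Get(raw)
}

// fetchChecked reaches the same sink, but only after validateURL succeeded.
func fetchChecked(raw string) (*http.Response, error) {
	safe, err := validateURL(raw)
	if err != nil {
		return nil, err
	}
	return http.Get(safe)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: fetch <url>")
		return
	}
	resp, err := fetchChecked(os.Args[1]) // os.Args is the source
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```

Note that the source and the sink live in different functions here, which is why the argument has to be followed across function boundaries.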
One challenge is building the call graph from the SSA representation, starting at a sink function and working back to a source. The SSA representation breaks the code down into function units, so the arguments of each function must be tracked from one SSA function unit to another in order to build the complete graph. This tracking also has to cross package boundaries, since a program typically contains multiple packages. The analysis should stop when a call leaves the program's own packages; otherwise the complexity and run time will explode and the analysis will not finish in a reasonable time.
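A minimal sketch of that backward walk, using a hand-written reverse call graph and a hypothetical module path `example.com/app`. The `callers` map, `inProgram`, and `walkBack` are illustrative names only; a real implementation would derive the caller edges from the SSA program rather than from a literal map.

```go
package main

import (
	"fmt"
	"strings"
)

// modulePath is a hypothetical prefix identifying the program's own packages.
const modulePath = "example.com/app"

// callers maps a function to the functions that call it (reverse edges).
var callers = map[string][]string{
	"net/http.Get":                 {"example.com/app/client.Fetch"},
	"example.com/app/client.Fetch": {"example.com/app/cmd.Run", "github.com/other/lib.Helper"},
	"example.com/app/cmd.Run":      {"example.com/app.main"},
	"github.com/other/lib.Helper":  {"github.com/other/lib.Init"},
}

// inProgram reports whether fn belongs to one of the program's own packages.
// A plain prefix match is a simplification for this sketch.
func inProgram(fn string) bool {
	return strings.HasPrefix(fn, modulePath)
}

// walkBack collects, breadth-first, every in-program function from which the
// sink is reachable. Callers outside the program are not expanded, which
// keeps the search bounded.
func walkBack(sink string) []string {
	visited := map[string]bool{sink: true}
	queue := []string{sink}
	var reached []string
	for len(queue) > 0 {
		fn := queue[0]
		queue = queue[1:]
		for _, c := range callers[fn] {
			if visited[c] || !inProgram(c) {
				continue // already seen, or outside the program: stop here
			}
			visited[c] = true
			reached = append(reached, c)
			queue = append(queue, c)
		}
	}
	return reached
}

func main() {
	for _, fn := range walkBack("net/http.Get") {
		fmt.Println(fn)
	}
}
```

With this graph the walk visits `client.Fetch`, `cmd.Run`, and `main`, and never expands `github.com/other/lib.Helper` or its callers.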