Auto-generate unit tests & benchmarks #145

@regexident

tl;dr

We've gone way past the point where writing and maintaining highly redundant manual unit tests is any fun. When writing unit tests becomes tedious and a maintenance hell, people start neglecting them instead. So let's make use of the fact that our APIs (and, as such, our tests) almost all follow the same pattern, and automatically generate the tests for us, allowing us to increase test coverage even further at far less overall cost.

What?

A quick look at the /Tests directory reveals a suite of tests that pretty much all share the same pattern.

Our tests look something like this:

func test_<something>_float() {
    // Define a type-alias for convenience:
    typealias Scalar = Float

    // Create some dummy data:
    let lhs: [Scalar] = .monotonicNormalized()
    let rhs: [Scalar] = .monotonicNormalized()

    // Create a working copy of the dummy data:
    var actual: [Scalar] = lhs
    // Operate on the working copy:
    Surge.eladdInPlace(&actual, rhs)

    // Provide a ground-truth implementation to compare against:
    let expected = zip(lhs, rhs).map { $0 + $1 }

    // Compare the result:
    XCTAssertEqual(actual, expected, accuracy: 1e-8)
}

… differing from one another only in this line:

Surge.eladdInPlace(&actual, rhs)

… and this line:

let expected = zip(lhs, rhs).map { $0 + $1 }
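
(As an aside: .monotonicNormalized() above is one of the dummy-data factories in the test sources. It could be shaped roughly like the following sketch; this is an assumption about its shape, not the actual implementation:)

// Sketch of a `.monotonicNormalized()`-style dummy-data factory:
// monotonically increasing values normalized into 0...1, so results stay
// well-conditioned for the accuracy-based assertion above.
extension Array where Element: BinaryFloatingPoint {
    static func monotonicNormalized(count: Int = 1_000) -> [Element] {
        return (1...count).map { Element($0) / Element(count) }
    }
}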

And our benchmarks look something like this:

// benchmarks:
func test_add_in_place_array_array_float() {
    // Call convenience function:
    measure_inout_array_array(of: Float.self) { measure in
        // Call XCTest's measurement method:
        measureMetrics([.wallClockTime], automaticallyStartMeasuring: false) {
            // Perform the actual operations to be measured:
            measure(Surge.eladdInPlace)
        }
    }
}

… which is semantically equivalent to the more verbose:

func test_add_in_place_array_array_float() {
    typealias Scalar = Float

    let lhs: [Scalar] = produceLhs()
    let rhs: [Scalar] = produceRhs()

    // Call XCTest's measurement method:
    measureMetrics([.wallClockTime], automaticallyStartMeasuring: false) {
        var lhs = lhs
        
        startMeasuring()
        let _ = Surge.eladdInPlace(&lhs, rhs)
        stopMeasuring()
    }
}

… again, differing from one another only in this line:

let _ = Surge.eladdInPlace(&lhs, rhs)
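
For context, the measure_inout_array_array convenience used above could be shaped roughly as follows. This is a sketch under the assumption that the helper supplies the dummy data and wraps startMeasuring()/stopMeasuring(); Surge's actual helper lives in the benchmark sources and may differ:

import XCTest

extension XCTestCase {
    // Hypothetical sketch: supply dummy data for the requested scalar type and
    // hand the caller a `measure` closure that copies the data, runs the given
    // in-place operation between startMeasuring()/stopMeasuring(), and discards
    // the result. The caller invokes `measure` inside a
    // measureMetrics(_:automaticallyStartMeasuring:for:) block with
    // automaticallyStartMeasuring: false, as in the benchmark above.
    func measure_inout_array_array<Scalar: BinaryFloatingPoint>(
        of type: Scalar.Type,
        _ body: (_ measure: ((inout [Scalar], [Scalar]) -> Void) -> Void) -> Void
    ) {
        let lhs: [Scalar] = (0..<1_000).map { Scalar($0) / 1_000 }
        let rhs: [Scalar] = (0..<1_000).map { Scalar($0) / 1_000 }

        body { operation in
            var workingCopy = lhs
            self.startMeasuring()
            operation(&workingCopy, rhs)
            self.stopMeasuring()
        }
    }
}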

Why?

At around 200 tests and over 60 benchmarks, maintaining our test and benchmark suites has become quite a chore. 😣

So this got me thinking: what if, instead of writing and maintaining hundreds of highly redundant test functions (for lack of macros in Swift), we had a way to have the tests, and even the benchmarks, generated auto-magically for us?

With this we could easily increase test coverage from "just the functions containing non-trivial logic" to "basically every public function, regardless of complexity", allowing us to catch regressions even in the most trivial wrapper functions (currently not covered) at hardly any additional maintenance burden.

How?

The basic idea is to get rid of all the existing unit tests and replace them with mere Sourcery annotations, like this:

// sourcery: test, floatAccuracy = 1e-5, expected = "add(array:array)"
public func add<L, R>(_ lhs: L, _ rhs: R) -> [Float] where L: UnsafeMemoryAccessible, R: UnsafeMemoryAccessible, L.Element == Float, R.Element == Float {
    // …
}

… given a fixture like this:

enum Fixture {
    enum Argument {
        static func `default`<Scalar>() -> Scalar { /* … */ }
        static func `default`<Scalar>() -> [Scalar] { /* … */ }
        static func `default`<Scalar>() -> Vector<Scalar> { /* … */ }
        static func `default`<Scalar>() -> Matrix<Scalar> { /* … */ }
    }
    enum Accuracy {
        static func `default`() -> Float { /* … */ }
        static func `default`() -> Double { /* … */ }
    }
    enum Expected {}
}

extension Fixture.Expected {
    static func add<Scalar: Numeric>(array lhs: [Scalar], array rhs: [Scalar]) -> [Scalar] {
        return zip(lhs, rhs).map { $0 + $1 }
    }
}
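
For illustration, given the annotation and fixture above, the generated test for add could come out looking something like this. This is only a sketch: the suite name is made up, and the exact shape of the generated code (including the array-aware XCTAssertEqual(_:_:accuracy:) overload the hand-written test above relies on) would be up to the template:

import XCTest
import Surge

// Hypothetical Sourcery output; names follow the fixture sketched above.
class ArithmeticTests_Generated: XCTestCase {
    func test_add_array_array_float() {
        typealias Scalar = Float

        // Arguments come from the fixture's `default()` factories
        // (or from an explicit `arg<N>` annotation, if present):
        let lhs: [Scalar] = Fixture.Argument.default()
        let rhs: [Scalar] = Fixture.Argument.default()

        // Call the annotated function:
        let actual: [Scalar] = Surge.add(lhs, rhs)

        // Ground truth comes from the `expected = "add(array:array)"` annotation:
        let expected: [Scalar] = Fixture.Expected.add(array: lhs, array: rhs)

        // Accuracy comes from the `floatAccuracy = 1e-5` annotation:
        XCTAssertEqual(actual, expected, accuracy: 1e-5)
    }
}
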
| Function Annotation | Description |
| --- | --- |
| test | Generate test function (Optional) |
| bench | Generate benchmark function (Optional) |
| expected = <function name> | The fixture function to use as ground truth (Required by test) |
| accuracy = <float literal> | A custom testing accuracy (Optional, used by test) |
| floatAccuracy = <float literal> | A custom Float-specific testing accuracy (Optional, used by test) |
| doubleAccuracy = <float literal> | A custom Double-specific testing accuracy (Optional, used by test) |
| arg<N> = <function name> | The fixture factory function for the nth argument (Optional, used by test) |
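
For completeness, a function combining several of these annotations might look like the following. The operation, its signature, the accuracy value, and the fixture names here are purely illustrative:

// sourcery: test, bench, expected = "elmul(array:array)", floatAccuracy = 1e-5, arg1 = "monotonicNonZero(array:)"
public func elmul<L, R>(_ lhs: L, _ rhs: R) -> [Float] where L: UnsafeMemoryAccessible, R: UnsafeMemoryAccessible, L.Element == Float, R.Element == Float {
    // …
}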

One would have Sourcery parse the source code and generate a test suite per source file (or, preferably, per type extension), looking for test and bench annotations.

The current unit tests make only minimal use of customized lhs/rhs dummy values, so arg<N> will rarely be needed; still, a few tests do require custom data to test against.

Also, given that Surge expects a rather restricted set of types as function arguments (Scalar, Collection where Element == Scalar, Vector<Scalar>, Matrix<Scalar>), we should be able to match against them rather naïvely, allowing us to elide most of the data we would otherwise have to specify explicitly, as sketched below.
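
To sketch that idea (an assumption about how the generator might be built, e.g. inside a Sourcery Swift template, rather than a finished design): each parameter's resolved type would be mapped to a default fixture expression, falling back to an explicit arg<N> annotation when no match is found.

// Hypothetical template helper: map a parameter's resolved type name to a
// default fixture expression. Everything here is an assumption about the
// template's internals.
func defaultFixture(forParameterType typeName: String, scalar: String) -> String? {
    switch typeName {
    case scalar:                            // e.g. "Float" or "Double"
        return "Fixture.Argument.default() as \(scalar)"
    case "[\(scalar)]", "Array<\(scalar)>": // Collection where Element == Scalar
        return "Fixture.Argument.default() as [\(scalar)]"
    case "Vector<\(scalar)>":
        return "Fixture.Argument.default() as Vector<\(scalar)>"
    case "Matrix<\(scalar)>":
        return "Fixture.Argument.default() as Matrix<\(scalar)>"
    default:
        return nil // caller must fall back to an explicit `arg<N>` annotation
    }
}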
