# Base-2 Exponential Histogram

## Design

This is a fixed-size data structure for aggregating the OpenTelemetry
base-2 exponential histogram introduced in [OTEP
149](https://github.com/open-telemetry/oteps/blob/main/text/0149-exponential-histogram.md)
and [described in the metrics data
model](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/datamodel.md#exponentialhistogram).
The exponential histogram data point is characterized by a `scale`
factor that determines resolution. Positive scales correspond with
more resolution, and negative scales correspond with less resolution.

Given a maximum size, in terms of the number of buckets, the
implementation determines the best scale possible given the set of
measurements received. The size of the histogram is configured using
the `WithMaxSize()` option, which defaults to 160.
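
For intuition about the default size, here is a small sketch (the
helper name is illustrative, not part of the package): each bucket
spans a factor of 2^(2^-scale), so `maxSize` buckets cover a
maximum-to-minimum value ratio of 2^(maxSize * 2^-scale).

```golang
// coverableRatio is an illustrative helper, not part of the package: it
// returns the largest maxValue/minValue ratio representable with maxSize
// buckets at the given scale, since each bucket spans a factor of
// 2^(2^-scale).
func coverableRatio(maxSize, scale int32) float64 {
    return math.Pow(2, float64(maxSize)*math.Ldexp(1, int(-scale)))
}
```

With the default size of 160, for example, scale 3 covers a ratio of
2^20, roughly a millionfold range of values.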

The implementation here maintains the best resolution possible. Since
the scale parameter is shared by the positive and negative ranges, the
best value of the scale parameter is determined by the range with the
greater difference between minimum and maximum bucket index:

```golang
func bucketsNeeded(minValue, maxValue float64, scale int32) int32 {
    return bucketIndex(maxValue, scale) - bucketIndex(minValue, scale) + 1
}

// Note: the real mapping functions handle rounding at exact bucket
// boundaries; the floor here is a simplification.
func bucketIndex(value float64, scale int32) int32 {
    return int32(math.Floor(math.Log(value) * math.Ldexp(math.Log2E, int(scale))))
}
```

The best scale is uniquely determined when `maxSize/2 <
bucketsNeeded(minValue, maxValue, scale) <= maxSize`. This
implementation maintains the best scale by rescaling as needed to stay
within the maximum size.
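
As a minimal sketch (the `bestScale()` helper is illustrative, not part
of the package), the best scale for a known value range can be found by
starting at the maximum supported scale and lowering it until the
needed number of buckets fits:

```golang
// bestScale is an illustrative helper: starting from the maximum supported
// scale (20, see the mapping functions below), lower the scale until the
// needed number of buckets fits within maxSize.
func bestScale(minValue, maxValue float64, maxSize int32) int32 {
    scale := int32(20)
    for bucketsNeeded(minValue, maxValue, scale) > maxSize {
        scale--
    }
    return scale
}
```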

## Layout

### Mapping function

The `mapping` sub-package contains the equations specified in the [data
model for Exponential Histogram data
points](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#exponentialhistogram).

There are two mapping functions, chosen according to the sign of the
scale. Negative and zero scales use the `mapping/exponent` mapping
function, which computes the bucket index directly from the bits of
the `float64` exponent. This mapping function is used for scales `-10
<= scale <= 0`. Scales smaller than -10 map the entire normal
`float64` number range into a single bucket and are therefore not
considered useful.
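
The core of the exponent mapping can be sketched as follows, assuming a
normal (non-subnormal) positive input; the actual implementation also
adjusts for exact powers of two, which belong in the lower bucket:

```golang
// exponentMapToIndex is a sketch of the exponent mapping for scale <= 0:
// extract the unbiased base-2 exponent from the IEEE 754 bits, then shift
// right by -scale.  The arithmetic shift rounds toward negative infinity,
// as the bucket index calculation requires.
func exponentMapToIndex(value float64, scale int32) int32 {
    exponent := int32((math.Float64bits(value)>>52)&0x7ff) - 1023
    return exponent >> -scale
}
```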

The `mapping/logarithm` mapping function uses `math.Log(value)` times
the scaling factor `math.Ldexp(math.Log2E, scale)`. This mapping
function is used for scales `0 < scale <= 20`. The maximum scale is
selected because at scale 21 it becomes difficult to test correctness:
at that point `math.MaxFloat64` maps to index `math.MaxInt32`, and the
`math/big` logic used in testing breaks down.
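
A sketch of the logarithm mapping follows; the real code also
special-cases exact powers of two, where `math.Log()` can be inexact
near bucket boundaries:

```golang
// logarithmMapToIndex is a sketch of the logarithm mapping for scale > 0.
// Ceil()-1 places values that fall exactly on a bucket boundary into the
// lower bucket, matching the lower-exclusive, upper-inclusive boundaries.
func logarithmMapToIndex(value float64, scale int32) int32 {
    scaleFactor := math.Ldexp(math.Log2E, int(scale))
    return int32(math.Ceil(math.Log(value)*scaleFactor)) - 1
}
```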

### Data structure

The `structure` sub-package contains a Histogram aggregator for use by
the OpenTelemetry-Go Metrics SDK as well as OpenTelemetry Collector
receivers, processors, and exporters.

## Implementation

The implementation maintains a slice of buckets and grows the array in
size only as necessary given the actual range of values, up to the
maximum size. The structure of a single range of buckets is:

```golang
type buckets struct {
    backing    bucketsVarwidth[T] // for T = uint8 | uint16 | uint32 | uint64
    indexBase  int32
    indexStart int32
    indexEnd   int32
}
```

The `backing` field is a generic slice of `[]uint8`, `[]uint16`,
`[]uint32`, or `[]uint64`.

The positive and negative backing arrays are independent, so the
maximum space used for `buckets` by one `Aggregator` is twice the
configured maximum size.

### Backing array

The backing array is circular. The first observation is counted in
the 0th index of the backing array and the initial bucket number is
stored in `indexBase`. After the initial observation, the backing
array grows in either direction (i.e., larger or smaller bucket
numbers), until rescaling is necessary. This mechanism allows the
histogram to maintain the ideal scale without shifting values inside
the array.

The `indexStart` and `indexEnd` fields store the current minimum and
maximum bucket number. The initial condition is `indexBase ==
indexStart == indexEnd`, representing a single bucket.

Following the first observation, new observations may fall into a
bucket up to `size-1` in either direction. Growth is possible by
adjusting either `indexEnd` or `indexStart` as long as the constraint
`indexEnd-indexStart < size` remains true.

Bucket numbers in the range `[indexBase, indexEnd]` are stored in the
interval `[0, indexEnd-indexBase]` of the backing array. Buckets in
the range `[indexStart, indexBase-1]` are stored in the interval
`[size+indexStart-indexBase, size-1]` of the backing array.

Considering the `aggregation.Buckets` interface, `Offset()` returns
`indexStart`, `Len()` returns `indexEnd-indexStart+1`, and `At()`
locates the correct bucket in the circular array.
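
A sketch of this index arithmetic is shown below; `size()` and
`countAt()` stand in for the real accessors on the generic backing
array:

```golang
// position is a sketch of mapping an absolute bucket number to its slot in
// the circular backing array, following the layout described above.
func (b *buckets) position(bucketNumber int32) int32 {
    offset := bucketNumber - b.indexBase
    if offset < 0 {
        // Buckets below indexBase wrap around to the end of the array.
        offset += b.size()
    }
    return offset
}

// At returns the count of the i'th bucket counting from indexStart, so
// At(0) corresponds to bucket number Offset() == indexStart.
func (b *buckets) At(i uint32) uint64 {
    return b.countAt(b.position(b.indexStart + int32(i)))
}
```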

### Determining change of scale

The algorithm used to determine the (best) change of scale when a new
value arrives is:

```golang
// newScale returns the best scale that accommodates both the existing
// bucket index range and the index of a newly arriving value.
func newScale(minIndex, maxIndex, scale, maxSize int32) int32 {
    return scale - changeScale(minIndex, maxIndex, scale, maxSize)
}

// changeScale returns how many times the index range must be halved
// (i.e., how much the scale must be reduced) before it fits in maxSize.
func changeScale(minIndex, maxIndex, scale, maxSize int32) int32 {
    var change int32
    for maxIndex-minIndex >= maxSize {
        maxIndex >>= 1
        minIndex >>= 1
        change++
    }
    return change
}
```

The `changeScale` function is also used to determine how many bits to
shift during `Merge`.

### Downscale function

The downscale function rotates the circular backing array so that
`indexStart == indexBase`, using the "3 reversals" method, before
combining the buckets in place.
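
The rotation itself is the standard "3 reversals" trick, sketched here
over a plain `[]uint64` for simplicity (the generic backing array is
rotated analogously):

```golang
// rotate moves the element at position start to position 0 by reversing
// the two halves and then the whole slice.
func rotate(counts []uint64, start int) {
    reverse(counts[:start])
    reverse(counts[start:])
    reverse(counts)
}

func reverse(counts []uint64) {
    for i, j := 0, len(counts)-1; i < j; i, j = i+1, j-1 {
        counts[i], counts[j] = counts[j], counts[i]
    }
}
```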

### Merge function

`Merge` first calculates the correct final scale by comparing the
combined positive and negative ranges. The destination aggregator is
then downscaled, if necessary, and the `UpdateByIncr` code path is used
to add the source buckets to the destination buckets.
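
A rough sketch of computing the final scale (the `combinedRange` helper
and the `Aggregator` fields shown here are hypothetical stand-ins, and
`min`/`max` are the Go 1.21 builtins):

```golang
// mergeScale is a sketch of choosing the final scale for a merge: take the
// combined index range of both inputs at their smaller scale, for the
// positive and negative ranges separately, and downscale until both fit.
func mergeScale(dst, src *Aggregator) int32 {
    lowestScale := min(dst.scale, src.scale)
    lowPos, highPos := combinedRange(dst.positive, src.positive, lowestScale)
    lowNeg, highNeg := combinedRange(dst.negative, src.negative, lowestScale)
    change := max(
        changeScale(lowPos, highPos, lowestScale, dst.maxSize),
        changeScale(lowNeg, highNeg, lowestScale, dst.maxSize),
    )
    return lowestScale - change
}
```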

### Scale function

The `Scale` function returns the current scale of the histogram.

If the scale is variable and there are no non-zero values in the
histogram, the scale is zero by definition; when there is only a
single value in this case, its scale is the maximum scale (20) by
definition.

If the scale is fixed because of range limits, the fixed scale is
returned regardless of the size of the histogram.

### Handling subnormal values

Subnormal values are those in the range [0x1p-1074, 0x1p-1022): numbers
that "gradually underflow", using fewer than 52 bits of precision in
the significand at the smallest representable exponent (i.e., -1022).
Subnormal numbers present special challenges for both the exponent- and
logarithm-based mapping functions, and to avoid the additional
complexity induced by these corner cases, subnormal numbers are rounded
up to 0x1p-1022 in this implementation.
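
A minimal sketch of this clamping (the `MinValue` and `clampSubnormal`
names are illustrative):

```golang
// MinValue is the smallest normal float64; smaller (subnormal) inputs are
// rounded up to it before mapping.
const MinValue = 0x1p-1022

func clampSubnormal(value float64) float64 {
    if value < MinValue {
        return MinValue
    }
    return value
}
```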

Handling subnormal numbers is difficult for the logarithm mapping
function because Golang's `math.Log()` function rounds subnormal
numbers up to 0x1p-1022. Handling subnormal numbers is difficult for
the exponent mapping function because Golang's `math.Frexp()`, the
natural API for extracting a value's base-2 exponent, also rounds
subnormal numbers up to 0x1p-1022.

While the additional complexity needed to correctly map subnormal
numbers is small in both cases, there are few real benefits in doing
so because of the inherent loss of precision. As secondary
motivation, clamping values to the range [0x1p-1022, math.MaxFloat64]
increases symmetry. This limit means that the minimum bucket index and
the maximum bucket index have similar magnitude, which helps support a
greater maximum scale. Supporting numbers smaller than 0x1p-1022
would mean changing the valid scale interval to [-11,19] compared with
[-10,20].

### UpdateByIncr interface

The OpenTelemetry metrics SDK `Aggregator` type supports an `Update()`
interface, which implies updating the histogram by a count of 1. This
implementation also supports `UpdateByIncr()`, which makes it possible
to count multiple observations in a single API call. This extension is
useful in applying `Histogram` aggregation to _sampled_ metric events
(e.g., in the [OpenTelemetry statsd
receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/statsdreceiver)).
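
A small sketch of the idea (the exact method signatures are
assumptions):

```golang
// record counts a sampled measurement with its adjusted count in a single
// call; UpdateByIncr(value, 1) is equivalent to Update(value).
func record(agg *Aggregator, value float64, adjustedCount uint64) {
    if adjustedCount == 1 {
        agg.Update(value)
        return
    }
    agg.UpdateByIncr(value, adjustedCount)
}
```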

Another use for `UpdateByIncr` is in a Span-to-metrics pipeline
following [probability sampling in OpenTelemetry tracing
(WIP)](https://github.com/open-telemetry/opentelemetry-specification/pull/2047).

## Acknowledgements

This implementation is based on work by [Yuke
Zhuge](https://github.com/yzhuge) and [Otmar
Ertl](https://github.com/oertl). See the
[NrSketch](https://github.com/newrelic-experimental/newrelic-sketch-java/blob/1ce245713603d61ba3a4510f6df930a5479cd3f6/src/main/java/com/newrelic/nrsketch/indexer/LogIndexer.java)
and
[DynaHist](https://github.com/dynatrace-oss/dynahist/blob/9a6003fd0f661a9ef9dfcced0b428a01e303805e/src/main/java/com/dynatrace/dynahist/layout/OpenTelemetryExponentialBucketsLayout.java)
repositories for more detail.