Commit 36044cb

docs
1 parent e132d04 commit 36044cb

2 files changed

Lines changed: 14 additions & 79 deletions

float/README.md

Lines changed: 9 additions & 2 deletions
@@ -46,6 +46,9 @@ The module fails synthesis if the supplied value disagrees with its real stage c
 the latency cannot slip through unnoticed -- the build breaks and points you at the stale constant.
 Pair `LATENCY` with `zkf_pipe` to delay your own control or sideband signals so they land with the operator's output.
 
+The `LATENCY` value is a sum of some constant baseline number of stages,
+plus optionally some WMAN-dependent stage count, plus the sum of all `STAGE_*` values (all zero by default).
+
 ### Catalogue
 
 Notation: ⇝ - combinational, ⇻ - sequential, (nothing) - can be either depending on the selected `STAGE_`s.
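
Reviewer note: the `LATENCY` composition added in this hunk can be modeled in a few lines. The sketch below is only an illustration in Python, not part of the library. The function names, the `baseline` and `wman_stages` arguments, and the `STAGE_IN`/`STAGE_OUT` parameter names in the usage line are hypothetical; the README only states that the value is a baseline plus an optional WMAN-dependent count plus the sum of all `STAGE_*` values.

```python
def expected_latency(baseline: int, wman_stages: int = 0, **stage_params: int) -> int:
    """Sum the three LATENCY components named in the README.

    baseline      -- constant stage count of the operator (hypothetical input)
    wman_stages   -- optional WMAN-dependent stage count (hypothetical input)
    stage_params  -- the STAGE_* values, all zero by default
    """
    if any(value < 0 for value in stage_params.values()):
        raise ValueError("STAGE_* values must be non-negative")
    return baseline + wman_stages + sum(stage_params.values())


def check_declared_latency(declared: int, actual: int) -> None:
    """Mimic the build-time check: a stale LATENCY constant is a hard error."""
    if declared != actual:
        raise ValueError(f"LATENCY={declared} disagrees with real stage count {actual}")
```

For example, `expected_latency(2, 1, STAGE_IN=1, STAGE_OUT=2)` evaluates to 6, and passing any other declared value to `check_declared_latency` raises, which mirrors the "build breaks" behavior described above.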
@@ -75,8 +78,8 @@ Notation: ⇝ - combinational, ⇻ - sequential, (nothing) - can be either depen
 
 The following modules are expected to appear because they are the missing primitives needed to access a huge variety
 of transcendental and trigonometric functions:
-`zkf_divmod`, `zkf_sincos` (maybe `zkf_sincos_phase(phi)` for some fixed-point phase modulo 1), `zkf_atan2`.
-Modulo-pi range reduction is needed for basic trig operators and is provided by divmod.
+`zkf_sincos` (maybe `zkf_sincos_phase(phi)` for some fixed-point phase modulo 1), `zkf_atan2`.
+Also, modulo-pi range reduction is needed for basic trig operators.
 From these we get:
 
 exp(x) = exp2(x * log2(e))
@@ -91,6 +94,10 @@ From these we get:
 
 And so on.
 
+Generic floating-point remainder/modulo computation is not included because the general solution requires iterative
+range reduction which maps poorly onto fixed-latency FPGA cores; instead, one can build the iterative solver using
+the existing basic operators.
+
 ## Semantics
 
 Differences from IEEE 754: no NaN, no subnormals (exponent 0 always encodes +0; finite magnitudes in `(0, min_normal/2)`
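
Reviewer note: the paragraph on generic remainder added in this hunk says the general solution needs iterative range reduction. A minimal sketch of such a loop follows, written in Python on IEEE doubles rather than in the library's own format, so it is a behavioral illustration only. The function name `fmod_iterative` is hypothetical. It uses only comparison, exponent inspection, scaling by a power of two, and subtraction, and it returns the step count to show why the latency is data-dependent.

```python
import math


def fmod_iterative(a: float, b: float) -> tuple[float, int]:
    """Return (remainder, steps); the remainder has the sign of a, like C fmod.

    Assumes finite a and nonzero b. Each step subtracts the largest
    power-of-two multiple of |b| that does not exceed the running remainder.
    """
    if b == 0.0 or math.isinf(a):
        raise ValueError("sketch handles only finite a and nonzero b")
    r, d = abs(a), abs(b)
    steps = 0
    while r >= d:
        # Scale the divisor so that it has the same binary exponent as r.
        t = math.ldexp(d, math.frexp(r)[1] - math.frexp(d)[1])
        if t > r:
            t *= 0.5  # one binade too high; step back down
        # Here t <= r < 2*t, so the subtraction is exact (no rounding error).
        r -= t
        steps += 1
    return math.copysign(r, a), steps
```

For `fmod_iterative(7.5, 2.0)` the loop subtracts 4.0 and then 2.0, giving a remainder of 1.5 in two steps. The number of steps grows with the exponent gap between the operands, which is the property that maps poorly onto a fixed-latency core.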

float/zubax_kulibin_float.md

Lines changed: 5 additions & 77 deletions
@@ -404,78 +404,7 @@ Performance target: at least two quotient bits per cycle.
 
 ---
 
-## 8. Divider With Residual Remainder
-
-Combined quotient/residual divider, streamed, zero-bubble:
-
-```verilog
-zkf_divrem #(parameter int WEXP = 6, parameter int WMAN = 18)(
-    input wire clk,
-    input wire rst,
-
-    input wire in_valid,
-    input wire [WFULL-1:0] a,
-    input wire [WFULL-1:0] b,
-
-    output wire out_valid,
-    output wire [WFULL-1:0] q,
-    output wire [WFULL-1:0] r,
-    output wire div0
-);
-```
-
-The `q` and `div0` outputs are bit-for-bit identical to `zkf_div` with the same parameters and inputs.
-
-Residual semantics:
-
-```text
-This is a division residual, not C fmod and not IEEE remainder.
-
-if a == 0:
-    r = +0
-
-else if b == 0:
-    r = +0
-
-else if q is infinity:
-    r = +0
-
-else:
-    r = pack(a - b * q)
-```
-
-The residual expression above uses the decoded, rounded value of output `q` and is evaluated using the same
-deterministic no-NaN infinity arithmetic as the rest of this format. Notable consequences:
-
-```text
-finite / infinity:
-    q = +0
-    r = canonicalized a
-
-infinity / infinity:
-    q = +0
-    r = signed infinity with sign = sign(a)
-
-infinity / finite nonzero:
-    q = signed infinity with sign = sign(a) XOR sign(b)
-    r = +0
-```
-
-Implementation guidance:
-
-```text
-Share the quotient generation path with zkf_div.
-Use the final partial remainder instead of directly evaluating a - b * q with a separate multiplier.
-After quotient rounding, adjust the residual if the quotient was incremented.
-Pack the residual alongside the quotient so both outputs are aligned under out_valid.
-```
-
-Reusable logic shared by `zkf_div` and `zkf_divrem` should be extracted into nonpublic, underscore-prefixed helper
-modules, consistent with the internal helper module convention above.
-
----
-
-## 9. Cast From Signed Integer
+## 8. Cast From Signed Integer
 
 ```verilog
 zkf_from_int #(
@@ -513,7 +442,7 @@ zero input maps to canonical +0
 
 ---
 
-## 10. Cast To Signed Integer
+## 9. Cast To Signed Integer
 
 ```verilog
 zkf_to_int #(
@@ -556,7 +485,7 @@ Zero maps to integer zero.
 
 ---
 
-## 11. Cast Between Two Format Sizes
+## 10. Cast Between Two Format Sizes
 
 ```verilog
 zkf_resize #(
@@ -597,7 +526,7 @@ target overflow maps to signed infinity
 
 ---
 
-## 12. Sqrt/log2/exp2, integer detection
+## 11. Sqrt/log2/exp2, integer detection
 
 Specifically `zkf_log2` and `zkf_exp2` can be used later to build arbitrary log/exp.
 
@@ -670,7 +599,7 @@ module zkf_exp2 #(parameter WEXP = 6, parameter WMAN = 18,
 
 ---
 
-## 13. Compare and Sort
+## 12. Compare and Sort
 
 Registered floating-point comparison and min/max sort. Comparison requires canonicalization
 (exponent-zero inputs are treated as +0, exponent-all-ones inputs as signed infinity, fraction ignored for both classes),
@@ -749,7 +678,6 @@ division by zero asserts div0
 add/sub module implements both operations exactly per spec
 mul uses the same pack semantics as add/sub
 div quotient matches exact a/b rounded per spec
-div residual matches the documented a - b*q rule rounded per spec
 resize equals decode-then-pack into target format
 ```
 