|
1 |
| -<!--- |
| 1 | +# How it works |
2 | 2 |
|
3 |
| -This file is used to generate your project datasheet. Please fill in the information below and delete any unused |
4 |
| -sections. |
| 3 | +## Overview |
5 | 4 |
|
6 |
| -You can also include images in this folder and reference them in the markdown. Each image must be less than |
7 |
| -512 kb in size, and the combined size of all images must be less than 1 MB. |
8 |
| ---> |
9 |
| - |
10 |
| -## How it works |
11 |
| - |
12 |
| -This project showcases tiny_pll, a completely self-contained fractional-N |
| 5 | +This project showcases `tiny_pll`, a completely self-contained fractional-N |
13 | 6 | frequency synthesizer using less than 6% of the area of a 1x1 TinyTapeout tile.
|
14 |
| -There are 4 tiny_pll instances in this project. Each instance multiplies the |
| 7 | +The design goals of this project were as follows: |
| 8 | +1. The design should be as simple as possible to reduce the chance of failure. |
| 9 | +2. The design should be as small as possible so it can be incorporated into |
| 10 | +future Tiny Tapeout designs with minimal area overhead. |
| 11 | + |
| 12 | +There are 4 `tiny_pll` instances in this project. Each instance multiplies the |
15 | 13 | frequency of a reference clock by a rational number A/B, where A and B can be
|
16 | 14 | between 1 and 15. Such a block has two main use cases:
|
17 | 15 | 1. Generating several internal clocks from a single off-chip oscillator (e.g.,
|
18 | 16 | for a large digital design with multiple clock domains)
|
19 | 17 | 2. Generating one or more internal clocks at a higher frequency than what can be
|
20 | 18 | provided to the tile through the mux and GPIO pins
|
21 | 19 |
|
22 |
| -tiny_pll is designed for a 10 MHz reference input, which implies an output |
| 20 | +`tiny_pll` is designed for a 10 MHz reference input, which implies an output |
23 | 21 | frequency between 667 kHz and 150 MHz. The 4 output clocks are connected to the
|
24 |
| -GPIO pins uo[3:0]. In reality, the maximum output frequency is limited by 4 |
| 22 | +GPIO pins `uo[3:0]`. In reality, the maximum output frequency is limited by 4 |
25 | 23 | factors:
|
26 | 24 | 1. The speed of the Caravel I/O cells, which itself is a function of the off-chip
|
27 | 25 | load capacitance
|
28 | 26 | 2. The routing between the TT mux and the I/O cells
|
29 |
| -3. The speed of the TT mux itself |
| 27 | +3. The speed of the TT mux |
30 | 28 | 4. The routing between the project tile and the TT mux
|
| 29 | +The minimum output frequency is limited to roughly 1 MHz due to the minimum |
| 30 | +speed of the VCO. |
| 31 | + |
| 32 | +A 1-bit delta-sigma ADC is included to allow measurement of the analog control |
| 33 | +voltage on `uo[4]`. |
| 34 | + |
| 35 | +This design is inherently mixed-signal due to the analog nature of the PLL. |
| 36 | +Consequently, the top-level layout is implemented as a custom analog/digital |
| 37 | +section for the PLL and ADC, surrounded by RTL which implements the |
| 38 | +control/status registers (CSRs) and various clock buffering and multiplexing |
| 39 | +functions. Schematics were created using `xschem` and simulated with `ngspice`; |
| 40 | +custom layout was done using `klayout` with the Efabless `sky130` PDK; digital |
| 41 | +synthesis and PnR were done using a custom OpenROAD flow; and `magic` and |
| 42 | +`netgen` were used for LVS, DRC and parasitic extraction. |
| 43 | + |
| 44 | +## PLL |
| 45 | + |
| 46 | +The top-level schematic of `tiny_pll` is shown below: |
| 47 | + |
| 48 | + |
| 49 | +The PLL uses a standard fractional-N architecture, where a feedback and an output |
| 50 | +frequency divider are used to set the frequency multiplication with respect to |
| 51 | +the reference clock input. The output frequency is `A/B * f_ref`, where `A` is |
| 52 | +the division ratio of `XDIV_FB`, `B` is the division ratio of `XDIV_OUT` and |
| 53 | +`f_ref` is the input clock frequency. Documentation for the PLL subcells is |
| 54 | +included below. |
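
As a rough illustration of the `A/B * f_ref` relationship above, the following sketch enumerates the ideal output frequencies for every divider setting. It assumes the 10 MHz reference stated earlier and ignores the practical VCO and I/O limits; the function name `pll_output_hz` is hypothetical and not part of the design.

```python
from fractions import Fraction

F_REF_HZ = 10_000_000  # 10 MHz reference input, as stated in the text


def pll_output_hz(a: int, b: int, f_ref_hz: int = F_REF_HZ) -> Fraction:
    """Ideal output frequency A/B * f_ref (hypothetical helper)."""
    if not (1 <= a <= 15 and 1 <= b <= 15):
        raise ValueError("A and B must be between 1 and 15")
    return Fraction(a, b) * f_ref_hz


# Every ideal output frequency reachable with A, B in 1..15.
all_freqs = {pll_output_hz(a, b) for a in range(1, 16) for b in range(1, 16)}

print(f"min: {float(min(all_freqs)) / 1e3:.0f} kHz")
print(f"max: {float(max(all_freqs)) / 1e6:.0f} MHz")
```

Exact fractions are used so that ratios such as 1/15 are not rounded before comparison.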
| 55 | + |
| 56 | +Throughout the schematics, the pins `VPB` and `VNB` are included to connect the |
| 57 | +bulk terminals of all PMOS and NMOS devices, respectively. This is done to |
| 58 | +ensure the corresponding terminals of the standard cell instances at each level |
| 59 | +of hierarchy are propagated to the top level and connected to VPWR and VGND. |
| 60 | + |
| 61 | +### Divider |
| 62 | + |
| 63 | + |
| 64 | + |
| 65 | +Frequency dividers are implemented using a 4-bit binary counter followed by 4 |
| 66 | +XOR gates to check for equality with a division ratio input `lmt[3..0]`. When |
| 67 | +the counter output is equal to `lmt`, `div_rstb` is immediately asserted, which |
| 68 | +resets the counter to 0 at the rising edge of `clk_in`. As a result, the maximum |
| 69 | +division ratio from `clk_in` to `eq` is 15, when `lmt == 4'b1111`. |
| 70 | + |
| 71 | +Since the counter is reset as soon as its output is equal to the division ratio, |
| 72 | +a very short pulse is produced at the `eq` node, with a duration equal to the |
| 73 | +propagation delay of the counter. This could potentially be a timing concern for |
| 74 | +`XDF`, but since the counter delay is at least 3 gate delays, the flip-flop was |
| 75 | +observed to operate as intended across process, voltage and temperature (PVT) in |
| 76 | +simulation. |
| 77 | + |
| 78 | +The D flip-flop (DFF) at the output is included to ensure an output duty cycle |
| 79 | +close to 50%. As a result, the actual output frequency is `f_clk_in / (2*lmt)`, |
| 80 | +which implies a division ratio from `clk_in` to `clk_out` between 2 and 30. |
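
The divide-by-`2*lmt` behavior described above can be checked with a small behavioral model. This is only a sketch, not the actual netlist: it assumes the counter reset is instantaneous and that the output DFF toggles once per `eq` pulse. The function names are hypothetical.

```python
def divider_output(lmt: int, n_clk_edges: int) -> list[int]:
    """Return the clk_out level after each rising edge of clk_in."""
    if not 1 <= lmt <= 15:
        raise ValueError("lmt must be a nonzero 4-bit value")
    count = 0
    clk_out = 0
    levels = []
    for _ in range(n_clk_edges):
        count += 1
        if count == lmt:   # XOR comparison matches: eq pulses
            count = 0      # reset modeled as instantaneous (assumption)
            clk_out ^= 1   # output DFF toggles on the eq pulse (assumption)
        levels.append(clk_out)
    return levels


def output_period(lmt: int) -> int:
    """Number of clk_in cycles between consecutive rising edges of clk_out."""
    levels = divider_output(lmt, 4 * lmt + 1)
    rising = [i for i in range(1, len(levels))
              if levels[i - 1] == 0 and levels[i] == 1]
    return rising[1] - rising[0]


for lmt in (1, 3, 15):
    print(lmt, output_period(lmt))
```

Under these assumptions the period from `clk_in` to `clk_out` is `2*lmt` input cycles, which matches the stated range of 2 to 30.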
| 81 | + |
| 82 | +The tie cell `sky130_fd_sc_hd__conb_1` is used when gates must be connected to |
| 83 | +VPWR or VGND to avoid potential ESD issues. |
| 84 | + |
| 85 | +### Phase-frequency detector (PFD) |
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | +The PFD is composed of two DFFs, clocked by the divided VCO output and the |
| 90 | +reference input, respectively. Since the input of both DFFs is tied to 1, each |
| 91 | +DFF can be implemented using two S-R latches, each of which uses two `nor2` |
| 92 | +gates. The full PFD thus uses 8 `nor2` gates, one `nand2` and one `inv_1`, which |
| 93 | +is considerably smaller than using discrete DFF standard cells with the D inputs |
| 94 | +tied to VPWR. |
| 95 | + |
| 96 | +A NAND followed by an inverter is used instead of a single AND to slightly |
| 97 | +increase the minimum output pulse width and avoid charge pump glitches. |
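
A minimal behavioral sketch of a two-flip-flop PFD is shown below. It assumes the reference-clocked DFF drives `up` and the divider-clocked DFF drives `down`, and it models the reset path as having zero delay, so the minimum pulse width set by the NAND and inverter is not captured. Edge times are integers in arbitrary units, and the function name is hypothetical.

```python
def pfd_pulse_widths(ref_edges, fb_edges):
    """Return (total up time, total down time) for lists of rising-edge times."""
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "fb") for t in fb_edges])
    up = down = False
    up_start = down_start = 0
    up_time = down_time = 0
    for t, source in events:
        if source == "ref" and not up:
            up, up_start = True, t        # reference edge sets `up`
        elif source == "fb" and not down:
            down, down_start = True, t    # feedback edge sets `down`
        if up and down:
            # Both outputs high: the reset fires and clears both flip-flops.
            up_time += t - up_start
            down_time += t - down_start
            up = down = False
    return up_time, down_time


# Reference leads the feedback clock by 10 time units on each of 3 cycles.
print(pfd_pulse_widths([0, 100, 200], [10, 110, 210]))
```

When the reference leads, only `up` accumulates pulse width; when the feedback clock leads, only `down` does.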
| 98 | + |
| 99 | +### Charge pump |
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +The charge pump uses two current sources (`MNSRC` and `MPSRC`), which can be |
| 104 | +interchangeably switched to the output with the `up` and `down` inputs. The |
| 105 | +charge pump current is nominally 1 uA and is set by the bias generator. The |
| 106 | +switches use nearly minimum width to reduce area, and minimum length to reduce |
| 107 | +capacitance. The PMOS switch uses 2x the W/L of the NMOS switch to ensure |
| 108 | +roughly equal drain-source saturation voltages (VDSAT). |
| 109 | + |
| 110 | +### Loop filter |
| 111 | + |
| 112 | + |
| 113 | + |
| 114 | +The loop filter is implemented using a series R/C combination to compensate the |
| 115 | +loop transfer function such that a zero is placed below the crossover frequency |
| 116 | +to ensure stability, and a pole is placed above the crossover frequency to |
| 117 | +ensure fast settling time. A second capacitor `XC2` is included to reduce ripple |
| 118 | +in the control voltage, which in turn reduces phase noise at the PLL output. |
| 119 | +Component values were selected using a linearized model developed using |
| 120 | +schematic-only simulations of the VCO to determine the voltage-to-frequency |
| 121 | +gain. The loop bandwidth was chosen to be on the order of 100 kHz, with a phase |
| 122 | +margin of 65 degrees at an output frequency of 10 MHz. The resulting R/C values |
| 123 | +are `R = 100 kOhm` and `C1 = 1 pF`. |
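
For reference, the textbook impedance of this kind of filter can be written out as follows. This is a sketch that assumes the series `R`/`C1` branch and `XC2` (written as $C_2$) are both connected from the control node to VGND; it is not derived from the actual schematic.

$$
Z(s) = \frac{1 + s R C_1}{s\,(C_1 + C_2)\left(1 + s R \dfrac{C_1 C_2}{C_1 + C_2}\right)}
$$

$$
\omega_z = \frac{1}{R C_1}, \qquad \omega_p = \frac{C_1 + C_2}{R C_1 C_2}
$$

Under this assumption the zero is set by `R` and `C1` alone, while the pole position also depends on `XC2`.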
| 124 | + |
| 125 | +In reality, the loop characteristics vary significantly across output frequency |
| 126 | +due to the nonlinear gain of the VCO, which was observed to have a nearly |
| 127 | +exponential voltage-to-frequency characteristic in simulation. This is likely |
| 128 | +due to the VCO current sources operating in the subthreshold region, where the |
| 129 | +ID/VGS characteristic is near-exponential. |
| 130 | + |
| 131 | +The loop filter resistor is implemented using the `urpm` high-resistance poly |
| 132 | +implant, which is roughly 2 kOhm/square. While e-test values are not provided |
| 133 | +for this resistor in `sky130`, the value is not critical, and significant |
| 134 | +variations (+/-50%) were observed to result in a stable loop in simulation. |
| 135 | + |
| 136 | +The loop filter capacitors are implemented using NMOS devices with drain and |
| 137 | +source shorted to VGND. This is due to the significantly higher capacitance |
| 138 | +density of MOS devices relative to MIM capacitors (~8 vs ~2 fF/um^2). The MOS |
| 139 | +capacitance is highly nonlinear and increases at high control voltages due to |
| 140 | +the inversion charge, but again the capacitor value is not critical and this |
| 141 | +nonlinearity does not cause instability in the feedback loop. |
| 142 | + |
| 143 | +The loop filter consumes nearly 50% of the area of the PLL. Various methods were |
| 144 | +explored to reduce loop filter area, including: |
| 145 | +1. MIM capacitors could be used and placed on top of the other circuit blocks to |
| 146 | +reduce area |
| 147 | +2. A capacitance multiplier could be used to allow a smaller intrinsic |
| 148 | +capacitance |
| 149 | + |
| 150 | +The MIM capacitor method is possible, but there is some ambiguity in the |
| 151 | +`sky130` design rules as to whether a MIM capacitor can be placed over `met1` |
| 152 | +and the base layers (see `capm.10` in the [sky130 periphery |
| 153 | +rules](https://skywater-pdk.readthedocs.io/en/main/rules/periphery.html#capm)). |
| 154 | +Additionally, this could result in unwanted noise from the digital blocks |
| 155 | +coupling into the capacitors, which could degrade phase noise performance. |
| 156 | +Further, the capacitors would have to be divided up to lie between the power |
| 157 | +rails on `met4` which would increase their area. |
| 158 | + |
| 159 | +A capacitance multiplier was implemented using a 100 fF capacitor with a 10:1 |
| 160 | +multiplication ratio, but the final layout was the same size as the MOS |
| 161 | +capacitor implementation and was thus excluded from the final design. The |
| 162 | +capacitance multiplier was additionally seen to have poor high-frequency |
| 163 | +response compared to a MOS or MIM capacitor, which resulted in unacceptably high |
| 164 | +control voltage ripple. |
| 165 | + |
| 166 | +### Voltage-controlled oscillator (VCO) |
| 167 | + |
| 168 | + |
| 169 | + |
| 170 | +The VCO is a 3-stage current-starved ring oscillator using standard cell |
| 171 | +inverters. The current sources use minimum length both to maximize W/L, which in |
| 172 | +turn minimizes VDSAT, and to minimize capacitance. The output resistance of these |
| 173 | +current sources is irrelevant since it only matters that the oscillator current |
| 174 | +is limited, and not the particular limit value. A triode device `MNCTL` is used |
| 175 | +to control the source/sink current of the VCO. LVT NMOS devices are used to |
| 176 | +ensure the operating control voltage is somewhere near half supply at an output |
| 177 | +frequency of 10 MHz, which helps ensure the maximum output frequency can be met |
| 178 | +across process variations. Four "keeper" devices (`MNEN1`, `MNEN2`, `MNEN3` and |
| 179 | +`MPEN`) are included to disable the circuit with zero static power consumption. |
31 | 180 |
|
32 |
| -## How to test |
| 181 | +### Bias generator |
33 | 182 |
|
34 |
| -TBU |
| 183 | + |
35 | 184 |
|
36 |
| -## External hardware |
| 185 | +The bias generator is a self-biased current mirror, which provides a roughly |
| 186 | +supply-independent current for the charge pump. The exact current is highly |
| 187 | +dependent on the poly resistor `XRES`, but is designed to be nominally 1 uA at |
| 188 | +25 degrees C. A startup circuit is included to ensure the bias generator does |
| 189 | +not fall into an undesirable operating point where `IOUT = 0`. The diode devices |
| 190 | +`MPSU1` and `MPSU2` charge the `kick` node to VPWR when the circuit is enabled, |
| 191 | +which pulls `bias_p` low and establishes a current in the mirror devices. Once |
| 192 | +the mirror is active, `MNSU1` pulls `kick` low and disables the startup circuit. |
| 193 | +Multiple "keeper" devices are included to disable the circuit with zero static |
| 194 | +power consumption. |
37 | 195 |
|
38 |
| -Oscilloscope |
|