[PyCDE] Physical Design Code Generators and SystemVerilog Integration #3305

Open
@teqdruid

Description

While “legacy” integration is easily dismissed as a “detail”, in practice it is anything but. Getting this right gives designers an easy on-boarding path to a particular technology. Getting it wrong often means that your technology gets no more consideration than a dismissive “that’s nice, we’ll look into it”. (We’ve learned this the hard way more times than we’d like to admit.) There are two basic pieces of information any advanced code generation tool needs to exchange with hand coded SystemVerilog: input configuration (the code generation request) and output information that the SystemVerilog needs to know. Ideally, this wouldn’t feel too different from instantiating a parameterized SystemVerilog module. System-level interactions, however, necessarily add an additional level of complexity in design verification.

Problem description

Assembling the code generation request from hand coded SystemVerilog

Requirements

  1. Since code generation tools don’t generally produce parameterized Verilog (including ours), we need to get the list of configurations (module parameterizations) to generate. Since we want to use SystemVerilog parameters, we need to extract the parameter values for each instantiation of the modules which the generation tool will emit.
  2. Our generation tool will also be creating placement data, which is per-instance. As such, we need to have the full instance hierarchy path (from a common root) to all the generated instances.
  3. Additionally, our generator can (read: will) specialize modules on a per-instance basis, so we need the instance hierarchy paths for this requirement as well.

Solution overview

The most bullet-proof way to obtain this information is to fully elaborate the hand coded SystemVerilog, walk the instance hierarchy, and extract instance paths and parameter values for each instance which is to be generated. We can then encode these data into a standardized code generation request.
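
As a rough illustration of what such a request might contain, here is a minimal Python sketch. The record layout and field names (`module`, `parameters`, `instances`) and the JSON encoding are assumptions made for illustration, not a format this proposal defines. The sketch groups instances by parameterization, so the generator sees each unique configuration once together with every instance path that uses it.

```python
import json
from collections import defaultdict

# Hypothetical per-instance records obtained from elaboration:
# (module name, full instance path, parameter values).
instances = [
    ("Child", "Top.child[0]", {"NUM_DST": 2, "WIDTH": 8}),
    ("Child", "Top.child[1]", {"NUM_DST": 2, "WIDTH": 8}),
    ("Child", "Top.genchildren[0].genchild", {"NUM_DST": 3, "WIDTH": 32}),
    ("Child", "Top.bar", {"NUM_DST": 3, "WIDTH": 8}),
]


def build_request(instances):
    """Group instance paths by (module, parameterization)."""
    groups = defaultdict(list)
    for module, path, params in instances:
        # Sort the parameters so that equal configurations compare equal.
        key = (module, tuple(sorted(params.items())))
        groups[key].append(path)
    return [
        {"module": module, "parameters": dict(params), "instances": paths}
        for (module, params), paths in groups.items()
    ]


request = build_request(instances)
request_json = json.dumps(request, indent=2)  # Assumed on-disk encoding.
```

Keeping the instance paths alongside each configuration matters because the generator specializes per instance, not just per parameterization.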

Communicating generated output back to hand coded SystemVerilog

Requirements

  4. A key difference between the advanced code generation we use and traditional code generation is that ours gets to decide the internal schedule (which affects the I/O schedule) of a module on a per-instance basis. For instance: if a simple pipeline is requested, our generator is free to choose the pipeline depth. Since the hand coded SystemVerilog needs to know after how many cycles input data becomes valid output data, our generator needs a way to communicate those data back to the hand coded SystemVerilog (without requiring changes).
  5. Since our generator potentially generates different RTL on a per-instance basis, the hand coded SystemVerilog needs to instantiate the correct generated RTL without manual intervention.

Solution overview

Generate SystemVerilog to be `included in the hand coded SystemVerilog to provide this information. Said “include” file will contain multi-level defparams to set instance hierarchy path module parameters on all generated modules. Similarly, it will use defparams to set “output” module parameters (i.e. representing the I/O schedule) which hand-written SystemVerilog can use in adjusting its own schedule.

Implications for design verification

Simulations often test submodules of a design. Our generated code, however, is sensitive to design-global properties. For instance, depending on design congestion, connections may have to run a longer distance and thus require a longer pipeline. If, however, we generate just the sub-design under test, our generator won’t be able to compute the same estimate of the design congestion and could select a different pipeline depth, resulting in a different sub-design being tested.

Solution 1: Elaborate the entire design to get the code generation request for a specific design under test.

Solution 2: Expose all the possible dimensions in which a particular module could be generated and allow the simulation to set them. The test could then select points in the design space to verify. This wouldn’t be a design-representative integration test so much as a unit test.
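
One way to picture Solution 2 is as an enumeration of the exposed dimensions. The Python sketch below is only an illustration: the dimension names and candidate values are hypothetical, and how a test would actually apply a selected point to the simulation is left open.

```python
import itertools


def design_points(dimensions):
    """Yield every combination of the exposed generation dimensions."""
    names = list(dimensions)
    for values in itertools.product(*(dimensions[n] for n in names)):
        yield dict(zip(names, values))


# Hypothetical dimensions a generated module might expose to a unit test.
points = list(design_points({"PIPE_DEPTH": [3, 4], "NUM_DST": [2, 3]}))
```

A test harness could run all points or sample a subset of them, depending on how large the design space is.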

Note: these “opportunities” are not specific to PyCDE/CIRCT.

Any code generator which does advanced physical design will have these same issues. Thus far, the solutions to each of these problems have been ad-hoc and manual. For instance, major subsystems (customized on a per-instance basis) have an application-specific instance identifier passed down through module parameters manually. Others we haven’t encountered yet. While it is theoretically possible to continue this ad-hoc approach, as we want to generate and specialize more modules/instances and they become more granular, these problems will rapidly become intractable and massively error prone to solve manually.

Proposed solution details

Tool flow

  1. Have the generator output parameterized “dummy” modules for each module it knows how to synthesize. These modules would include all the possible parameters and (importantly) $display statements indicating where in the module hierarchy their instances live and what parameter values they were instantiated with.
  2. Fully compile and elaborate the design (or just from the common root) using a SystemVerilog compiler (e.g. Questa or Quartus). If using a simulator, run it for a single tick to get the $display outputs.
  3. Parse the log for the relevant $display outputs. They contain the relevant instance hierarchy information (addressing [2 and 3]) and the parameters being instantiated (addressing [1]). Better yet, use an API to explore the hierarchy programmatically. Feed that information to the generator as a code generation request config file.
  4. The generator then emits the following collateral:
    a. The specialized, plain (non-parameterized) SystemVerilog modules based on the requested configurations.
    b. A Quartus (or other synthesis engine) tcl file to be included in the Quartus image builds. Said file would contain the full instance paths.
    c. A set of SystemVerilog “dispatch” wrapper modules. They’d instantiate the correct generated, specialized module depending on where it is in the instance hierarchy, which is indicated by a module parameter. Additionally, the per-instance I/O schedule (if applicable) would be given by a set of “output” module parameters.
    d. An include file with a set of SystemVerilog cross-module defparams. One set of defparams per module instance setting the instance hierarchy parameter (addressing [5]) and all the per-instance I/O schedule “output” parameters (addressing [4]).
  5. If one is not building or simulating the entire design, tell the generator which instance is being simulated/built. The generator would then re-root its output with the instance as the root.
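
The re-rooting in step 5 can be sketched as a prefix operation on instance paths. This is a minimal Python illustration under the assumption that paths are dot-separated strings; the function name `reroot` and the example paths are hypothetical.

```python
def reroot(paths, new_root):
    """Keep only instances below new_root and make their paths relative to it."""
    # Include the trailing dot so "Top.role" does not match "Top.role2.x".
    prefix = new_root + "."
    return [path[len(prefix):] for path in paths if path.startswith(prefix)]


# Hypothetical full-design instance paths.
full_paths = [
    "Top.role.a.child[0]",
    "Top.role.b",
    "Top.role2.c",
    "Top.shell.d",
]
dut_paths = reroot(full_paths, "Top.role")
```

Instances outside the selected subtree are dropped, since they will not exist in the partial build or simulation.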

Required SystemVerilog modifications

The hand-written SystemVerilog modifications would be minimal. At each “root” instantiation site (i.e. the “role” instantiation and the “DUT” instantiation sites), designers would have to add a `include for the defparams file which has been generated.

Working sketch of solution

Hand-written RTL

module Top (
  input clk
);

  Child #(.NUM_DST(2), .WIDTH(8)) child [1:0] (
    .clk(clk)
  );

  generate
    genvar i;
    for (i=0; i<1; i++)
    begin : genchildren
      Child #(
        .NUM_DST(3),
        .WIDTH(32)
      ) genchild (
        .clk(clk)
      );
    end
  endgenerate
  
  Child #(.NUM_DST(3), .WIDTH(8)) bar (
    .clk(clk)
  );
  Downstream #(.OFFSET(bar.PIPE_DEPTH)) downstream ();

  `include "circt_top_defparams.inc.sv"
endmodule

module Downstream #(
  parameter int OFFSET
) ();
  initial $display("OFFSET: %d", OFFSET);
endmodule

“Generate request” creation phase

Generated code

module Child # (
  // Normal input parameters.
  parameter int NUM_DST,
  parameter int WIDTH,

  // "Output" parameters. Defaults to something reasonable, but will be
  // incorrect in initial elaboration "CIRCT REQUEST" phase.
  parameter int PIPE_DEPTH = 1
) ( );

  initial begin
    $display("CIRCT_REQ: \"Child\", %m, NUM_DST: %d, WIDTH: %d", NUM_DST, WIDTH);
  end

endmodule

Questasim session

$ vlog -sv configtest_req_phase.sv && vsim -c Top -suppress vopt-2912 -do "run 1; exit"
QuestaSim-64 vlog 2021.4 Compiler 2021.10 Oct 13 2021
Start time: 19:18:25 on May 16,2022
vlog -sv configtest_req_phase.sv
-- Compiling module Top
-- Compiling module Downstream
-- Compiling module Child

Top level modules:
        Top
End time: 19:18:25 on May 16,2022, Elapsed time: 0:00:00
Errors: 0, Warnings: 0
Reading pref.tcl

# 2021.4

# vsim -c Top -suppress vopt-2912 -do "run 1; exit"
# Start time: 19:18:26 on May 16,2022
# ** Note: (vsim-8009) Loading existing optimized design _opt
# Loading sv_std.std
# Loading work.Top(fast)
# Loading work.Child(fast__1)
# Loading work.Child(fast__2)
# Loading work.Child(fast)
# run 1
# CIRCT_REQ: "Child", Top.child[0], NUM_DST:           2, WIDTH:           8
# CIRCT_REQ: "Child", Top.child[1], NUM_DST:           2, WIDTH:           8
# CIRCT_REQ: "Child", Top.genchildren[0].genchild, NUM_DST:           3, WIDTH:          32
# CIRCT_REQ: "Child", Top.bar, NUM_DST:           3, WIDTH:           8
# OFFSET:           1
#  exit
# End time: 19:18:26 on May 16,2022, Elapsed time: 0:00:00
# Errors: 0, Warnings: 0
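
Extracting the request from a log like the one above could look like the following Python sketch. It assumes the exact `$display` format used by the dummy `Child` module and integer-valued parameters only; the function name and record layout are hypothetical.

```python
import re

# Matches lines such as:
#   # CIRCT_REQ: "Child", Top.child[0], NUM_DST:           2, WIDTH:           8
REQ_LINE = re.compile(
    r'CIRCT_REQ:\s*"(?P<module>[^"]+)",\s*(?P<path>[^,\s]+)\s*,\s*(?P<params>.*)$'
)
# Matches one "NAME: <integer>" pair within the parameter list.
PARAM = re.compile(r"(\w+):\s*(-?\d+)")


def parse_requests(log_text):
    """Return one record per CIRCT_REQ line; all other lines are ignored."""
    records = []
    for line in log_text.splitlines():
        m = REQ_LINE.search(line)
        if m is None:
            continue
        params = {name: int(value) for name, value in PARAM.findall(m.group("params"))}
        records.append(
            {"module": m.group("module"), "path": m.group("path"), "params": params}
        )
    return records


log = '''\
# run 1
# CIRCT_REQ: "Child", Top.child[0], NUM_DST:           2, WIDTH:           8
# CIRCT_REQ: "Child", Top.genchildren[0].genchild, NUM_DST:           3, WIDTH:          32
# OFFSET:           1
'''
records = parse_requests(log)
```

Log scraping is brittle (escaped identifiers or non-integer parameters would need more care), which is one reason exploring the hierarchy through an API would be preferable.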

Synthesis / simulation phase

Although not shown, this synthesis code works in Quartus as well.

Generated code: child.sv

module Child # (
  // Normal input parameters.
  parameter int NUM_DST,
  parameter int WIDTH,

  // Instance hierarchy location.
  parameter string __INST_HIER = "invalid",
  // "Output" parameters. Now defaults to an "invalid" value, but gets set by
  // 'defparam'.
  parameter int PIPE_DEPTH = -1
) (
  input clk
);

  generate
  begin
    case (__INST_HIER)
    "top.child[0]":
      Child_dst2_width8_pipedepth3 impl(.clk(clk));
    "top.child[1]":
      Child_dst2_width8_pipedepth4 impl(.clk(clk));
    "top.genchild[0]":
      Child_dst3_width32_pipedepth4 impl(.clk(clk));
    "top.bar":
      Child_dst3_width32_pipedepth4 impl(.clk(clk));
    default:
      $fatal(1, "%m: Could not find specialized module for %s", __INST_HIER);
    endcase
  end
  endgenerate
endmodule

module Child_dst2_width8_pipedepth3 (
  input clk
);
  initial begin
    $display("inst %m module Child_dst2_width8_pipedepth3");
  end
endmodule

module Child_dst2_width8_pipedepth4 (
  input clk
);
  initial begin
    $display("inst %m module Child_dst2_width8_pipedepth4");
  end
endmodule

module Child_dst3_width32_pipedepth4 (
  input clk
);
  initial begin
    $display("inst %m module Child_dst3_width32_pipedepth4");
  end
endmodule
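
For a sense of how a generator might emit the `case` statement in the dispatch wrapper above, here is a small Python sketch. The function name, the mapping argument, and the fixed port list are assumptions for illustration; a real generator would derive the port connections from the module signature.

```python
def emit_dispatch_case(impl_by_path, ports=".clk(clk)"):
    """Emit a generate-case that selects a specialized module by __INST_HIER."""
    lines = ["    case (__INST_HIER)"]
    for path, impl in impl_by_path.items():
        lines.append(f'    "{path}":')
        lines.append(f"      {impl} impl({ports});")
    # Fail elaboration loudly if the instance was not part of the request.
    lines.append("    default:")
    lines.append(
        '      $fatal(1, "%m: Could not find specialized module for %s", __INST_HIER);'
    )
    lines.append("    endcase")
    return "\n".join(lines)


case_text = emit_dispatch_case(
    {
        "top.child[0]": "Child_dst2_width8_pipedepth3",
        "top.child[1]": "Child_dst2_width8_pipedepth4",
    }
)
```

The strings used as case labels only need to match the values set through the defparams; they are identifiers chosen by the generator rather than paths the simulator interprets.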

Generated code: circt_top_defparams.inc.sv

defparam child[0].__INST_HIER = "top.child[0]";
defparam child[0].PIPE_DEPTH = 3;
defparam child[1].__INST_HIER = "top.child[1]";
defparam child[1].PIPE_DEPTH = 4;

defparam genchildren[0].genchild.__INST_HIER = "top.genchild[0]";
defparam genchildren[0].genchild.PIPE_DEPTH = 4;

defparam bar.__INST_HIER = "top.bar";
defparam bar.PIPE_DEPTH = 4;
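
An include file like the one above is mechanical to produce once the generator has made its per-instance decisions. The Python sketch below assumes a hypothetical result format of (instance path relative to the including module, hierarchy identifier, "output" parameter values).

```python
def emit_defparams(results):
    """Emit defparam lines for the hierarchy id and each output parameter."""
    lines = []
    for rel_path, hier_id, outputs in results:
        lines.append(f'defparam {rel_path}.__INST_HIER = "{hier_id}";')
        for name, value in outputs.items():
            lines.append(f"defparam {rel_path}.{name} = {value};")
    return "\n".join(lines)


# Hypothetical generator results for two of the instances in the sketch.
results = [
    ("child[0]", "top.child[0]", {"PIPE_DEPTH": 3}),
    ("genchildren[0].genchild", "top.genchild[0]", {"PIPE_DEPTH": 4}),
]
defparams_text = emit_defparams(results)
```

The instance paths are relative because the file is included inside the root module, so each defparam resolves from that module's scope.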

Questasim session

$ vlog -sv configtest_synth.sv && vsim -c Top -do "run 1; exit"
QuestaSim-64 vlog 2021.4 Compiler 2021.10 Oct 13 2021
Start time: 19:23:00 on May 16,2022
vlog -sv configtest_synth.sv
-- Compiling module Top
-- Compiling module Downstream
-- Compiling module Child
-- Compiling module Child_dst2_width8_pipedepth3
-- Compiling module Child_dst2_width8_pipedepth4
-- Compiling module Child_dst3_width32_pipedepth4

Top level modules:
        Top
End time: 19:23:00 on May 16,2022, Elapsed time: 0:00:00
Errors: 0, Warnings: 0
Reading pref.tcl

# 2021.4

# vsim -c Top -do "run 1; exit"
# Start time: 19:23:01 on May 16,2022
# ** Note: (vsim-8009) Loading existing optimized design _opt1
# Loading sv_std.std
# Loading work.Top(fast)
# run 1
# inst Top.child[1].genblk1.genblk1.impl module Child_dst2_width8_pipedepth4
# inst Top.child[0].genblk1.genblk1.impl module Child_dst2_width8_pipedepth3
# inst Top.genchildren[0].genchild.genblk1.genblk1.impl module Child_dst3_width32_pipedepth4
# inst Top.bar.genblk1.genblk1.impl module Child_dst3_width32_pipedepth4
# OFFSET:           4
#  exit
# End time: 19:23:02 on May 16,2022, Elapsed time: 0:00:01
# Errors: 0, Warnings: 0

Notes

  1. This is an internal document I wrote a few weeks ago which has been sanitized for external consumption. I'm posting it here so everyone gets a feeling for my needs. Since this is obviously a non-trivial amount of work, we (Microsoft) have made the decision to only implement pieces as necessary.
  2. This proposal is not foolproof! In particular, it doesn't deal with the case where an "output" parameter is used to calculate a parameter for a generated module. We could catch that case by checking the parameters against what we are instantiating in the dispatch module.
  3. It also doesn't support extern modules which instantiate generated modules.

Acknowledgements

Aaron Landy (Microsoft)
Todd Massengill (Microsoft)
Jinhang Choi (Microsoft)
Gregg Baeckler (Intel)
