Downsize move #7410

Merged
merged 20 commits into from
May 31, 2025
19 changes: 10 additions & 9 deletions src/rsz/README.md
@@ -60,7 +60,7 @@ set_wire_rc
| Switch Name | Description |
| ----- | ----- |
| `-clock` | Enable setting of RC for clock nets. |
| `-signal` | Enable setting of RC for signal nets. |
| `-layers` | Use the LEF technology resistance and area/edge capacitance values for the layers. The values for each layer are used for wires in that layer's preferred direction; if two or more layers share a preferred direction, their average value is used for wires in that direction. The values apply to a default-width wire on the layer. |
| `-layer` | Use the LEF technology resistance and area/edge capacitance values for the layer. This is used for a default width wire on the layer. |
| `-resistance` | Resistance per unit length, units are from the first Liberty file read. |
@@ -70,7 +70,6 @@ set_wire_rc
| `-v_resistance` | Resistance per unit length for vertical wires, units are from the first Liberty file read. |
| `-v_capacitance` | Capacitance per unit length for vertical wires, units are from the first Liberty file read. |


### Set Layer RC

The `set_layer_rc` command can be used to set the resistance and capacitance
@@ -110,7 +109,7 @@ After the `global_route` command has been called, the global routing topology
and layers can be used to estimate parasitics with the `-global_routing`
flag.

The optional argument `-spef_file` can be used to write the estimated parasitics using
Standard Parasitic Exchange Format.

```tcl
@@ -249,7 +248,7 @@ the wire. It also resizes gates to normalize slews. Use `estimate_parasitics
-placement` before `repair_design` to estimate parasitics considered
during repair. Placement-based parasitics cannot accurately predict
routed parasitics, so a margin can be used to "over-repair" the design
to compensate.

```tcl
repair_design
@@ -318,7 +317,7 @@ Setup repair is done before hold repair so that hold repair does not
cause setup checks to fail.

The worst setup path is always repaired. Next, violating paths to
endpoints are repaired to reduce the total negative slack.

```tcl
repair_timing
@@ -333,6 +332,7 @@ repair_timing
[-sequence]
[-skip_pin_swap]
[-skip_gate_cloning]
[-skip_size_down]
[-skip_buffering]
[-skip_buffer_removal]
[-skip_last_gasp]
@@ -355,9 +355,10 @@ repair_timing
| `-setup_margin` | Add additional setup slack margin. |
| `-hold_margin` | Add additional hold slack margin. |
| `-allow_setup_violations` | While repairing hold violations, buffers are not inserted that will cause setup violations unless `-allow_setup_violations` is specified. |
-| `-sequence` | Specify a particular order of setup timing optimizations. The default is "unbuffer,buffer,swap,size,clone,split". Ignores skip flags when used. |
+| `-sequence` | Specify a particular order of setup timing optimizations. The default is "unbuffer,sizeup,swap,buffer,clone,split". Obeys skip flags also. |
| `-skip_pin_swap` | Flag to skip pin swap. The default is to perform pin swap transform during setup fixing. |
| `-skip_gate_cloning` | Flag to skip gate cloning. The default is to perform gate cloning transform during setup fixing. |
| `-skip_size_down` | Flag to skip gate down sizing. The default is to perform non-critical fanout gate down sizing transform during setup fixing. |
| `-skip_buffering` | Flag to skip rebuffering and load splitting. The default is to perform rebuffering and load splitting transforms during setup fixing. |
| `-skip_buffer_removal` | Flag to skip buffer removal. The default is to perform buffer removal transform during setup fixing. |
| `-skip_last_gasp` | Flag to skip final ("last gasp") optimizations. The default is to perform greedy sizing at the end of optimization. |
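
The updated `-sequence` row says the sequence now obeys the skip flags. A minimal sketch of that interaction follows, assuming the flag-to-transform mapping described in the table rows. The names `SkipFlags`, `isSkipped`, and `parseSequence`, and the `sizedown` token spelling, are hypothetical and are not taken from the resizer source.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Illustrative copy of the move kinds; the real enum lives in Resizer.hh.
enum class MoveType { BUFFER, UNBUFFER, SWAP, SIZEUP, SIZEDOWN, CLONE, SPLIT };

// Hypothetical bundle of the repair_timing skip flags.
struct SkipFlags
{
  bool pin_swap = false;
  bool gate_cloning = false;
  bool size_down = false;
  bool buffering = false;  // covers rebuffering and load splitting
  bool buffer_removal = false;
};

// Returns true when a skip flag disables the given move.
bool isSkipped(MoveType move, const SkipFlags& skip)
{
  switch (move) {
    case MoveType::SWAP:
      return skip.pin_swap;
    case MoveType::CLONE:
      return skip.gate_cloning;
    case MoveType::SIZEDOWN:
      return skip.size_down;
    case MoveType::BUFFER:
    case MoveType::SPLIT:
      return skip.buffering;
    case MoveType::UNBUFFER:
      return skip.buffer_removal;
    case MoveType::SIZEUP:
      return false;
  }
  return false;
}

// Splits a comma-separated sequence and drops skipped moves.
// Unknown tokens are silently ignored in this sketch.
std::vector<MoveType> parseSequence(const std::string& sequence,
                                    const SkipFlags& skip)
{
  static const std::map<std::string, MoveType> tokens
      = {{"buffer", MoveType::BUFFER},
         {"unbuffer", MoveType::UNBUFFER},
         {"swap", MoveType::SWAP},
         {"sizeup", MoveType::SIZEUP},
         {"sizedown", MoveType::SIZEDOWN},  // assumed spelling
         {"clone", MoveType::CLONE},
         {"split", MoveType::SPLIT}};
  std::vector<MoveType> moves;
  std::istringstream stream(sequence);
  std::string token;
  while (std::getline(stream, token, ',')) {
    const auto found = tokens.find(token);
    if (found != tokens.end() && !isSkipped(found->second, skip)) {
      moves.push_back(found->second);
    }
  }
  return moves;
}
```

With `buffering` set, the documented default sequence would reduce to unbuffer, sizeup, swap, clone in this sketch, because both `buffer` and `split` are filtered out.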
@@ -370,7 +371,7 @@ repair_timing

Use `-recover_power` to specify the percent of paths with positive slack which
will be considered for gate resizing to save power. It is recommended that
this option be used with global routing based parasitics.

#### Instance Name Prefixes

@@ -607,9 +608,9 @@ report_wns

## Regression tests

There is a set of regression tests in `./test`. For more information, refer to this [section](../../README.md#regression-tests).

Simply run the following script:

```shell
./test/regression
14 changes: 11 additions & 3 deletions src/rsz/include/rsz/Resizer.hh
@@ -114,7 +114,8 @@ class ResizerObserver;
class CloneMove;
class BufferMove;
class SplitLoadMove;
-class SizeMove;
+class SizeDownMove;
+class SizeUpMove;
class SwapPinsMove;
class UnbufferMove;

@@ -153,6 +154,8 @@ enum class MoveType
UNBUFFER,
SWAP,
SIZE,
SIZEUP,
SIZEDOWN,
CLONE,
SPLIT
};
@@ -279,6 +282,7 @@ class Resizer : public dbStaState, public dbNetworkObserver
const std::vector<MoveType>& sequence,
bool skip_pin_swap,
bool skip_gate_cloning,
bool skip_size_down,
bool skip_buffering,
bool skip_buffer_removal,
bool skip_last_gasp);
@@ -810,9 +814,12 @@ class Resizer : public dbStaState, public dbNetworkObserver
CloneMove* clone_move = nullptr;
SplitLoadMove* split_load_move = nullptr;
BufferMove* buffer_move = nullptr;
-SizeMove* size_move = nullptr;
+SizeDownMove* size_down_move = nullptr;
+SizeUpMove* size_up_move = nullptr;
SwapPinsMove* swap_pins_move = nullptr;
UnbufferMove* unbuffer_move = nullptr;
int accepted_move_count_ = 0;
int rejected_move_count_ = 0;

friend class BufferedNet;
friend class GateCloner;
@@ -824,7 +831,8 @@ class Resizer : public dbStaState, public dbNetworkObserver
friend class SteinerTree;
friend class BaseMove;
friend class BufferMove;
-friend class SizeMove;
+friend class SizeDownMove;
+friend class SizeUpMove;
friend class SplitLoadMove;
friend class CloneMove;
friend class SwapPinsMove;
134 changes: 125 additions & 9 deletions src/rsz/src/BaseMove.cc
@@ -56,29 +56,32 @@ BaseMove::BaseMove(Resizer* resizer)
dbu_ = resizer_->dbu_;
opendp_ = resizer_->opendp_;

-all_count_ = 0;
+accepted_count_ = 0;
+rejected_count_ = 0;
all_inst_set_ = InstanceSet(db_network_);
pending_count_ = 0;
pending_inst_set_ = InstanceSet(db_network_);
}

void BaseMove::commitMoves()
{
-all_count_ += pending_count_;
+accepted_count_ += pending_count_;
pending_count_ = 0;
pending_inst_set_.clear();
}

void BaseMove::init()
{
pending_count_ = 0;
-all_count_ = 0;
+rejected_count_ = 0;
+accepted_count_ = 0;
pending_inst_set_.clear();
all_inst_set_.clear();
}

void BaseMove::undoMoves()
{
rejected_count_ += pending_count_;
pending_count_ = 0;
pending_inst_set_.clear();
}
@@ -100,12 +103,17 @@ int BaseMove::numPendingMoves() const

int BaseMove::numCommittedMoves() const
{
-return all_count_;
+return accepted_count_;
}

int BaseMove::numRejectedMoves() const
{
return rejected_count_;
}

int BaseMove::numMoves() const
{
-return all_count_ + pending_count_;
+return accepted_count_ + pending_count_;
}
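
The counter bookkeeping in this hunk can be summarized in a standalone sketch. `MoveCounters` is a hypothetical stand-in for the counting part of `BaseMove`: the instance sets are omitted, and `addMove` is assumed to add its count to the pending total, since its body is not shown in this diff.

```cpp
#include <cassert>

// Hypothetical, simplified model of the accepted/rejected/pending counters.
class MoveCounters
{
 public:
  // Assumed behavior: a new move is pending until committed or undone.
  void addMove(int count = 1)
  {
    assert(count >= 0);
    pending_count_ += count;
  }

  // Pending moves become accepted moves.
  void commitMoves()
  {
    accepted_count_ += pending_count_;
    pending_count_ = 0;
  }

  // Pending moves are discarded and recorded as rejected.
  void undoMoves()
  {
    rejected_count_ += pending_count_;
    pending_count_ = 0;
  }

  int numCommittedMoves() const { return accepted_count_; }
  int numRejectedMoves() const { return rejected_count_; }
  // Accepted plus still-pending moves, mirroring numMoves() in the diff.
  int numMoves() const { return accepted_count_ + pending_count_; }

 private:
  int pending_count_ = 0;
  int accepted_count_ = 0;
  int rejected_count_ = 0;
};
```

In this model, rejected moves never contribute to `numMoves()`, which matches the diff: only `accepted_count_` and `pending_count_` are summed there.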

void BaseMove::addMove(Instance* inst, int count)
@@ -421,8 +429,8 @@ bool BaseMove::estimatedSlackOK(const SlackEstimatorParams& params)

// Check if degraded delay & slew can be absorbed by driver pin fanouts
Net* output_net = network_->net(params.driver_pin);
-NetConnectedPinIterator* pin_iter
-= network_->connectedPinIterator(output_net);
+auto pin_iter = std::unique_ptr<NetConnectedPinIterator>(
+network_->connectedPinIterator(output_net));
while (pin_iter->hasNext()) {
const Pin* pin = pin_iter->next();
if (pin == params.driver_pin) {
@@ -456,7 +464,8 @@ bool BaseMove::estimatedSlackOK(const SlackEstimatorParams& params)
// Check side fanout paths. Side fanout paths get no delay benefit from
// buffer removal.
Net* input_net = network_->net(params.prev_driver_pin);
-pin_iter = network_->connectedPinIterator(input_net);
+pin_iter = std::unique_ptr<NetConnectedPinIterator>(
+network_->connectedPinIterator(input_net));
while (pin_iter->hasNext()) {
const Pin* side_input_pin = pin_iter->next();
if (side_input_pin == params.prev_driver_pin
@@ -510,7 +519,8 @@ bool BaseMove::estimateInputSlewImpact(Instance* instance,
bool accept_if_slack_improves)
{
GraphDelayCalc* dcalc = sta_->graphDelayCalc();
-InstancePinIterator* pin_iter = network_->pinIterator(instance);
+auto pin_iter
+= std::unique_ptr<InstancePinIterator>(network_->pinIterator(instance));
while (pin_iter->hasNext()) {
const Pin* pin = pin_iter->next();
if (!network_->direction(pin)->isOutput()) {
@@ -598,6 +608,112 @@ int BaseMove::fanout(Vertex* vertex)
return fanout;
}

LibertyCell* BaseMove::upsizeCell(LibertyPort* in_port,
Review comment (Member): I find the name confusing. I would expect `upsizeCell` to actually change the cell. Perhaps `findMinDelayEquivalentCell`?

LibertyPort* drvr_port,
const float load_cap,
const float prev_drive,
const DcalcAnalysisPt* dcalc_ap)
{
const int lib_ap = dcalc_ap->libertyIndex();
LibertyCell* cell = drvr_port->libertyCell();
LibertyCellSeq swappable_cells = resizer_->getSwappableCells(cell);
if (!swappable_cells.empty()) {
const char* in_port_name = in_port->name();
const char* drvr_port_name = drvr_port->name();
sort(swappable_cells,
[=](const LibertyCell* cell1, const LibertyCell* cell2) {
LibertyPort* port1
= cell1->findLibertyPort(drvr_port_name)->cornerPort(lib_ap);
LibertyPort* port2
= cell2->findLibertyPort(drvr_port_name)->cornerPort(lib_ap);
const float drive1 = port1->driveResistance();
const float drive2 = port2->driveResistance();
const ArcDelay intrinsic1 = port1->intrinsicDelay(this);
const ArcDelay intrinsic2 = port2->intrinsicDelay(this);
const float capacitance1 = port1->capacitance();
const float capacitance2 = port2->capacitance();
return std::tie(drive2, intrinsic1, capacitance1)
< std::tie(drive1, intrinsic2, capacitance2);
});
const float drive = drvr_port->cornerPort(lib_ap)->driveResistance();
const float delay
= resizer_->gateDelay(drvr_port, load_cap, resizer_->tgt_slew_dcalc_ap_)
+ prev_drive * in_port->cornerPort(lib_ap)->capacitance();

for (LibertyCell* swappable : swappable_cells) {
LibertyCell* swappable_corner = swappable->cornerCell(lib_ap);
LibertyPort* swappable_drvr
= swappable_corner->findLibertyPort(drvr_port_name);
LibertyPort* swappable_input
= swappable_corner->findLibertyPort(in_port_name);
const float swappable_drive = swappable_drvr->driveResistance();
// Include delay of previous driver into swappable gate.
const float swappable_delay
= resizer_->gateDelay(swappable_drvr, load_cap, dcalc_ap)
+ prev_drive * swappable_input->capacitance();
if (swappable_drive < drive && swappable_delay < delay) {
return swappable;
}
}
}
return nullptr;
};
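
The comparator in `upsizeCell` swaps the drive terms inside `std::tie`, so candidates are ordered by descending drive resistance, then ascending intrinsic delay, then ascending capacitance. A minimal sketch of the same ordering on plain data follows; the `CellInfo` struct and `sortWeakestDriveFirst` function are hypothetical and use raw floats in place of the Liberty port queries.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical flattened view of the values the comparator reads.
struct CellInfo
{
  std::string name;
  float drive_resistance;
  float intrinsic_delay;
  float capacitance;
};

// Orders cells by descending drive resistance (weakest drive first),
// breaking ties by ascending intrinsic delay, then ascending capacitance.
void sortWeakestDriveFirst(std::vector<CellInfo>& cells)
{
  std::sort(cells.begin(),
            cells.end(),
            [](const CellInfo& cell1, const CellInfo& cell2) {
              // Swapping the drive operands reverses only that key.
              return std::tie(cell2.drive_resistance,
                              cell1.intrinsic_delay,
                              cell1.capacitance)
                     < std::tie(cell1.drive_resistance,
                                cell2.intrinsic_delay,
                                cell2.capacitance);
            });
}
```

Because the list starts with the weakest drivers, a loop that returns the first candidate with both lower drive resistance and lower delay than the current cell would tend to pick the smallest qualifying step up rather than the strongest cell available.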

// Replace LEF with LEF so ports stay aligned in instance.
bool BaseMove::replaceCell(Instance* inst, const LibertyCell* replacement)
{
const char* replacement_name = replacement->name();
dbMaster* replacement_master = db_->findMaster(replacement_name);

if (replacement_master) {
dbInst* dinst = db_network_->staToDb(inst);
dbMaster* master = dinst->getMaster();
resizer_->designAreaIncr(-area(master));
Cell* replacement_cell1 = db_network_->dbToSta(replacement_master);
sta_->replaceCell(inst, replacement_cell1);
resizer_->designAreaIncr(area(replacement_master));

// Legalize the position of the instance in case it leaves the die
if (resizer_->getParasiticsSrc() == ParasiticsSrc::global_routing
|| resizer_->getParasiticsSrc() == ParasiticsSrc::detailed_routing) {
opendp_->legalCellPos(db_network_->staToDb(inst));
}
if (resizer_->haveEstimatedParasitics()) {
auto pin_iter
= std::unique_ptr<InstancePinIterator>(network_->pinIterator(inst));
while (pin_iter->hasNext()) {
const Pin* pin = pin_iter->next();
const Net* net = network_->net(pin);
odb::dbNet* db_net = nullptr;
odb::dbModNet* db_modnet = nullptr;
db_network_->staToDb(net, db_net, db_modnet);
// only work on dbnets
resizer_->invalidateParasitics(pin, db_network_->dbToSta(db_net));
// invalidateParasitics(pin, net);
}
}

return true;
}
return false;
}

vector<const Pin*> BaseMove::getFanouts(const Instance* inst)
{
vector<const Pin*> fanouts;

auto pin_iter
= std::unique_ptr<InstancePinIterator>(network_->pinIterator(inst));
while (pin_iter->hasNext()) {
const Pin* pin = pin_iter->next();
if (network_->direction(pin)->isOutput()) {
fanouts.push_back(pin);
}
}

return fanouts;
}
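
Several hunks in this file wrap a raw iterator pointer in `std::unique_ptr` so the iterator is deleted on every exit path. A minimal sketch of that ownership pattern follows, assuming a factory that returns a heap-allocated iterator the caller must free. `IntIterator`, `makeIterator`, and `collectEven` are hypothetical stand-ins, not OpenSTA classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical iterator with the hasNext()/next() shape used in the diff.
class IntIterator
{
 public:
  explicit IntIterator(const std::vector<int>& values) : values_(values)
  {
    ++live_count;
  }
  ~IntIterator() { --live_count; }
  bool hasNext() const { return index_ < values_.size(); }
  int next() { return values_[index_++]; }

  // Tracks how many iterators are alive, to make leaks observable.
  static int live_count;

 private:
  const std::vector<int>& values_;
  std::size_t index_ = 0;
};

int IntIterator::live_count = 0;

// Factory that hands ownership to the caller through a raw pointer.
IntIterator* makeIterator(const std::vector<int>& values)
{
  return new IntIterator(values);
}

// Collects the even values; the unique_ptr deletes the iterator on return.
std::vector<int> collectEven(const std::vector<int>& values)
{
  auto iter = std::unique_ptr<IntIterator>(makeIterator(values));
  std::vector<int> result;
  while (iter->hasNext()) {
    const int value = iter->next();
    if (value % 2 == 0) {
      result.push_back(value);
    }
  }
  return result;
}
```

Reassigning the `unique_ptr`, as the `estimatedSlackOK` hunk does for the second net, also deletes the previous iterator, so no explicit `delete` is needed between the two loops.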

////////////////////////////////////////////////////////////////
// namespace rsz
} // namespace rsz
13 changes: 13 additions & 0 deletions src/rsz/src/BaseMove.hh
@@ -127,6 +127,8 @@ class BaseMove : public sta::dbStaState
int hasPendingMoves(Instance* inst) const;
// Total optimizations
int numCommittedMoves() const;
// Total rejected count
int numRejectedMoves() const;
// Whether this optimization is committed or pending
int hasMoves(Instance* inst) const;
// Total accepted and pending optimizations
@@ -154,6 +156,8 @@ class BaseMove : public sta::dbStaState
InstanceSet pending_inst_set_;
int pending_count_ = 0;
int all_count_ = 0;
int rejected_count_ = 0;
int accepted_count_ = 0;

// Use actual input slews for accurate delay/slew estimation
sta::UnorderedMap<LibertyPort*, InputSlews> input_slew_map_;
@@ -213,10 +217,19 @@ class BaseMove : public sta::dbStaState
void getBufferPins(Instance* buffer, Pin*& ip, Pin*& op);
int fanout(Vertex* vertex);

LibertyCell* upsizeCell(LibertyPort* in_port,
LibertyPort* drvr_port,
float load_cap,
float prev_drive,
const DcalcAnalysisPt* dcalc_ap);
bool replaceCell(Instance* inst, const LibertyCell* replacement);

static constexpr int rebuffer_max_fanout_ = 20;
static constexpr int split_load_min_fanout_ = 8;
static constexpr int buffer_removal_max_fanout_ = 10;
static constexpr float rebuffer_relaxation_factor_ = 0.03;

vector<const Pin*> getFanouts(const Instance* inst);
};

} // namespace rsz
12 changes: 10 additions & 2 deletions src/rsz/src/BufferMove.cc
@@ -71,12 +71,20 @@ bool BufferMove::doMove(const Path* drvr_path,
rebuffer_count);
debugPrint(logger_,
RSZ,
-"moves",
+"opt_moves",
1,
-"rebuffer {} inserted {}",
+"ACCEPT buffer {} inserted {}",
network_->pathName(drvr_pin),
rebuffer_count);
addMove(drvr_inst, rebuffer_count);
} else {
debugPrint(logger_,
RSZ,
"opt_moves",
3,
"REJECT buffer {} inserted {}",
network_->pathName(drvr_pin),
rebuffer_count);
}
return rebuffer_count > 0;
}