-
Notifications
You must be signed in to change notification settings - Fork 415
Allow router to report successful routing even if RCV fails to resolve hold #1540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I don't want to merge this yet, as I'm hoping to get around to figuring out why the LU circuits are failing, and fix those. |
@dpbaines -- Any chance we could fail by default (and instead add a flag which allows things to report success even if there are hold violations)? Designs with hold violations are broken. Reporting that PnR is successful when the design is broken violates the contract that a successful exit code means a successfully place and routed design which will work. |
@mithro I think we should report routing success in this case; the hold violation will still be reported, so reporting routing success does not imply timing success. This matches what commercial tools do -- Quartus will report a successful routing but failed hold (or setup) timing constraints if any failed constraints are present. You can still analyze routability, look at the routing, get a timing report etc. Simply stopping makes it harder to look at the output (don't get a full analysis), and precludes programming a board (useful as hold margins usually have guardband, so you could still test to see if it works on a board. If the failure is small, it usually will work). |
vpr/src/route/route_timing.cpp
Outdated
@@ -1987,9 +1987,11 @@ bool is_iteration_complete(bool routing_is_feasible, const t_router_opts& router | |||
//in addition to routing being legal and the correct budgeting algorithm being set. | |||
|
|||
if (routing_is_feasible) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would re-order this if
Test the non-RCV (routing_budgets_algorithm != YOYO) first, so someone who is not thinking of RCV has less to read. Then the rest of the code.
vpr/src/route/route_timing.cpp
Outdated
if (router_opts.routing_budgets_algorithm != YOYO) { | ||
return true; | ||
} else if (router_opts.routing_budgets_algorithm == YOYO && (timing_info->hold_worst_negative_slack() == 0 || rcv_finished) && itry != 1) { | ||
} else if (router_opts.routing_budgets_algorithm == YOYO && (timing_info->hold_worst_negative_slack() == 0 || can_router_finish) && itry != 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the test for hold_worst_negative_slack be >= 0? Seems a lot safer than a test for precisely 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree, but worst_negative_slack is equal to zero if worst hold slack is positive. Therefore they would equate to the same thing, but I could change it in case that changes in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, lets do >= for safety and readability though.
I think a cleaner fix is possible.
I think this would be cleaner and more powerful, but it would take more testing.
|
@vaughnbetz Your suggestions make sense, I'll try to take a better look at them this week at some point. |
…s if RCV is enabled, with others
@vaughnbetz I've implemented your suggestions so far. I've eliminated the is_iteration_complete function, and instead modified the is_better_quality_routing function. If RCV is enabled the is_better_quality_routing function will prioritize routing with improved hold slack, and then setup slack. Otherwise it uses the old code path. It will also indicate that routing was successful, however it won't allow the router to break early unless either hold slack >= 0, or RCV detects it can finish early, as exists in the current code. I haven't had a lot of time to figure out why LU8 and LU32 struggle to resolve, so I wanna keep picking away at it. Let me know what you think of the current code. Or if you think is_better_quality should always prioritize hold even in the regular router (which would clean up code). I have tested this on a few key circuits and it seems to be working well. For LU8 and LU32 it returns the routing that had the lowest negative hold slack. |
vpr/src/route/route_timing.cpp
Outdated
@@ -2007,35 +2006,66 @@ bool should_setup_lower_bound_connection_delays(int itry, const t_router_opts& / | |||
static bool is_better_quality_routing(const vtr::vector<ClusterNetId, t_traceback>& best_routing, | |||
const RoutingMetrics& best_routing_metrics, | |||
const WirelengthInfo& wirelength_info, | |||
std::shared_ptr<const SetupHoldTimingInfo> timing_info) { | |||
std::shared_ptr<const SetupHoldTimingInfo> timing_info, | |||
const t_router_opts& router_opts) { | |||
if (best_routing.empty()) { | |||
return true; //First legal routing | |||
} | |||
|
|||
//Rank first based on sWNS, followed by other timing metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't seem to match the code below.
vpr/src/route/route_timing.cpp
Outdated
} else if (timing_info->setup_total_negative_slack() < best_routing_metrics.sTNS) { | ||
return false; | ||
} | ||
if (timing_info->hold_worst_negative_slack() > best_routing_metrics.hWNS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the RCV off case? If so, why is it making decisions based on hold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that check was there as a secondary quality check if the setup slacks were tied, to allow the router to pick routing with improved hold. I get that this would be unlikely though, so I guess it makes sense to remove this check for non-RCV routing.
vpr/src/route/route_timing.cpp
Outdated
} | ||
if (timing_info->hold_worst_negative_slack() > best_routing_metrics.hWNS) { | ||
return true; | ||
} else if (timing_info->hold_worst_negative_slack() > best_routing_metrics.hWNS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both the if and else if condition seem to be the same? Copy and paste error?
vpr/src/route/route_timing.cpp
Outdated
} else if (timing_info->hold_worst_negative_slack() > best_routing_metrics.hWNS) { | ||
return false; | ||
} | ||
if (timing_info->hold_total_negative_slack() > best_routing_metrics.hTNS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a comment as this looks like a repeat of the prior 5 lines of code until you look very closely (total negative slack instead of worst negative slack).
vpr/src/route/route_timing.cpp
Outdated
return true; | ||
} else if (timing_info->hold_total_negative_slack() > best_routing_metrics.hTNS) { | ||
return false; | ||
if (timing_info->setup_worst_negative_slack() > best_routing_metrics.sWNS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could reduce code duplication and make the code more readable by making helper functions: setup_comparison and hold_comparison that encapsulate the (1) test worst case slack first and then check total negative slack, and return better, worse or tied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those routines would also give you a place to comment that we don't care about positive slack (consider it a tie) and only check to see if negative slacks have improved.
|| early_reconvergence_exit_heuristic(router_opts, itry_since_last_convergence, timing_info, best_routing_metrics)) { | ||
if ((legal_convergence_count >= router_opts.max_convergence_count | ||
|| router_iteration_stats.connections_routed == 0 | ||
|| early_reconvergence_exit_heuristic(router_opts, itry_since_last_convergence, timing_info, best_routing_metrics)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could move hold_legal into the early_reconvergence_exit_heuristic routine, which I think would be easier to read by simplifying this condition and giving you a place to comment why you're not saying you're exiting early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that would be easier to read, but because the default max_convergence_count is 1, the router would stop early regardless. What I could do is to move all of these conditions into a seperate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still outstanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When RCV is selected you could change the legal_convergence_count default to some larger number (we can have dependent option defaults; take a look at the option parsing code and you should find an example). Then if the early_reconvergence_exit_heuristic understands hold and rcv it can do the right thing. The if statement won't look as long.
You could also move the whole if into a control routine; that would also work. But I think changing legal_convergence_count makes sense no matter what, as I believe you've found RCV wants a higher value for this setting. Better to do that by actually changing the value than writing if statements to work around it.
Thanks David. I have some questions and suggestions above. One is major: I think the code is backwards, and others are useful for readability. |
@@ -579,11 +588,17 @@ bool try_timing_driven_route_tmpl(const t_router_opts& router_opts, | |||
pres_fac = update_pres_fac(router_opts.initial_pres_fac); | |||
} | |||
|
|||
if (budgeting_inf.if_set()) { | |||
// If total negative slack is greater than or equal to 0, or RCV is finished, allow router to complete | |||
hold_legal = timing_info->hold_total_negative_slack() >= 0 || rcv_finished_count == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this variable doesn't really mean hold is legal (it seems to mean RCV should stop trying) then you should give it a different name.
} else if (timing_info->hold_worst_negative_slack() > best_routing_metrics.hWNS) { | ||
return false; | ||
} | ||
if (hold_comparison > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code could be shortened by checking if YOYO is enabled, and if so, checking hold and returning whatever is better, or doing nothing in a tie. Then check setup (no matter what), then wirelength.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I should have seen this! Will fix thanks.
The compare_setup and compare_hold routines have cleaned up some code. I left some more comments on ways to shorten / clarify further. |
Description
Makes it so that RCV will report a successful routing regardless of whether hold has been resolved. This is just a simple check that allows it to report a successful routing if routing is feasible, and the maximum iteration count is reached. This doesn't fix the underlying problem of why RCV fails to resolve hold for some circuits (LU8PEEng and LU32PEEng) but it's a quick patch.
I want to figure out the underlying issue and also address that in this PR so I don't want to merge quite yet.
Related Issue
Closes #1533
Motivation and Context
Make RCV more robust and versatile on multiple circuits.
How Has This Been Tested?
I tested this on LU32PEEng and LU8PEEng circuits with harsh hold constraints and RCV enabled, and verified that it reports a successful routing even though hold isn't resolved. I don't see a point in running the full qor tests as this isn't a possibly breaking change yet.
Types of changes
Checklist: