Commit 848d1e7

Merge pull request #2866 from verilog-to-routing/v9.0.0-rebased-with-master
Add the latest changes from v9.0 branch to the master branch
2 parents 55f1490 + ae44b3b commit 848d1e7

186 files changed, +5097 -4236 lines changed

CHANGELOG.md

+58
@@ -47,6 +47,64 @@ _The following are changes which have been implemented in the VTR master branch

 ### Removed

+
+## v9.0.0 - 2024-12-23
+
+### Added
+* Support for Advanced Architectures:
+  * 3D FPGA and RAD architectures.
+  * Architectures with hard Networks-on-Chip (NoCs).
+  * Distinct horizontal and vertical channel widths and types.
+  * Diagonal routing wires and other complex wire shapes (L-shaped, T-shaped, ...).
+
+* New Benchmark Suites:
+  * Koios: A deep-learning-focused benchmark suite with various design sizes.
+  * Hermes: Benchmarks utilizing hard NoCs.
+  * TitanNew: Large benchmarks targeting the Stratix 10 architecture.
+
+* Commercial FPGA Architecture Captures:
+  * Intel’s Stratix 10 FPGA architecture.
+  * AMD’s 7-series FPGA architecture.
+
+* Parmys Logic Synthesis Flow:
+  * Better Verilog language coverage.
+  * More efficient hard block mapping.
+
+* VPR Graphics Visualizations:
+  * New interface for improved usability and underlying graphics rewritten using EZGL/GTK to allow more UI widgets.
+  * Algorithm breakpoint visualizations for placement and routing algorithm debugging.
+  * User-guided (manual) placement optimization features.
+  * Enabled a live connection from a client graphical application to the VTR engines through sockets (server mode).
+  * Interactive timing path analysis (IPA) client using server mode.
+
+* Performance Enhancements:
+  * Parallel router for faster inter-cluster routing or flat routing.
+
+* Re-clustering API to modify packing decisions during the flow.
+* Support for floorplanning and placement constraints.
+* Unified intra- and inter-cluster (flat) routing.
+* Comprehensive web-based VTR utilities and API documentation.
+
+### Changed
+* The default values of many command line options (e.g. inner_num is 0.5 instead of 1.0).
+* Changes to placement engine
+  * Smart centroid initial placement algorithm.
+  * Multiple smart placement directed moves.
+  * Reinforcement learning-based placement algorithm.
+* Changes to routing engine
+  * Faster lookahead creation.
+  * More accurate lookahead for large blocks.
+  * More efficient heap and pruning strategies.
+  * max `pres_fac` capped to avoid possible numeric issues.
+
+
+### Fixed
+* Many algorithmic and coding bugs are fixed in this release.
+
+### Removed
+* Breadth-first (non-timing-driven) router.
+* Non-linear congestion placement cost.
+
 ## v8.0.0 - 2020-03-24

 ### Added
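
The "Changed" entry above notes that some command-line defaults differ in v9.0.0, naming `inner_num` (now 0.5, previously 1.0). As a minimal sketch of how a script might pin the earlier value when comparing results across versions, the helper below builds a VPR argument list with explicit overrides. The function name, the `LEGACY_OVERRIDES` dictionary and the file names are hypothetical; only the `inner_num` values come from the changelog.

```python
# Only inner_num is named in the changelog; other changed defaults are not listed there.
LEGACY_OVERRIDES = {"inner_num": "1.0"}


def build_vpr_command(arch_file, circuit_file, overrides):
    """Return a VPR argument list with each override passed as --<option> <value>."""
    command = ["vpr", arch_file, circuit_file]
    for option, value in overrides.items():
        command += ["--" + option, str(value)]
    return command


# Placeholder file names, for illustration only.
print(build_vpr_command("example_arch.xml", "example_circuit.blif", LEGACY_OVERRIDES))
```

The resulting list can be handed to `subprocess.run`. Passing the option explicitly keeps a comparison independent of whichever default the installed version uses.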

CMakeLists.txt

+2-2
@@ -62,8 +62,8 @@ option(ODIN_SANITIZE "Enable building odin with sanitize flags" OFF)
 option(WITH_PARMYS "Enable Yosys as elaborator and parmys-plugin as partial mapper" ON)
 option(YOSYS_F4PGA_PLUGINS "Enable building and installing Yosys SystemVerilog and UHDM plugins" OFF)

-set(VTR_VERSION_MAJOR 8)
-set(VTR_VERSION_MINOR 1)
+set(VTR_VERSION_MAJOR 9)
+set(VTR_VERSION_MINOR 0)
 set(VTR_VERSION_PATCH 0)
 set(VTR_VERSION_PRERELEASE "dev")
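
To illustrate what the version bump amounts to, here is a small sketch that reads the four `VTR_VERSION_*` values from the post-change CMake lines and joins them into one string. The `MAJOR.MINOR.PATCH-PRERELEASE` layout and both helper names are assumptions made for illustration; the build system may compose its version string differently.

```python
import re

# The four set() lines as they stand after this commit.
CMAKE_SNIPPET = '''
set(VTR_VERSION_MAJOR 9)
set(VTR_VERSION_MINOR 0)
set(VTR_VERSION_PATCH 0)
set(VTR_VERSION_PRERELEASE "dev")
'''


def parse_version_fields(cmake_text):
    """Collect set(VTR_VERSION_<NAME> <value>) pairs into a dict of strings."""
    pattern = re.compile(r'set\(VTR_VERSION_(\w+)\s+"?([^")\s]*)"?\)')
    return {name: value for name, value in pattern.findall(cmake_text)}


def format_version(fields):
    """Join the fields as MAJOR.MINOR.PATCH, adding -PRERELEASE when it is set."""
    version = "{MAJOR}.{MINOR}.{PATCH}".format(**fields)
    if fields.get("PRERELEASE"):
        version += "-" + fields["PRERELEASE"]
    return version


print(format_version(parse_version_fields(CMAKE_SNIPPET)))  # prints 9.0.0-dev
```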

README.developers.md

+15-12
@@ -637,6 +637,10 @@ They can be used for FPGA architecture exploration for DL and also for tuning CA

 A typical approach to evaluating an algorithm change would be to run `koios_medium` (or `koios_medium_no_hb`) tasks from the nightly regression test (vtr_reg_nightly_test4), the `koios_large` (or `koios_large_no_hb`) and the `koios_proxy` (or `koios_proxy_no_hb`) tasks from the weekly regression test (vtr_reg_weekly). The nightly test contains smaller benchmarks, whereas the large designs are in the weekly regression test. To measure QoR for the entire benchmark suite, both nightly and weekly tests should be run and the results should be concatenated.

+Three of the `koios_large` circuits have long DSP chains and require special settings, so they are split into separate tasks as follows:
+* `bwave_like.float.large.v` and `bwave_like.fixed.large.v` are in the `vtr_reg_weekly/koios_bwave_large` task
+* `dla_like.large.v` is in the `vtr_reg_weekly/koios_dla_large` task
+
 For evaluating an algorithm change in the Odin frontend, run `koios_medium` (or `koios_medium_no_hb`) tasks from the nightly regression test (vtr_reg_nightly_test4_odin) and the `koios_large_odin` (or `koios_large_no_hb_odin`) tasks from the weekly regression test (vtr_reg_weekly).

 The `koios_medium`, `koios_large`, and `koios_proxy` regression tasks run these benchmarks with complex_dsp functionality enabled, whereas `koios_medium_no_hb`, `koios_large_no_hb` and `koios_proxy_no_hb` regression tasks run these benchmarks without complex_dsp functionality. Normally, only the `koios_medium`, `koios_large`, and `koios_proxy` tasks should be enough for QoR.
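
The split of the three long-DSP-chain circuits described in this hunk can be captured as a small lookup, which is handy when a script needs to find the task directory for a given large circuit. This is a sketch only: the function name is hypothetical, and the fallback to `vtr_reg_weekly/koios_large` for every other large circuit is an assumption based on the surrounding text.

```python
# Circuit-to-task pairs taken from the README text in this diff.
SPECIAL_LARGE_TASKS = {
    "bwave_like.float.large.v": "vtr_reg_weekly/koios_bwave_large",
    "bwave_like.fixed.large.v": "vtr_reg_weekly/koios_bwave_large",
    "dla_like.large.v": "vtr_reg_weekly/koios_dla_large",
}

# Assumed home of the remaining large circuits.
DEFAULT_LARGE_TASK = "vtr_reg_weekly/koios_large"


def task_for_large_circuit(circuit):
    """Return the weekly task that holds the given large Koios circuit."""
    return SPECIAL_LARGE_TASKS.get(circuit, DEFAULT_LARGE_TASK)


print(task_for_large_circuit("dla_like.large.v"))         # vtr_reg_weekly/koios_dla_large
print(task_for_large_circuit("some_other_large_design.v"))  # hypothetical name, falls back
```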
@@ -651,6 +655,8 @@ The following table provides details on available Koios settings in VTR flow:
 | Nightly | Medium designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | ✓ | vtr_reg_nightly_test4_odin/koios_medium | Odin | |
 | Nightly | Medium designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | | vtr_reg_nightly_test4_odin/koios_medium_no_hb | Odin | |
 | Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | ✓ | vtr_reg_weekly/koios_large | Parmys | |
+| Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | ✓ | vtr_reg_weekly/koios_dla_large | Parmys | |
+| Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | ✓ | vtr_reg_weekly/koios_bwave_large | Parmys | |
 | Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | | vtr_reg_weekly/koios_large_no_hb | Parmys | |
 | Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | ✓ | vtr_reg_weekly/koios_large_odin | Odin | |
 | Weekly | Large designs | k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml | | vtr_reg_weekly/koios_large_no_hb_odin | Odin | |
@@ -661,7 +667,15 @@ The following table provides details on available Koios settings in VTR flow:

 For more information refer to the [Koios benchmark home page](vtr_flow/benchmarks/verilog/koios/README.md).

-The following steps show a sequence of commands to run the `koios` tasks on the Koios benchmarks:
+To make running all the Koios benchmarks easier, especially with those circuits scattered across different tasks, there is an overall task list that runs all 40 Koios circuits with complex DSP functionality enabled. To disable the complex DSP, edit the file to point to the `koios_*_no_hb` tasks. Run the task list as follows:
+
+```shell
+$ ../scripts/run_vtr_task.py -l koios_task_list.txt
+
+#Several hours later... they complete
+```
+
+If you want to run a subset of the Koios benchmarks or run them without hard DSP blocks, you can run the lower-level `koios` tasks as follows:

 ```shell
 #From the VTR root
@@ -681,17 +695,6 @@ $ ../scripts/run_vtr_task.py regression_tests/vtr_reg_weekly/koios_sv_no_hb &

 #Several hours later... they complete

-#Parse the results
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_nightly_test4/koios_medium
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_large
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_proxy
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_sv
-
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_nightly_test4/koios_medium_no_hb
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_large_no_hb
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_proxy_no_hb
-$ ../scripts/python_libs/vtr/parse_vtr_task.py regression_tests/vtr_reg_weekly/koios_sv_no_hb
-
 #The run directory should now contain a summary parse_results.txt file
 $ head -5 vtr_reg_nightly_test4/koios_medium/<latest_run_dir>/parse_results.txt
arch circuit script_params vtr_flow_elapsed_time vtr_max_mem_stage vtr_max_mem error odin_synth_time max_odin_mem parmys_synth_time max_parmys_mem abc_depth abc_synth_time abc_cec_time abc_sec_time max_abc_mem ace_time max_ace_mem num_clb num_io num_memories num_mult vpr_status vpr_revision vpr_build_info vpr_compiler vpr_compiled hostname rundir max_vpr_mem num_primary_inputs num_primary_outputs num_pre_packed_nets num_pre_packed_blocks num_netlist_clocks num_post_packed_nets num_post_packed_blocks device_width device_height device_grid_tiles device_limiting_resources device_name pack_mem pack_time placed_wirelength_est place_mem place_time place_quench_time placed_CPD_est placed_setup_TNS_est placed_setup_WNS_est placed_geomean_nonvirtual_intradomain_critical_path_delay_est place_delay_matrix_lookup_time place_quench_timing_analysis_time place_quench_sta_time place_total_timing_analysis_time place_total_sta_time min_chan_width routed_wirelength min_chan_width_route_success_iteration logic_block_area_total logic_block_area_used min_chan_width_routing_area_total min_chan_width_routing_area_per_tile min_chan_width_route_time min_chan_width_total_timing_analysis_time min_chan_width_total_sta_time crit_path_routed_wirelength crit_path_route_success_iteration crit_path_total_nets_routed crit_path_total_connections_routed crit_path_total_heap_pushes crit_path_total_heap_pops critical_path_delay geomean_nonvirtual_intradomain_critical_path_delay setup_TNS setup_WNS hold_TNS hold_WNS crit_path_routing_area_total crit_path_routing_area_per_tile router_lookahead_computation_time crit_path_route_time crit_path_total_timing_analysis_time crit_path_total_sta_time
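
The header row above lists many metrics, and a comparison usually needs only a few of them. The sketch below pulls selected columns out of a `parse_results.txt`-style file. It assumes the file is tab-delimited with a single header row, which should be verified against a real run. The function name is hypothetical, and the sample rows contain placeholder values, not measured results.

```python
import csv
import io


def read_parse_results(text, columns):
    """Return one dict per data row, restricted to the requested columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{column: row[column] for column in columns} for row in reader]


# Placeholder data using a few column names from the header shown above.
SAMPLE = (
    "arch\tcircuit\tcritical_path_delay\trouted_wirelength\n"
    "example_arch.xml\texample_a.v\t5.1\t1200\n"
    "example_arch.xml\texample_b.v\t7.4\t3400\n"
)

for row in read_parse_results(SAMPLE, ["circuit", "critical_path_delay"]):
    print(row["circuit"], row["critical_path_delay"])
```

For a real run, replace `SAMPLE` with the contents of the file under `<latest_run_dir>`. Values come back as strings, so convert them before doing arithmetic.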
