Open
54 commits
6859e69
use SV39 membox to replace memory buffers implementation
Humber-186 Jul 16, 2024
6084e5a
Update gitignore
Humber-186 Jul 19, 2024
873fa8d
Merge branch 'main' into membox
Humber-186 Jul 19, 2024
1eb4724
Bug fixed & use log.c
Humber-186 Jul 21, 2024
6abe977
Let log.c use systemc timestamp
Humber-186 Jul 21, 2024
cde0cfb
Initial support for the task feature
Humber-186 Jul 22, 2024
aff6776
cout "\n" changed to endl
Humber-186 Jul 24, 2024
4dbefa1
task feature complete
Humber-186 Jul 24, 2024
6da3b4f
cmd arg --help text
Humber-186 Jul 29, 2024
5133468
Remove unused header files, add TODO comments
Humber-186 Feb 15, 2025
166440c
Have the SM notify CTAsche through a callback when a warp completes, so the exact finish time of each warp/block is known
Humber-186 Feb 16, 2025
ac84abe
add warp&block finish callback
Humber-186 Feb 20, 2025
75bb2ad
code cleanup: CTA scheduler
Humber-186 Feb 21, 2025
467a12a
code cleanup: SM m_hw_warps[warp_id] -> hwarp
Humber-186 Feb 21, 2025
4046df2
Add hardware block_slot & bug fixed
Humber-186 Mar 3, 2025
623fc2d
Support more instructions, pass the MNIST test case
Humber-186 Mar 4, 2025
b537f57
Add instruction support, change command-line arguments
Humber-186 Mar 6, 2025
eafb3d0
Package the runtime library: initial implementation of the external API
Humber-186 Mar 17, 2025
24683a0
code format
Humber-186 Mar 22, 2025
e3fea3a
export shared lib for driver ok
Humber-186 Mar 23, 2025
9808a10
code reorganize
Humber-186 Mar 30, 2025
12e88ef
bug fixed (regext s3/d vmadd)
Humber-186 Apr 3, 2025
e944be6
use new membox (SV39)
Humber-186 Apr 3, 2025
0c4407b
fix missing git submodule path
Humber-186 Apr 3, 2025
b665655
update dependence membox
Humber-186 Apr 6, 2025
970cc1d
rodinia gaussian passed
Humber-186 Apr 9, 2025
ace84f3
bug fixed & add instruction
Humber-186 Apr 23, 2025
46eb521
update submodule
Humber-186 May 11, 2025
53bc682
Additional decode signals
Humber-186 May 11, 2025
82a1377
bug fixed in stimuli
Humber-186 May 11, 2025
9079058
LSU, ramulator2, spdlog
Humber-186 May 11, 2025
6dc6af4
bug fixed
Humber-186 May 11, 2025
47e3f9e
fix missing dep lib path
Humber-186 May 11, 2025
9c7b7b3
update dep ramulator
Humber-186 May 11, 2025
770fffa
use spdlog remove log.c
Humber-186 May 11, 2025
d1eb539
Performance optimization
Humber-186 May 14, 2025
277905a
Handle scalar LSU explicitly & log improved
Humber-186 May 20, 2025
c628d59
subcore firstly ok
Humber-186 May 20, 2025
598f3f3
Bug fixed & small changes
Humber-186 Jun 10, 2025
506de3c
fix cmake install problems
Humber-186 Jun 20, 2025
d2f9c33
support vcd trace dump
Humber-186 Jul 22, 2025
6136bde
small changes: log, code format, anti-copy ......
Humber-186 Aug 23, 2025
87c38c7
fix: limit each warp to at most one instruction in the OPC, preventing out-of-order issue within a single warp at the OPC exit
Humber-186 Aug 23, 2025
72b08ae
add: custom print instruction used only in this simulator, to aid debugging
Humber-186 Aug 23, 2025
42e0321
add: cmake accepts -DSYSTEMC_HOME as well
Humber-186 Sep 13, 2025
77a7405
add: README update
Humber-186 Sep 13, 2025
eea7529
fix: cmdarg --sim-time-max same as RTLSIM
Humber-186 Sep 13, 2025
d4766e5
feat: allow disable DDR timing
Humber-186 Oct 4, 2025
6682abe
fix(build): enable assert for RelWithDebInfo
Humber-186 Oct 4, 2025
335ba24
fix: add VFPU instr impl VFMAX, VFMIN
Humber-186 Oct 5, 2025
650b974
feat: icache
Humber-186 Nov 15, 2025
3549bf2
feat: thread index in hardware CSR, support CSRRSV instr
Humber-186 Dec 8, 2025
b3ec5d4
cyclesim: expose static model params via API
Humber-186 Mar 4, 2026
aa08cd0
cyclesim: compute CSR_PDS by resident (sm_id, cta_slot)
Humber-186 Mar 4, 2026
192 changes: 192 additions & 0 deletions .clang-format
@@ -0,0 +1,192 @@
---
Language: Cpp
# BasedOnStyle: WebKit
AccessModifierOffset: -4
AlignAfterOpenBracket: BlockIndent
AlignArrayOfStructures: None
AlignConsecutiveMacros: None
AlignConsecutiveAssignments: false
AlignConsecutiveBitFields: None
AlignConsecutiveDeclarations: None
AlignEscapedNewlines: Right
AlignOperands: DontAlign
AlignTrailingComments: true
AllowAllArgumentsOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortEnumsOnASingleLine: true
AllowShortBlocksOnASingleLine: Empty
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: All
AllowShortLambdasOnASingleLine: All
AllowShortIfStatementsOnASingleLine: Never
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: MultiLine
AttributeMacros:
  - __capability
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
  AfterCaseLabel: false
  AfterClass: false
  AfterControlStatement: Never
  AfterEnum: false
  AfterFunction: false
  AfterNamespace: false
  AfterObjCDeclaration: false
  AfterStruct: false
  AfterUnion: false
  AfterExternBlock: false
  BeforeCatch: false
  BeforeElse: false
  BeforeLambdaBody: false
  BeforeWhile: false
  IndentBraces: false
  SplitEmptyFunction: true
  SplitEmptyRecord: true
  SplitEmptyNamespace: true
BreakBeforeBinaryOperators: All
BreakBeforeConceptDeclarations: true
BreakBeforeBraces: Custom
BreakBeforeInheritanceComma: false
BreakInheritanceList: BeforeColon
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeComma
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 100
CommentPragmas: '^ IWYU pragma:'
QualifierAlignment: Leave
CompactNamespaces: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: false
DeriveLineEnding: true
DerivePointerAlignment: false
DisableFormat: false
EmptyLineAfterAccessModifier: Never
EmptyLineBeforeAccessModifier: LogicalBlock
ExperimentalAutoDetectBinPacking: false
PackConstructorInitializers: BinPack
BasedOnStyle: ''
ConstructorInitializerAllOnOneLineOrOnePerLine: false
AllowAllConstructorInitializersOnNextLine: true
FixNamespaceComments: false
ForEachMacros:
  - foreach
  - Q_FOREACH
  - BOOST_FOREACH
IfMacros:
  - KJ_IF_MAYBE
IncludeBlocks: Preserve
IncludeCategories:
  - Regex: '^"(llvm|llvm-c|clang|clang-c)/'
    Priority: 2
    SortPriority: 0
    CaseSensitive: false
  - Regex: '^(<|"(gtest|gmock|isl|json)/)'
    Priority: 3
    SortPriority: 0
    CaseSensitive: false
  - Regex: '.*'
    Priority: 1
    SortPriority: 0
    CaseSensitive: false
IncludeIsMainRegex: '(Test)?$'
IncludeIsMainSourceRegex: ''
IndentAccessModifiers: false
IndentCaseLabels: false
IndentCaseBlocks: false
IndentGotoLabels: true
IndentPPDirectives: None
IndentExternBlock: AfterExternBlock
IndentRequires: false
IndentWidth: 4
IndentWrappedFunctionNames: false
InsertTrailingCommas: None
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: true
LambdaBodyIndentation: Signature
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: Inner
ObjCBinPackProtocolList: Auto
ObjCBlockIndentWidth: 4
ObjCBreakBeforeNestedBlockParam: true
ObjCSpaceAfterProperty: true
ObjCSpaceBeforeProtocolList: true
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakOpenParenthesis: 0
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 100000
PenaltyIndentedWhitespace: 0
PointerAlignment: Left
PPIndentWidth: -1
ReferenceAlignment: Pointer
ReflowComments: true
RemoveBracesLLVM: false
SeparateDefinitionBlocks: Leave
ShortNamespaceLines: 1
SortIncludes: CaseSensitive
SortJavaStaticImport: Before
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeCaseColon: false
SpaceBeforeCpp11BracedList: true
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: ControlStatements
SpaceBeforeParensOptions:
  AfterControlStatements: true
  AfterForeachMacros: true
  AfterFunctionDefinitionName: false
  AfterFunctionDeclarationName: false
  AfterIfMacros: true
  AfterOverloadedOperator: false
  BeforeNonEmptyParentheses: false
SpaceAroundPointerQualifiers: Default
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyBlock: true
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: Never
SpacesInConditionalStatement: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInLineCommentPrefix:
  Minimum: 1
  Maximum: -1
SpacesInParentheses: false
SpacesInSquareBrackets: false
SpaceBeforeSquareBrackets: false
BitFieldColonSpacing: Both
Standard: Latest
StatementAttributeLikeMacros:
  - Q_EMIT
StatementMacros:
  - Q_UNUSED
  - QT_REQUIRE_VERSION
TabWidth: 8
UseCRLF: false
UseTab: Never
WhitespaceSensitiveMacros:
  - STRINGIZE
  - PP_STRINGIZE
  - BOOST_PP_STRINGIZE
  - NS_SWIFT_NAME
  - CF_SWIFT_NAME
...
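
To give a feel for the layout this configuration aims at, here is a hand-formatted sketch (not actual clang-format output) that follows a few of the options above: `IndentWidth: 4`, `AccessModifierOffset: -4`, `PointerAlignment: Left`, `BreakConstructorInitializers: BeforeComma`, `BreakBeforeBinaryOperators: All`, and `AllowShortFunctionsOnASingleLine: All`. The `WarpCounter` class and its members are hypothetical and do not come from the repository.

```cpp
#include <cstdint>

// Hypothetical class, used only to illustrate the formatting options.
class WarpCounter {
public: // AccessModifierOffset: -4 cancels the 4-space member indent
    WarpCounter(std::uint32_t num_warps, std::uint32_t threads_per_warp)
        : m_num_warps(num_warps) // BreakConstructorInitializers: BeforeComma
        , m_threads_per_warp(threads_per_warp) {}

    // PointerAlignment: Left attaches '*' to the type.
    void add_finished(const std::uint32_t* finished) {
        if (finished != nullptr) { // SpaceBeforeParens: ControlStatements
            m_finished += *finished;
        }
    }

    // BreakBeforeBinaryOperators: All moves the operator to the continuation line.
    std::uint64_t total_threads() const {
        return static_cast<std::uint64_t>(m_num_warps)
            * static_cast<std::uint64_t>(m_threads_per_warp);
    }

    // AllowShortFunctionsOnASingleLine: All keeps short bodies on one line.
    std::uint32_t finished() const { return m_finished; }

private:
    std::uint32_t m_num_warps;
    std::uint32_t m_threads_per_warp;
    std::uint32_t m_finished = 0;
};
```

Running clang-format itself on real sources remains the authoritative check; this sketch only shows the intended shape.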

3 changes: 3 additions & 0 deletions .gitignore
@@ -39,3 +39,6 @@
.cache/
compile_commands.json
log.txt
+.vscode/
+build/
+.xmake/
9 changes: 6 additions & 3 deletions .gitmodules
@@ -1,3 +1,6 @@
-[submodule "submodules/membox2"]
-	path = submodules/membox2
-	url = git@github.com:liuxd17thu/membox2.git
+[submodule "dependencies/membox"]
+	path = dependencies/membox
+	url = https://github.com/THU-DSP-LAB/membox.git
+[submodule "dependencies/ramulator2"]
+	path = dependencies/ramulator2
+	url = https://github.com/THU-DSP-LAB/ramulator2.git
124 changes: 124 additions & 0 deletions CMakeLists.txt
@@ -0,0 +1,124 @@
cmake_minimum_required(VERSION 3.15)
project(VentusSimulator
    VERSION 1.0
    LANGUAGES C CXX)

#
# global settings
#

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
# use ccache
set(CMAKE_C_COMPILER_LAUNCHER "ccache")
set(CMAKE_CXX_COMPILER_LAUNCHER "ccache")
# use -O2 & enable assert for release builds (the LTO setting below is commented out)
set(CMAKE_CXX_FLAGS_RELEASE "-O2")
set(CMAKE_C_FLAGS_RELEASE "-O2")
add_compile_options(
    $<$<AND:$<CONFIG:Release>,$<NOT:$<CXX_COMPILER_ID:MSVC>>>:-UNDEBUG>
    $<$<AND:$<CONFIG:RelWithDebInfo>,$<NOT:$<CXX_COMPILER_ID:MSVC>>>:-UNDEBUG>
    $<$<AND:$<CONFIG:Release>,$<CXX_COMPILER_ID:MSVC>>:/U"NDEBUG">
    $<$<AND:$<CONFIG:RelWithDebInfo>,$<CXX_COMPILER_ID:MSVC>>:/U"NDEBUG">
)
# set(CMAKE_INTERPROCEDURAL_OPTIMIZATION_RELEASE TRUE)
# use spdlog trace level for debug build
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
    add_compile_definitions(SPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_TRACE)
else()
    # non-Debug builds currently use the trace level as well
    add_compile_definitions(SPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_TRACE)
endif()
# export compile_commands.json
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

#
# dependency packages
#

find_package(fmt REQUIRED)
find_package(spdlog REQUIRED)

# Get the SystemC path from the user
if(DEFINED SYSTEMC_HOME AND NOT SYSTEMC_HOME STREQUAL "")
    # use cmake cmd arg -DSYSTEMC_HOME
elseif(DEFINED ENV{SYSTEMC_HOME} AND NOT "$ENV{SYSTEMC_HOME}" STREQUAL "")
    # use environment variable
    set(SYSTEMC_HOME "$ENV{SYSTEMC_HOME}" CACHE PATH "SystemC install root (from environment)" FORCE)
else() # Error
    # message() concatenates its arguments, so keep the trailing space
    message(FATAL_ERROR "SYSTEMC_HOME not set. "
        "Please use -DSYSTEMC_HOME=/path/to/systemc or set SYSTEMC_HOME in the environment.")
endif()
file(TO_CMAKE_PATH "${SYSTEMC_HOME}" SYSTEMC_HOME) # convert to a CMake-style path (forward slashes)
if(NOT EXISTS "${SYSTEMC_HOME}")
    message(FATAL_ERROR "SYSTEMC_HOME points to a non-existent directory: ${SYSTEMC_HOME}")
endif()
set(SYSTEMC_INCLUDE_DIR ${SYSTEMC_HOME}/include)
set(SYSTEMC_LIBRARY_DIR ${SYSTEMC_HOME}/lib-linux64)

#
# include submodules
#
add_subdirectory(dependencies/membox)
add_subdirectory(dependencies/ramulator2)

#
# target libVentusCycleSim.so
#
file(GLOB_RECURSE SM_SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/src/sm/*.cpp")
add_library(VentusCycleSim SHARED
    ${SM_SOURCES}
    src/context_model.cpp
    src/CTA_Scheduler.cpp
    src/parameters.cpp
    src/top_gpgpu.cpp
    src/ramulator.cpp
    src/ventus_cyclesim.cpp
    src/ventus_cyclesim_impl.cpp
)
target_include_directories(VentusCycleSim
    PRIVATE ${SYSTEMC_INCLUDE_DIR}
)
target_link_directories(VentusCycleSim
    PRIVATE ${SYSTEMC_LIBRARY_DIR}
)
target_link_libraries(VentusCycleSim
    PRIVATE SV
    PRIVATE ramulator
    PRIVATE systemc
    PRIVATE spdlog
    PRIVATE fmt
)
set_target_properties(VentusCycleSim PROPERTIES
    # libramulator.so will be installed together, use $ORIGIN
    INSTALL_RPATH "${SYSTEMC_LIBRARY_DIR};$ORIGIN"
    PUBLIC_HEADER "${CMAKE_CURRENT_SOURCE_DIR}/src/ventus_cyclesim.h"
)
target_compile_definitions(VentusCycleSim PRIVATE
    VENTUS_CYCLESIM_PROJECT_DIR="${CMAKE_CURRENT_SOURCE_DIR}"
)
install(TARGETS VentusCycleSim
    LIBRARY DESTINATION lib
    RUNTIME DESTINATION bin
    ARCHIVE DESTINATION lib
    PUBLIC_HEADER DESTINATION include
)

#
# target main exe
#
add_executable(main
    src/cmdarg.cpp
    src/parse_kernel.cpp
    src/main.cpp
    src/task.cpp
)
target_link_libraries(main
    PRIVATE VentusCycleSim
    PRIVATE spdlog
    PRIVATE fmt
)
target_compile_definitions(main PRIVATE
    VENTUS_CYCLESIM_PROJECT_DIR="\"${CMAKE_CURRENT_SOURCE_DIR}\""
)
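
Both targets receive `VENTUS_CYCLESIM_PROJECT_DIR` as a compile definition. Assuming the value reaches the compiler as a quoted string literal (which the escaped quotes on the `main` target suggest), C++ code could consume it roughly as sketched below. `project_path` is a hypothetical helper, not a function from the repository, and the `#ifndef` fallback exists only so the sketch builds outside the CMake build.

```cpp
#include <string>

// Fallback so the sketch also compiles standalone, where the macro is not
// supplied on the compiler command line (assumption for illustration only).
#ifndef VENTUS_CYCLESIM_PROJECT_DIR
#define VENTUS_CYCLESIM_PROJECT_DIR "."
#endif

// Hypothetical helper: joins the project directory with a relative path.
inline std::string project_path(const std::string& relative) {
    const std::string root = VENTUS_CYCLESIM_PROJECT_DIR; // expands to a string literal
    if (relative.empty()) {
        return root;
    }
    if (!root.empty() && root.back() == '/') {
        return root + relative; // avoid a doubled separator
    }
    return root + "/" + relative;
}
```

A call such as `project_path("testcase/mnist/conv_0.metadata")` would then resolve a test case relative to the source tree rather than the current working directory.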

33 changes: 32 additions & 1 deletion Makefile
@@ -62,7 +62,38 @@ $(BINARY): $(OBJS)
@mkdir -p $(dir $@)
@$(LD) -o $@ $(OBJS) $(LDFLAGS)

-RUNFLAGS = --numkernel 2 vecadd adv_vecadd/vecadd4x4.metadata adv_vecadd/vecadd4x4.data matadd multiblock/matadd/matadd.metadata multiblock/matadd/matadd.data --numcycle 1000000
+RUNFLAGS = --task name=MNIST \
+    --kernel taskid=0,name=MNIST_0,metafile=testcase/mnist/conv_0.metadata,datafile=testcase/mnist/conv_0.data \
+    --kernel taskid=0,name=MNIST_1,metafile=testcase/mnist/conv_1.metadata,datafile=testcase/mnist/conv_1.data \
+    --kernel taskid=0,name=MNIST_2,metafile=testcase/mnist/conv_2.metadata,datafile=testcase/mnist/conv_2.data \
+    --numcycle 30000000
+RUNFLAGS = --task name=BFS \
+    --kernel taskid=0,name=BFS_1_0,metafile=testcase/gpu-rodinia/bfs/BFS_1_0.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_1_0.data \
+    --kernel taskid=0,name=BFS_2_0,metafile=testcase/gpu-rodinia/bfs/BFS_2_0.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_2_0.data \
+    --kernel taskid=0,name=BFS_1_1,metafile=testcase/gpu-rodinia/bfs/BFS_1_1.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_1_1.data \
+    --kernel taskid=0,name=BFS_2_1,metafile=testcase/gpu-rodinia/bfs/BFS_2_1.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_2_1.data \
+    --kernel taskid=0,name=BFS_1_2,metafile=testcase/gpu-rodinia/bfs/BFS_1_2.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_1_2.data \
+    --kernel taskid=0,name=BFS_2_2,metafile=testcase/gpu-rodinia/bfs/BFS_2_2.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_2_2.data \
+    --kernel taskid=0,name=BFS_1_3,metafile=testcase/gpu-rodinia/bfs/BFS_1_3.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_1_3.data \
+    --kernel taskid=0,name=BFS_2_3,metafile=testcase/gpu-rodinia/bfs/BFS_2_3.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_2_3.data \
+    --kernel taskid=0,name=BFS_1_4,metafile=testcase/gpu-rodinia/bfs/BFS_1_4.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_1_4.data \
+    --kernel taskid=0,name=BFS_2_4,metafile=testcase/gpu-rodinia/bfs/BFS_2_4.metadata,datafile=testcase/gpu-rodinia/bfs/BFS_2_4.data \
+    --task name=GAUSSIAN \
+    --kernel taskid=1,name=FAN1_0,metafile=testcase/adv_gaussian/Fan1_0.metadata,datafile=testcase/adv_gaussian/Fan1_0.data \
+    --kernel taskid=1,name=FAN2_0,metafile=testcase/adv_gaussian/Fan2_0.metadata,datafile=testcase/adv_gaussian/Fan2_0.data \
+    --kernel taskid=1,name=FAN1_1,metafile=testcase/adv_gaussian/Fan1_1.metadata,datafile=testcase/adv_gaussian/Fan1_1.data \
+    --kernel taskid=1,name=FAN2_1,metafile=testcase/adv_gaussian/Fan2_1.metadata,datafile=testcase/adv_gaussian/Fan2_1.data \
+    --kernel taskid=1,name=FAN1_2,metafile=testcase/adv_gaussian/Fan1_2.metadata,datafile=testcase/adv_gaussian/Fan1_2.data \
+    --kernel taskid=1,name=FAN2_2,metafile=testcase/adv_gaussian/Fan2_2.metadata,datafile=testcase/adv_gaussian/Fan2_2.data \
+    --kernel name=vecadd,metafile=testcase/adv_vecadd/vecadd_1b4w4t.metadata,datafile=testcase/adv_vecadd/vecadd_1b4w4t.data \
+    --numcycle 150000
+##RUNFLAGS = --task name=TNAME \
+## --kernel taskid=0,name=vecadd,metafile=testcase/adv_vecadd/vecadd4x4.metadata,datafile=testcase/adv_vecadd/vecadd4x4.data \
+## --kernel taskid=0,name=matadd,metafile=testcase/multiblock/matadd/matadd.metadata,datafile=testcase/multiblock/matadd/matadd.data \
+## --numcycle 500000
+#RUNFLAGS = --kernel name=vecadd,metafile=testcase/adv_vecadd/vecadd_1b4w4t.metadata,datafile=testcase/adv_vecadd/vecadd_1b4w4t.data \
+# --numcycle 100000
+#RUNFLAGS = --numkernel 2 matadd multiblock/matadd/matadd.metadata multiblock/matadd/matadd.data vecadd adv_vecadd/vecadd4x4.metadata adv_vecadd/vecadd4x4.data --numcycle 500000
RUNFLAGS_tensor484 = --numkernel 1 tensor tensor/wmma484fp32/wmma484fp32.metadata tensor/wmma484fp32/wmma484fp32.data --numcycle 2000
RUNFLAGS_tensor242 = --numkernel 1 tensor tensor/wmma424fp32/wmma424fp32.metadata tensor/wmma424fp32/wmma424fp32.data --numcycle 2000
RUNFLAGS_vectormma = --numkernel 1 tensor tensor/wmma484fp32/vectormma.metadata tensor/wmma484fp32/vectormma.data --numcycle 6000
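
The newer `RUNFLAGS` definitions in this hunk pass each kernel as one comma-separated list of `key=value` fields (`taskid`, `name`, `metafile`, `datafile`). As a minimal sketch of how such a value could be split, assuming no field value contains a comma, consider the following. `parse_key_values` is a hypothetical name and is not the parser in `src/cmdarg.cpp`.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical parser for one option value such as
//   "taskid=0,name=MNIST_0,metafile=a.metadata,datafile=a.data"
// It splits on ',' and then on the first '=' of each field.
inline std::map<std::string, std::string> parse_key_values(const std::string& arg) {
    std::map<std::string, std::string> fields;
    std::istringstream stream(arg);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (item.empty()) {
            continue; // tolerate stray commas
        }
        const std::string::size_type eq = item.find('=');
        if (eq == std::string::npos || eq == 0) {
            throw std::invalid_argument("expected key=value, got: " + item);
        }
        fields[item.substr(0, eq)] = item.substr(eq + 1);
    }
    return fields;
}
```

Splitting on the first `=` only means a value may itself contain `=`; a repeated key simply overwrites the earlier value in this sketch.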