Closed
1126 commits
db005a8
fix: mig mode-change #1116 (#1124)
ouyangluwei163 Jun 24, 2025
13dd7b9
feat(scheduler-role): use a scoped-down role for scheduler
Antvirf Jun 25, 2025
b617bab
add global image for chart (#1133)
calvin0327 Jun 26, 2025
8d360ad
fix: Skip admission webhook when Pod's scheduler is already assigned …
ghostloda Jun 27, 2025
bbe6008
fix: Multi-node scoring nodes are inaccurate (#1147)
ouyangluwei163 Jun 30, 2025
51fefad
fix: An error occurred while create Iluvatar pod (#1149)
ouyangluwei163 Jul 1, 2025
08a818b
feat(scheduler-roles): use built-in system:kube-scheduler and volume-…
Antvirf Jul 1, 2025
e6468dd
feat(scheduler-roles): remove unneeded update perm
Antvirf Jul 1, 2025
e50b913
Merge pull request #1066 from Shouren/feat/add-new-labels-in-release-…
hami-robot[bot] Jul 3, 2025
5634da2
Merge pull request #1152 from Antvirf/feat/minimize-role-perms
hami-robot[bot] Jul 3, 2025
5930d95
clearly list supported devices doc entry
FouoF Jun 17, 2025
f3548e1
build(deps): bump aquasecurity/trivy-action from 0.31.0 to 0.32.0 (#1…
dependabot[bot] Jul 7, 2025
b33f5b4
feat(helm): optionally disable admission webhook (#1145)
Antvirf Jul 7, 2025
407ff3b
Add node config (#1159)
wylswz Jul 7, 2025
0756fa8
Fix e2e CI (#1165)
archlitchi Jul 7, 2025
cbcd55e
fix: Add option for overwrite schedulerName (#1163)
Shouren Jul 7, 2025
493fc99
build(deps): upgrade golang to 1.24.4 (#1172)
Shouren Jul 9, 2025
e1d43b9
build(deps): upgrade golang image in ci to 1.24.4 (#1176)
Shouren Jul 9, 2025
24986ad
build(deps): Upgrade controller-runtime to 0.21.0 (#1171)
Shouren Jul 10, 2025
c77d8b3
build(deps): dump github.com/NVIDIA/nvidia-container-toolkit from 1.1…
Shouren Jul 10, 2025
2122e64
fix: using go-safecast to fix incorrect conversion of numbers (#1183)
Shouren Jul 10, 2025
4fffa54
fix: deal with security issues reported by Trivy in image (#1189)
Shouren Jul 14, 2025
0b48f3c
refactor metrics for container gpu allocation (#1169)
FouoF Jul 14, 2025
707201b
Add unit tests for Fit Function for enflame,hygon, metax, mthreads, n…
Jul 13, 2025
2ce351c
[Misc] update hami-core version (#1201)
chaunceyjiang Jul 15, 2025
b2ab9b4
update config (#1158)
FouoF Jul 15, 2025
f432548
fix: wrong Pod's UID and emtpy Pod's name in log of webhook (#1092)
Shouren Jul 15, 2025
c301f9b
Add unit tests for cambricon's Fit Function (#1198)
Wangmin362 Jul 15, 2025
5dd402f
Add Ascend unit tests, mainly for Fit Function (#1197)
Wangmin362 Jul 15, 2025
be2f28a
build(deps): bump github.com/fsnotify/fsnotify from 1.7.0 to 1.9.0 (#…
dependabot[bot] Jul 15, 2025
79cd943
Fix unnecessary repeated computation when generating pod resource requests (#1215)
litaixun Jul 16, 2025
56ffbcb
refactor: clean up code and improve maintainability (#1195)
Wangmin362 Jul 16, 2025
acc9863
Fix the log message wording when updating node annotations (#1214)
litaixun Jul 16, 2025
004fb7a
Mig supports applying for the same Mem (#1179)
zgqqiang Jul 18, 2025
1166990
build(deps): bump github.com/onsi/gomega from 1.36.1 to 1.37.0 (#1187)
dependabot[bot] Jul 18, 2025
07e72f6
build(deps): bump github.com/spf13/cobra from 1.8.1 to 1.9.1 (#1212)
dependabot[bot] Jul 18, 2025
0f305c6
build(deps): bump google.golang.org/grpc from 1.73.0 to 1.74.0 (#1220)
dependabot[bot] Jul 18, 2025
d6a485c
build(deps): bump golang.org/x/tools from 0.33.0 to 0.35.0 (#1194)
dependabot[bot] Jul 18, 2025
2d3bff3
updated dri section to combine text for better readability (#1216)
mpetason Jul 18, 2025
bd8aff5
feat: Add nvidia gpu topoloy scheduler (#1028)
fyp711 Jul 18, 2025
006ca44
build(deps): bump github.com/spf13/pflag from 1.0.6 to 1.0.7 (#1223)
dependabot[bot] Jul 21, 2025
92c9bab
Support Metax sGPU topology aware (#1193)
Kyrie336 Jul 23, 2025
78c3935
build(deps): bump github.com/onsi/gomega from 1.37.0 to 1.38.0 (#1226)
dependabot[bot] Jul 25, 2025
bd0613c
build(deps): bump google.golang.org/grpc from 1.74.0 to 1.74.2 (#1227)
dependabot[bot] Jul 25, 2025
0d42c52
add issue translate robot
wawa0210 Jul 29, 2025
ffe9719
add issue translate robot
wawa0210 Jul 29, 2025
02e9a71
perf(util/nodelock): Use clientset Patch instead of Update.
Jul 11, 2025
2320018
feat(chart): Delete update verbs from the node resources
Jul 14, 2025
d36498e
refactor: strings.SplitSeq is introduced since Go 1.24.0 and ranging …
Shouren Jul 31, 2025
c81cd33
Update hami-core and fix readme documents (#1240)
archlitchi Jul 31, 2025
3df4666
build(deps): bump gotest.tools/v3 from 3.5.1 to 3.5.2 (#1185)
dependabot[bot] Jul 31, 2025
2ccf412
Support AWS-neuron device and device-core allocation (#1238)
archlitchi Aug 1, 2025
f0b9834
Update hami-core version to fix (#1256)
archlitchi Aug 4, 2025
46af602
build(deps): bump docker/login-action from 3.4.0 to 3.5.0 (#1257)
dependabot[bot] Aug 5, 2025
2f75b4a
build(deps): bump github.com/NVIDIA/go-nvlib from 0.7.3 to 0.7.4 (#1253)
dependabot[bot] Aug 5, 2025
d2cef08
feat:Update values.yaml nodeLockExpire (#1244)
miaobyte Aug 6, 2025
a27cc13
build(deps): bump github.com/NVIDIA/nvidia-container-toolkit (#1246)
dependabot[bot] Aug 11, 2025
6005020
build(deps): bump actions/download-artifact from 4 to 5 (#1259)
dependabot[bot] Aug 11, 2025
338d823
build(deps): bump google.golang.org/protobuf from 1.36.6 to 1.36.7
dependabot[bot] Aug 7, 2025
7188c99
build(deps): bump golang.org/x/term from 0.33.0 to 0.34.0
dependabot[bot] Aug 7, 2025
5c3397a
build(deps): bump github.com/prometheus/client_golang (#1247)
dependabot[bot] Aug 11, 2025
bc5dc02
feat:mv watchAndFeedback (#1248)
miaobyte Aug 11, 2025
a0c95fc
fix: concurrent map writes error in scheduler.calcScore #1269 (#1270)
Shouren Aug 13, 2025
172b8dc
build(deps): bump actions/checkout from 4 to 5 (#1265)
dependabot[bot] Aug 13, 2025
0aa7bb8
fix: benchmarks/ai-benchmark/Dockerfile to reduce vulnerabilities (#1…
wawa0210 Aug 13, 2025
713ebf9
build(deps): bump github.com/NVIDIA/k8s-device-plugin (#1228)
dependabot[bot] Aug 15, 2025
5d8d1f8
feat: use informer/lister to reduce API server load (#1250)
miaobyte Aug 15, 2025
6e77919
build(deps): bump golang.org/x/tools from 0.35.0 to 0.36.0
dependabot[bot] Aug 12, 2025
15fca45
feat: Add option to disable device plugin at values.yaml. (#1274)
FouoF Aug 18, 2025
6900c58
perf(util/nodelock): use clientset Patch instead of Update (#1252)
Aug 18, 2025
f47cb05
fix: release dangling node lock (#1271)
peachest Aug 19, 2025
2ca155d
feat: Add an action of 'Close stale issue and PRs' in github worklfow…
Shouren Aug 21, 2025
dceded0
fix: fix err which retrieved incorrect NUMA node information issue #1…
abstractmj Aug 21, 2025
6d86882
build(deps): bump google.golang.org/protobuf from 1.36.7 to 1.36.8 (#…
dependabot[bot] Aug 25, 2025
c6465d7
Welcome fyp711 to become a HAMi member (#1288)
wawa0210 Aug 25, 2025
af1bc68
fix(security): resolve 868 869 870 871 issues in Code scanning (#1280)
Shouren Aug 25, 2025
9c5e939
Add values readme (#1267)
clcc2019 Aug 25, 2025
4f25ec1
build(deps): bump google.golang.org/grpc from 1.74.2 to 1.75.0 (#1285)
dependabot[bot] Aug 26, 2025
48ccab2
Support Metax sGPU device health check (#1295)
Kyrie336 Aug 26, 2025
61c6cbe
Optimize pkg/util.go and distribute logics to corresponding logics (…
archlitchi Sep 1, 2025
d959165
fix: fix golangci-lint error (#1319)
DSFans2014 Sep 1, 2025
9a10cbb
Fix: device allocation missing containers with no device request (#1299)
FouoF Sep 2, 2025
0448891
cleanup: Clear and correct ascend device name (#1315)
FouoF Sep 2, 2025
bb9c316
docs: update ascend910b-support docs (#1321)
DSFans2014 Sep 3, 2025
15bfa67
build(deps): bump actions/stale from 9 to 10 (#1325)
dependabot[bot] Sep 9, 2025
93d45fc
build(deps): bump aquasecurity/trivy-action from 0.32.0 to 0.33.1 (#1…
dependabot[bot] Sep 10, 2025
00ea0c7
build(deps): bump github.com/stretchr/testify from 1.10.0 to 1.11.1 (…
dependabot[bot] Sep 10, 2025
66284e4
refactor: Remove annotation in Devices interfaces. (#1343)
Shouren Sep 12, 2025
143c99c
bugfix: Nvidia card abnormal pod will still continue to schedule (#1336)
zgqqiang Sep 12, 2025
c8dcf6a
build(deps): bump golang.org/x/net from 0.43.0 to 0.44.0 (#1335)
dependabot[bot] Sep 12, 2025
8a5a6f4
doc: support kunlun vxpu (#1338)
ouyangluwei163 Sep 12, 2025
54106c1
feat: Support Enflame GCU DevicePlugin (#1040) (#1334)
zhaikangqi331 Sep 12, 2025
bf5b858
feat: vxpu sopport #1016 (#1337)
ouyangluwei163 Sep 12, 2025
d77d0eb
Aggregated Scheduling Failure Events (#1333)
Wangmin362 Sep 12, 2025
0fa3c8f
FIx CI, add 910B4-1 template and fix vGPUmonitor metrics error (#1345)
archlitchi Sep 14, 2025
6e739ee
build(deps): bump github.com/spf13/cobra from 1.9.1 to 1.10.1 (#1347)
dependabot[bot] Sep 15, 2025
aa4989f
fix: update int8Slice to uint8Slice for better type clarity and consi…
yxxhero Sep 22, 2025
470e284
add httpTargetPort to values.yaml (#1356)
flpanbin Sep 23, 2025
7668eec
feat: update the `Ascend910` scheduling policy (#1344)
DSFans2014 Sep 23, 2025
2430e90
Update kunlunxin documents (#1366)
archlitchi Sep 26, 2025
52c086c
feat: add NVIDIA Resourcequota (#1359)
FouoF Sep 26, 2025
f95b709
feat(nvidia): default gpucores=100 when memory is exclusive and cores…
xrwang8 Sep 26, 2025
3324e9a
update chart version and hami-core (#1369)
archlitchi Sep 26, 2025
db13f8d
update release ci (#1373)
archlitchi Sep 26, 2025
f70bbfa
build(deps): bump github.com/NVIDIA/go-nvlib from 0.8.0 to 0.8.1 (#1375)
dependabot[bot] Sep 29, 2025
b0a39f6
build(deps): bump cpina/github-action-push-to-another-repository (#1374)
dependabot[bot] Sep 30, 2025
b7b4e1e
fix failed rolebinding (#1380)
FouoF Sep 30, 2025
f0dd439
build(deps): bump github/codeql-action from 3 to 4 (#1387)
dependabot[bot] Oct 9, 2025
14aecf6
fix e2e ginkgo version mismatch (#1391)
FouoF Oct 10, 2025
03b8cb0
add reviwer (#1390)
FouoF Oct 10, 2025
7ceb558
fix: check pod nil in `ReleaseNodeLock` (#1372)
DSFans2014 Oct 11, 2025
de9d26e
build(deps): bump golang.org/x/term from 0.35.0 to 0.36.0 (#1396)
dependabot[bot] Oct 11, 2025
324e612
add podInfos in DeviceUsage to enhance scheduling decision (#1362)
Kyrie336 Oct 11, 2025
97ff486
fix: upgrade nvidia-mig-parted to v0.12.2 to solve security issues (#…
Shouren Oct 11, 2025
026b13a
Fix: Remove usage of quotas when pod is terminated and delete quota w…
luohua13 Oct 13, 2025
801b2cf
build(deps): bump golang.org/x/tools from 0.36.0 to 0.38.0 (#1392)
dependabot[bot] Oct 13, 2025
3849fe8
update (#1403)
archlitchi Oct 13, 2025
f13abdd
build(deps): bump github.com/spf13/pflag from 1.0.9 to 1.0.10 (#1406)
dependabot[bot] Oct 14, 2025
8fbec81
Improved support for Iulvatar GPUs (#1399)
qiangwei1983 Oct 14, 2025
78226e0
fix: scheduler flaky test (#1402)
FouoF Oct 15, 2025
e3364d6
docs: add Japanese README (#1412)
eltociear Oct 21, 2025
9d6f379
build(deps): bump google.golang.org/grpc from 1.75.0 to 1.76.0 (#1407)
dependabot[bot] Oct 21, 2025
dd553dd
make node stable (#1411)
FouoF Oct 21, 2025
9cbaff5
docs: Add AI assistance transparency policy in CONTRIBUTING.md (#1416)
Shouren Oct 22, 2025
c5ac9eb
build(deps): bump github.com/ccoveille/go-safecast from 1.6.1 to 1.7.…
dependabot[bot] Oct 22, 2025
cfc37bb
feat: optimize nodelock with per-node fine-grained locking (#1395)
xrwang8 Oct 22, 2025
b27a042
build(deps): bump github.com/NVIDIA/k8s-device-plugin (#1426)
dependabot[bot] Oct 23, 2025
12e6567
refactor: simplify configuration validation logic (#1440)
yxxhero Oct 28, 2025
c28de75
feat: add GeneratePodNamespaceName function and corresponding tests (…
yxxhero Oct 29, 2025
5321f50
build(deps): bump actions/upload-artifact from 4 to 5 (#1442)
dependabot[bot] Oct 29, 2025
97088f5
build(deps): bump actions/download-artifact from 5 to 6 (#1443)
dependabot[bot] Oct 29, 2025
d205a9f
Improve: Replace `StrategicMergePatchType` by `MergePatchType` (#1431)
luohua13 Oct 29, 2025
97c0b17
Support Metax sGPU App Class (#1436)
Kyrie336 Oct 29, 2025
71509c8
optimize schedule failure event (#1444)
Kyrie336 Oct 30, 2025
e2835e8
add_amd_whole_card
archlitchi Oct 31, 2025
8da0cca
update
archlitchi Oct 31, 2025
c7071fd
Fix: After removing the device plugin from the gpu node, it can still…
luohua13 Nov 3, 2025
b96fdbc
Fix concurrent map iteration and map write fatal error. (#1452)
litaixun Nov 3, 2025
66e6f5a
docs: update monitor (#1465)
daixiang0 Nov 4, 2025
1b7ca59
fix https://github.com/Project-HAMi/HAMi/pull/1460 (#1466)
wawa0210 Nov 4, 2025
fa57226
fix: fix typos (#1434)
DSFans2014 Nov 4, 2025
8bae006
build(deps): bump docker/login-action from 3.5.0 to 3.6.0 (#1383)
dependabot[bot] Nov 4, 2025
c646d81
Merge branch 'master' of github.com:archlitchi/HAMi
archlitchi Nov 5, 2025
a9b3078
update
archlitchi Nov 5, 2025
5a15ab3
update
archlitchi Nov 5, 2025
8801868
Add AMDMi300x device monitoring & fix CI (#1472)
archlitchi Nov 5, 2025
8cc1431
build(deps): bump github.com/ccoveille/go-safecast from 1.8.0 to 1.8.…
dependabot[bot] Nov 5, 2025
1b69f97
upgrade promethues api metric label,add devicetype label (#1419)
zhegemingzimeibanquan Nov 5, 2025
1edf64f
Merge branch 'Project-HAMi:master' into master
archlitchi Nov 5, 2025
de28227
update
archlitchi Nov 5, 2025
ddfda03
Fix CI error of the PR #1470, #1326, #1033 (#1473)
archlitchi Nov 6, 2025
04efe02
build(deps): bump actions/setup-go from 5 to 6 (#1326)
dependabot[bot] Nov 6, 2025
1044184
build(deps): bump golangci/golangci-lint-action from 7 to 8 (#1033)
dependabot[bot] Nov 6, 2025
048ff52
update
archlitchi Nov 6, 2025
5d10fc2
update
archlitchi Nov 6, 2025
ef89f4a
Merge branch 'Project-HAMi:master' into master
archlitchi Nov 6, 2025
ce0a434
Update HAMi-core to fix vllm-related issues: #1381 # 1461 (#1478)
archlitchi Nov 6, 2025
ea2e16c
Fix concurrent map read write fatal error. (#1476)
litaixun Nov 6, 2025
dc80df4
update
archlitchi Nov 7, 2025
059dd7c
Merge branch 'master' of github.com:archlitchi/HAMi
archlitchi Nov 7, 2025
ea3dcd4
Merge branch 'Project-HAMi:master' into master
archlitchi Nov 7, 2025
990692c
Merge branch 'master' of github.com:archlitchi/HAMi
archlitchi Nov 7, 2025
eb17b31
update
archlitchi Nov 7, 2025
87361df
update
archlitchi Nov 7, 2025
5cbb5bb
Release v2.7.1 (#1480)
archlitchi Nov 7, 2025
5125fd6
fix ci
archlitchi Nov 7, 2025
1adf148
Merge branch 'master' of github.com:Project-HAMi/HAMi
archlitchi Nov 7, 2025
569c405
remove redundancy model (#1487)
FouoF Nov 12, 2025
b5d21f9
build(deps): bump golangci/golangci-lint-action from 8 to 9 (#1484)
dependabot[bot] Nov 12, 2025
5e38dc3
build(deps): bump golang.org/x/term from 0.36.0 to 0.37.0 (#1489)
dependabot[bot] Nov 13, 2025
ca03b9f
build(deps): bump golang.org/x/net from 0.46.0 to 0.47.0 (#1490)
dependabot[bot] Nov 17, 2025
d4942e9
docs: add docs for using ascend device in volcano (#1488)
DSFans2014 Nov 17, 2025
d0d0211
update memory allocation of Ascend910B4-1 vnpu template (#1498)
peachest Nov 18, 2025
3c31dbe
build(deps): bump golang.org/x/tools from 0.38.0 to 0.39.0 (#1492)
dependabot[bot] Nov 18, 2025
fb8bd18
build(deps): bump google.golang.org/grpc from 1.76.0 to 1.77.0 (#1501)
dependabot[bot] Nov 20, 2025
6ef76b0
refactor: add minor optimizations and fix some logs about `ResourceQu…
DSFans2014 Nov 20, 2025
7bb4584
build(deps): bump actions/checkout from 5 to 6 (#1503)
dependabot[bot] Nov 21, 2025
0c4d3cf
Add Shouren to Approvers (#1504)
Shouren Nov 26, 2025
ae991f4
Refine Node Register logic (#1499)
archlitchi Nov 26, 2025
b0b8913
nvidia model duplicate NVIDIA prefix (#1139)
lengrongfu Nov 27, 2025
e54a605
Helm improvements and fixes (#1507)
clarifai-fmarceau Dec 2, 2025
97e761e
remote unused oci folder
archlitchi Dec 3, 2025
ffedfef
remote unused oci folder (#1514)
archlitchi Dec 4, 2025
c5ccf61
update go version
archlitchi Dec 4, 2025
5d6e24f
Merge branch 'Project-HAMi:master' into master
archlitchi Dec 4, 2025
71a9f87
update go version
archlitchi Dec 4, 2025
d396eb3
Merge branch 'master' of github.com:archlitchi/HAMi
archlitchi Dec 4, 2025
1988ace
update go version
archlitchi Dec 4, 2025
145af6e
update go version
archlitchi Dec 4, 2025
a5411ab
Merge pull request #1517 from archlitchi/master
Shouren Dec 5, 2025
8f2926a
promote dynamic mig (#1519)
FouoF Dec 8, 2025
86e63a1
feat: update helm chart version to 2.7.1 (#1493)
ooninoo Dec 8, 2025
26c45e6
build(deps): bump github.com/NVIDIA/nvidia-container-toolkit from 1.1…
dependabot[bot] Dec 8, 2025
f12c580
build(deps): bump github.com/NVIDIA/k8s-device-plugin (#1515)
dependabot[bot] Dec 8, 2025
329dab8
build(deps): bump github.com/onsi/ginkgo/v2 from 2.23.4 to 2.27.2 (#1…
dependabot[bot] Dec 8, 2025
9b6d73e
build(deps): bump github.com/prometheus/client_golang from 1.23.0 to …
dependabot[bot] Dec 8, 2025
bba3ecd
fix mig schedule (#1518)
FouoF Dec 8, 2025
3b492b1
build(deps): bump github.com/NVIDIA/go-nvlib from 0.8.1 to 0.9.0 (#1523)
dependabot[bot] Dec 9, 2025
52eb7ff
build(deps): bump github.com/onsi/gomega from 1.38.2 to 1.38.3 (#1524)
dependabot[bot] Dec 9, 2025
1e0e74f
build(deps): bump github.com/spf13/cobra from 1.10.1 to 1.10.2 (#1525)
dependabot[bot] Dec 9, 2025
43e4529
build(deps): bump github.com/onsi/ginkgo/v2 from 2.27.2 to 2.27.3 (#1…
dependabot[bot] Dec 9, 2025
9a5bd8b
fix: add pod tombstone handling when missing delete events
Dec 5, 2025
ad0dfb0
update: maintain consistent coding style
Dec 5, 2025
f82aa75
fix: retrun after recevied unknown object type on pod delete
Dec 8, 2025
e9140fb
feat: wait until resource state is synchronized
DSFans2014 Dec 3, 2025
3ac26fc
build(deps): bump golang.org/x/net from 0.47.0 to 0.48.0 (#1529)
dependabot[bot] Dec 10, 2025
29e3c7b
build(deps): bump golang.org/x/tools from 0.39.0 to 0.40.0 (#1531)
dependabot[bot] Dec 11, 2025
286298d
build(deps): bump tags.cncf.io/container-device-interface (#1533)
dependabot[bot] Dec 11, 2025
1f2c1ad
Optimize the issue where configmap hami-device-plugin is overwritten …
leolingg Dec 15, 2025
e355016
build(deps): bump actions/upload-artifact from 5 to 6 (#1539)
dependabot[bot] Dec 16, 2025
cb878a6
build(deps): bump actions/download-artifact from 6 to 7 (#1540)
dependabot[bot] Dec 16, 2025
4990db0
build: Upgrade base image to nvidia/cuda:12.6.3-base-ubuntu22.04 and …
Shouren Dec 16, 2025
1b39ea9
Sync with k8s-device-plugin from nvidia v0.18.0 (#1541)
archlitchi Dec 18, 2025
4673563
fix: change field name from nvidianodeSelector to nvidiaNodeSelector …
kaiiyvwu Dec 22, 2025
32ec9a5
Add a LockNode mechanism for Iluvatar devices (#1547)
qiangwei1983 Dec 23, 2025
554099a
Add CDI-related configurations (#1552)
archlitchi Dec 23, 2025
98aa4ba
remove repeated var (#1550)
googs1025 Dec 23, 2025
519f3ed
feat: install `mock-device-plugin` (#1534)
DSFans2014 Dec 23, 2025
538e55d
build(deps): bump google.golang.org/grpc from 1.77.0 to 1.78.0 (#1556)
dependabot[bot] Dec 24, 2025
0158a50
fix device plugin nodeConfiguration template (#1558)
FouoF Dec 25, 2025
8aeb9a3
fix: resolve multi-node GPU count tracking in nvidia device plugin (#…
pepesi Dec 26, 2025
5b79f41
Remove xpu-device-plugin argument (#1554)
121812 Dec 26, 2025
079fa14
Add CDI-related documents in config (#1559)
archlitchi Dec 29, 2025
f3ec017
fix(nodelock): improve parsing of lock annotation (#1560)
Mirza-Samad-Ahmed-Baig Dec 30, 2025
b4ca659
feat: add dra installation option in helm chart (#1542)
FouoF Jan 4, 2026
86307cf
clean deprecated metrics (#1562)
FouoF Jan 4, 2026
436d3da
docs(DCU document update): Improve the relevant documentation for DCU…
zqwangadv Jan 5, 2026
8a36663
Replace slack with discord in README (#1567)
archlitchi Jan 5, 2026
b81987a
Fix kunlunxin vxpu issue (#1569)
archlitchi Jan 7, 2026
6b57f36
feat: update mock-device-plugin version (#1570)
DSFans2014 Jan 9, 2026
ea8dbf3
Implementing leader-follower high availability using leader-election …
peachest Jan 9, 2026
2d7bc49
update Makefile for helm package (#1574)
archlitchi Jan 9, 2026
64a2088
Watch and hot reload the updated certificate (#1573)
peachest Jan 12, 2026
0dbb92f
add DSFans2014 to reviewers (#1575)
DSFans2014 Jan 12, 2026
2acd27d
Added concurrent tests for ListNodes() (#1576)
IsQiao Jan 12, 2026
91cb29e
fix: resource quota cannot limit the resource request properly when …
DSFans2014 Jan 14, 2026
8a06be3
update dra version (#1580)
FouoF Jan 14, 2026
01d3704
Add modernize check (#1578)
dongjiang1989 Jan 14, 2026
3caf6ce
build(deps): bump github.com/onsi/ginkgo/v2 from 2.27.3 to 2.27.5 (#1…
dependabot[bot] Jan 15, 2026
bc2a918
build(deps): bump golang.org/x/term from 0.38.0 to 0.39.0 (#1586)
dependabot[bot] Jan 15, 2026
5bdd498
Fix discord invitation expired (#1582)
archlitchi Jan 15, 2026
fdbfd12
build(deps): bump golang.org/x/net from 0.48.0 to 0.49.0 (#1587)
dependabot[bot] Jan 16, 2026
e3ef605
build(deps): bump github.com/onsi/gomega from 1.38.3 to 1.39.0 (#1585)
dependabot[bot] Jan 16, 2026
037776a
build(deps): bump github.com/sirupsen/logrus from 1.9.3 to 1.9.4 (#1592)
dependabot[bot] Jan 16, 2026
f1d9805
build(deps): bump golang.org/x/tools from 0.40.0 to 0.41.0 (#1588)
dependabot[bot] Jan 16, 2026
9a034b9
clean code (#1593)
DSFans2014 Jan 16, 2026
e4825c3
feat: Add hami_build_info metrics and version print (#1581)
dongjiang1989 Jan 19, 2026
0f199da
fix: clean up metrics data when node is deleted (#1597)
xiyichan Jan 19, 2026
ef0d42b
Release v2.8.0 (#1598)
archlitchi Jan 19, 2026
42199d7
add promtheus podmonitor
dongjiang1989 Jan 19, 2026
33 changes: 33 additions & 0 deletions .github/ISSUE_TEMPLATE/bug-report.md
@@ -0,0 +1,33 @@
---
name: Bug Report
about: Report a bug encountered while using HAMi.
labels: kind/bug

---

<!-- Please use this template while reporting a bug and provide as much info as possible. Not doing so may result in your bug not being addressed in a timely manner. Thanks!
-->

**What happened**:

**What you expected to happen**:

**How to reproduce it (as minimally and precisely as possible)**:

**Anything else we need to know?**:

- The output of `nvidia-smi -a` on your host
- Your docker or containerd configuration file (e.g: `/etc/docker/daemon.json`)
- The hami-device-plugin container logs
- The hami-scheduler container logs
- The kubelet logs on the node (e.g: `sudo journalctl -r -u kubelet`)
- Any relevant kernel output lines from `dmesg`

**Environment**:
- HAMi version:
- nvidia driver or other AI device driver version:
- Docker version from `docker version`
- Docker command, image and tag used
- Kernel version from `uname -a`
- Others:

4 changes: 4 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,4 @@
contact_links:
  - name: FAQ
    url: https://github.com/Project-HAMi/HAMi/issues/646
    about: Frequently asked questions and common solutions.
22 changes: 22 additions & 0 deletions .github/ISSUE_TEMPLATE/enhancement.md
@@ -0,0 +1,22 @@
---
name: Enhancement Request
about: Suggest an enhancement to the project
labels: kind/feature

---
<!-- Please only use this template for submitting enhancement requests -->

**What would you like to be added**:

**What type of PR is this?**

/kind feature

**What this PR does / why we need it**:

**Which issue(s) this PR fixes**:
Fixes #

**Special notes for your reviewer**:

**Does this PR introduce a user-facing change?**:
31 changes: 31 additions & 0 deletions .github/ISSUE_TEMPLATE/good-first.md
@@ -0,0 +1,31 @@
---
name: Good First Issue
about: Publish a good first issue
labels: good first issue

---

<!-- Please use this template while publishing a good first issue. Thanks!
-->

**Task description**:

**Solution**:

**Who can join or take the task**:

The good first issue is intended for `first-time contributors` to get started on their contributor journey.

After a contributor has successfully completed 1-2 good first issues,
they should be ready to move on to `help wanted` items, saving the remaining `good first issue` items for other new contributors.

**How to join or take the task**:

Just reply on the issue with the message `/assign` in a separate line.

Then, the issue will be assigned to you.

**How to ask for help**:

If you need help or have questions, please feel free to ask on this issue.
The issue author or other members of the community will guide you through the contribution process.
15 changes: 15 additions & 0 deletions .github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,15 @@
---
name: Question
about: Question relating to HAMi.
labels: kind/question

---

**Please provide an in-depth description of the question you have**:

**What do you think about this question?**:

**Environment**:
- HAMi version:
- Kubernetes version:
- Others:
22 changes: 22 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,22 @@
**What type of PR is this?**

<!--
Add one of the following kinds:
/kind bug
/kind cleanup
/kind deprecation
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake
-->

**What this PR does / why we need it**:

**Which issue(s) this PR fixes**:
Fixes #

**Special notes for your reviewer**:

**Does this PR introduce a user-facing change?**:
21 changes: 21 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,21 @@

# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates


version: 2
updates:
  - package-ecosystem: "gomod"
    directory: "/"
    schedule:
      interval: "daily"
  - package-ecosystem: "docker"
    directory: "/docker"
    schedule:
      interval: "daily"
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "daily"
10 changes: 10 additions & 0 deletions .github/labeler.yml
@@ -0,0 +1,10 @@
"kind/bug":
  - '^[Ff]ix(\(.*\))?:?.*'
"kind/cleanup":
  - '^[Cc]hore(\(.*\))?:?.*'
"kind/documentation":
  - '^[Dd]ocs?(\(.*\))?:?.*'
"kind/enhancement":
  - '^[Rr]efactor(\(.*\))?:?.*'
"kind/feature":
  - '^[Ff]eat(\(.*\))?:?.*'
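
To see which PR titles these patterns would pick up, the mapping can be replayed outside CI. The following is a minimal Python sketch, not the labeler action's own implementation: the function name `labels_for_title` is hypothetical, and it assumes each label is applied when its single pattern matches the PR title (the workflow that consumes this file sets `include-title: 1` and `include-body: 0`).

```python
import re

# Patterns copied verbatim from .github/labeler.yml (one regex per label).
LABEL_PATTERNS = {
    "kind/bug": r"^[Ff]ix(\(.*\))?:?.*",
    "kind/cleanup": r"^[Cc]hore(\(.*\))?:?.*",
    "kind/documentation": r"^[Dd]ocs?(\(.*\))?:?.*",
    "kind/enhancement": r"^[Rr]efactor(\(.*\))?:?.*",
    "kind/feature": r"^[Ff]eat(\(.*\))?:?.*",
}


def labels_for_title(title):
    """Return the labels whose pattern matches the given PR title.

    Hypothetical helper for illustration; the real action may combine
    title and body or apply additional rules.
    """
    return [
        label
        for label, pattern in LABEL_PATTERNS.items()
        if re.search(pattern, title)
    ]


if __name__ == "__main__":
    for title in (
        "fix: mig mode-change #1116 (#1124)",
        "feat(helm): optionally disable admission webhook (#1145)",
        "build(deps): upgrade golang to 1.24.4 (#1172)",
    ):
        print(title, "->", labels_for_title(title))
```

Because every pattern is anchored only at the start of the title, a title such as `build(deps): ...` or `bugfix: ...` receives none of these labels, while `doc: ...` and `docs: ...` both map to `kind/documentation`.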
34 changes: 34 additions & 0 deletions .github/release.yml
@@ -0,0 +1,34 @@
# .github/release.yml
changelog:
  exclude:
    labels:
      - ignore-for-release
      - github-actions
    authors:
      - dependabot[bot]
  categories:
    - title: ✨ New Features
      labels:
        - feature
        - design
        - enhancement
        - kind/feature
        - kind/design
        - kind/enhancement
    - title: 🐛 Bug Fixes
      labels:
        - bug
        - kind/bug
    - title: 📚 Documentation
      labels:
        - documentation
        - kind/documentation
    - title: ⬆️ Dependencies
      labels:
        - dependencies
    - title: 💥 Breaking Changes
      labels:
        - breaking-change
    - title: 🔨 Other Changes
      labels:
        - "*"
22 changes: 22 additions & 0 deletions .github/workflows/auto-label-pr.yaml
@@ -0,0 +1,22 @@
name: "PR Labeler"
on:
  pull_request_target:
    types: [opened, edited]

permissions:
  issues: write
  pull-requests: write
  contents: read

jobs:
  labeling:
    runs-on: ubuntu-latest
    steps:
      - uses: github/[email protected]
        with:
          configuration-path: .github/labeler.yml
          enable-versioned-regex: 0
          sync-labels: 1
          include-title: 1
          include-body: 0
          repo-token: ${{ github.token }}
152 changes: 152 additions & 0 deletions .github/workflows/auto-release.yaml
@@ -0,0 +1,152 @@
name: 1.Release

# Triggered by a tag event:
#   1 build the release image, push images to ghcr.io, and build image like: ghcr.io/xxxx:v1.0.0
#   2 package the chart package, update index.yaml and commit to '/charts' of branch 'github_pages' ( PR with label pr/release/robot_update_githubpage )
#   3 create changelog file, commit to '/changelogs' of branch 'github_pages' for githubPage ( PR with label pr/release/robot_update_githubpage )
#   4 commit '/docs' to '/docs' of branch 'github_pages'
#   5 create a release, attached with the chart and changelog

on:
  release:
    types:
      - prereleased
  push:
    tags:
      - v[0-9]+.[0-9]+.[0-9]+
      - v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
  workflow_dispatch:
    inputs:
      tag:
        description: 'Tag'
        required: true
        default: v1.0.0
permissions: write-all

jobs:
  pre-release:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    outputs:
      tag: ${{ env.RUN_TAG }}
    steps:
      - name: Check Version
        run: |
          TagVersion="${{ env.RUN_TAG }}"
          RecordVersion=` cat VERSION | tr -d ' ' | tr -d '\n' `
          if [ "$RecordVersion" != "$TagVersion" ] ; then
            echo "error, version $RecordVersion of '/VERSION' is different with Tag $TagVersion "
            exit 1
          fi
          # no need to check the chart version; CI updates it automatically to match /VERSION
      # generate release notes for the new release branch
      - name: Generate pre release notes
        if: startsWith(github.ref, 'refs/tags/')
        uses: softprops/action-gh-release@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          prerelease: true
          generate_release_notes: true

  ensure-tag:
    runs-on: ubuntu-latest
    needs: pre-release
    outputs:
      tag: ${{ env.RUN_TAG }}
    steps:
      - name: Free disk space
        # https://github.com/actions/virtual-environments/issues/709
        run: |
          echo "=========original CI disk space"
          df -h
          sudo rm -rf "/usr/local/share/boost"
          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
          echo "=========after clean up, the left CI disk space"
          df -h
      - name: Get Ref
        id: get_ref
        run: |
          if ${{ github.event_name == 'workflow_dispatch' }} ; then
            echo "call by self workflow_dispatch"
            echo "RUN_TAG=${{ github.event.inputs.tag }}" >> $GITHUB_ENV
            YBranchName=` grep -Eo "v[0-9]+\.[0-9]+" <<< "${{ github.event.inputs.tag }}" `
          elif ${{ github.event_name == 'push' }} ; then
            echo "call by push tag"
            echo "RUN_TAG=${GITHUB_REF##*/}" >> $GITHUB_ENV
            YBranchName=` grep -Eo "v[0-9]+\.[0-9]+" <<< "${GITHUB_REF##*/}" `
          else
            echo "unexpected event: ${{ github.event_name }}"
            exit 1
          fi
          echo "YBranchName=${YBranchName}"
          if [ -n "$YBranchName" ] ; then
            echo "RUN_YBranchName=${YBranchName}" >> $GITHUB_ENV
          else
            echo "error, failed to find y branch"
            exit 1
          fi
      - name: Checkout
        uses: actions/checkout@v6
        with:
          ref: ${{ env.RUN_TAG }}
      # if the branch already exists, the action will not fail; it outputs created=false
      - name: release Y branch
        uses: peterjgrainger/[email protected]
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          branch: 'release-${{ env.RUN_YBranchName }}'
          sha: '${{ github.sha }}'


  release-image-hamicore:
    needs: ensure-tag
    uses: ./.github/workflows/call-release-image-hamicore.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
    secrets: inherit

  release-image:
    needs: [ensure-tag,release-image-hamicore]
    uses: ./.github/workflows/call-release-image.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
    secrets: inherit

  release-chart:
    needs: [ensure-tag]
    uses: ./.github/workflows/call-release-helm.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
      submit: true
    secrets: inherit

  # generate changelog and update to github releases pages
  release-notes:
    needs: [release-chart,release-image]
    uses: ./.github/workflows/call-release-notes.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}

  release-website:
    needs: [release-notes]
    uses: ./.github/workflows/call-release-website.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
  # execute a full e2e test when hami release
  release-e2e:
    needs: [release-notes]
    uses: ./.github/workflows/call-e2e.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
      type: "release"

  # execute a compatibility test when hami release
  release-e2e-upgrade:
    needs: [release-notes]
    uses: ./.github/workflows/call-e2e-upgrade.yaml
    with:
      ref: ${{ needs.ensure-tag.outputs.tag }}
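
The `Get Ref` step in the workflow above derives two values from the pushed ref: the tag itself (`${GITHUB_REF##*/}`) and a "Y branch" name taken from the tag's `vMAJOR.MINOR` prefix, which later becomes `release-vMAJOR.MINOR`. The following is a minimal Python sketch of that derivation for checking tags locally. The function name `release_branch_for_ref` is hypothetical, and the sketch uses only the first regex match, whereas `grep -Eo` would print every match.

```python
import re

# Same pattern the workflow passes to `grep -Eo`.
Y_BRANCH_PATTERN = re.compile(r"v[0-9]+\.[0-9]+")


def release_branch_for_ref(ref):
    """Return (tag, release_branch) for a ref such as 'refs/tags/v2.8.0'.

    Hypothetical helper for illustration only. Raises ValueError when no
    vMAJOR.MINOR prefix is found, mirroring the step's `exit 1` branch.
    """
    # Equivalent of the shell expansion ${GITHUB_REF##*/}:
    # drop everything up to and including the last slash.
    tag = ref.rsplit("/", 1)[-1]
    match = Y_BRANCH_PATTERN.search(tag)
    if match is None:
        raise ValueError(f"failed to find y branch in {tag!r}")
    return tag, "release-" + match.group(0)


if __name__ == "__main__":
    for ref in ("refs/tags/v2.8.0", "refs/tags/v2.7.1-rc1"):
        print(ref, "->", release_branch_for_ref(ref))
```

Both a final tag and its release candidates resolve to the same branch, so `v2.7.1` and `v2.7.1-rc1` would each target `release-v2.7`.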
