Releases: intel/cri-resource-manager
v0.6.0: update dependencies, minor bugfixes.
This release brings dependencies up to date with recent versions. It contains a small number of functional improvements and fixes, and a large number of fixes and other improvements to the end-to-end tests.
Major Changes
- build:
- update K8s dependencies to v1.22.2
- bump golang version to v1.16
- fixes and improvements:
- container cgroup directory discovery fixes
- RDT pod QoS class discovery fixes in discovery mode
- agent configuration: authorize access to adjustments
- clean up cgroup and group control abstraction
- remove SST code and pull it in from goresctrl
- end-to-end test framework
- new distributions: sles, opensuse-tumbleweed, ubuntu-21.04
- installing and debugging locally built CRI-O, containerd and runc
- configurable CRI runtime pipe and Kubernetes version
Other improvements
- testing, demos:
- end-to-end tests: a large number of end-to-end test fixes and other test infra improvements
- blockio demo: fix detecting already installed cri-resmgr
- blockio demo: always drop caches before measuring blockio speed
List of Merged PRs
- PR #731: e2e: more robust coldstart test
- PR #730: 0.6.0 release preparation: always try to enable 'SystemdCgroup = true' for tests with containerd.
- PR #728: 0.6.0 release preparation: use distinctive VM names for packaging tests.
- PR #729: 0.6.0 release preparation: add support for testing with cross-built distro binaries.
- PR #725: 0.6.0 release preparation: ubuntu-21.04 cross-build and tests.
- PR #727: 0.6.0 release preparation: centos-7 test cluster bootstrapping fixes.
- PR #726: 0.6.0 release preparation: use latest fedora image for cross-build.
- PR #724: 0.6.0 release preparation: update sid image URL.
- PR #721: e2e: add support for distro=ubuntu-21.04
- PR #722: go.mod: update to K8s deps to v1.22.2
- PR #720: Bump to golang v1.16
- PR #719: distro: force non-interactive 'apt-get install'.
- PR #717: Drop travis CI support
- PR #711: Integrate with goresctrl
- PR #715: github: run tests before golanci-lint
- PR #714: control/rdt: fix discovery of pod qos classes in discovery mode
- PR #713: e2e: support distro=opensuse-tumbleweed
- PR #712: e2e: add vm-put-pkg, install a package from host to vm
- PR #710: e2e: fix cloud-init error on distro=debian-sid
- PR #706: e2e: make sure tests have 'pidof' installed on fedora.
- PR #707: e2e: fix sysctl settings that break cilium CNI on Fedora
- PR #703: e2e: support running tests with CRI-O and cri-resmgr in NRI mode
- PR #696: e2e: wait for cloud-init to finish during VM bootstrap.
- PR #701: e2e: fix opensuse cloud-init and handle wrong containerd
- PR #699: e2e: follow HTTP redirects when fetching apt repo keys.
- PR #698: e2e: fix (EOL'd) Ubuntu Groovy image URL.
- PR #697: e2e: allow installing cri-o from distro repos.
- PR #694: scripts: add CRI-O support to kube-cgroups
- PR #656: e2e: add support for k8s=X.Y.Z to set Kubernetes version
- PR #660: docs: fix pkg urls in quick-start instructions
- PR #690: e2e: distro=sles uses official package repositories
- PR #689: e2e: enable reinstalling pretty much everything on VMs
- PR #688: e2e: add support for distro=sles
- PR #657: e2e: add an init container test
- PR #687: edited e2e-test.md
- PR #654: scripts: kube-cgroups prints cgroup entries per pod/container
- PR #685: e2e: improve isolcpus test robustness
- PR #684: e2e: clean up vm after successful reserved-resources test run
- PR #683: e2e: blockio test for k8scri=crio and k8scri=containerd
- PR #682: e2e: support CRI-O, containerd, and containerd + cri-resmgr as NRI
- PR #681: e2e: cri-resource-manager configuration is optional in test suites
- PR #680: e2e: allow templating in test suite variable files
- PR #679: e2e: add function for checking if local binary is out-of-date
- PR #678: e2e: change e2e test framework title
- PR #677: e2e: support annotations in common pod templates
- PR #676: e2e: add vm functions for dlv debugging
- PR #675: e2e: add vm-install-runc
- PR #674: e2e: add vm-put-docker-image to script API
- PR #673: e2e: enable running without govm if VM_IP is set
- PR #672: e2e: fix (remove) empty names from allowed resources printing
- PR #671: e2e: switch k8s install source in opensuse
- PR #670: e2e: fix reinstalling containerd on opensuse
- PR #669: e2e: distro install crio
- PR #668: e2e: distro: enable running fedora with cgroups=v2
- PR #667: e2e: fix error message after installing golang from tar
- PR #666: e2e: always install git-core with golang
- PR #665: e2e: run apt-get install -y with default answers to dpkg
- PR #662: e2e: Fix govm installation documentation
- PR #663: e2e: lib: Use proper locale for bc to work
- PR #661: e2e: require host dependencies jq and pv
- PR #651: Basic edits to docs
- PR #649: e2e: add goresctrl debugging support to "run.sh debug"
- PR #648: blockio demo: fix detecting already installed cri-resmgr
- PR #647: blockio demo: always drop caches before measuring blockio speed
- PR #646: cache: add a directory to findContainerDir search path
- PR #643: docs: a bunch of grammatical and stylistic fixes by DougTW.
- PR #644: e2e: add tests for topology-aware mixed CPU allocations
- PR #645: e2e: test topology-aware allocations with kernel isolcpus set
- PR #642: fixes: fixes for fedora 33
- PR #639: cgroups: add cleaned up cgroup, group control abstraction.
- PR #641: docs: update Pygments requirements
- PR #638: e2e: fix agent installation
- PR #637: cri-resmgr-agent: authorize access to adjustments.
- PR #621: e2e: fuzz topology-aware
v0.5.0: Improved policies, bug fixes, better test coverage.
This release brings general stability and correctness improvements. It merges the memory tiering policy into
the original topology-aware one, with a number of important fixes for resource accounting and assignment.
Major Changes
- policies:
- Add new podpools policy for pod-granularity workload placement
- topology-aware: merge topology-aware and memory tiering policies
- topology-aware: honor CPU reservation/reserved CPU set in configuration
- topology-aware: unify syntax for per container and pod annotated preferences
- RDT:
- split out RDT manipulation code to a self-contained package, https://github.com/intel/goresctrl
- implement operating modes (Disabled, Discovery, Full)
- add option to disable RDT monitoring
- support L2 cache allocation
- CPU allocator (used by topology-aware and podpools policies):
- detect CPU priority levels with Intel Speed Select Technology (SST)
Bug Fixes
- policies:
- topology-aware: several significant cpu and memory accounting fixes
- topology-aware: fixes in gradually relaxed memory pinning for OOM-prevention
- topology-aware: better handling of bounding and reserved resources
- topology-aware: fix assignment of CPU-less memory zones
- topology-aware: fix building sparse topology trees
- RDT:
- use root class as a fallback for missing classes
- empty class implies root class
- do forceful rdt (re-)configuration
- resource-manager:
- force full reallocation when switching policies
- run post-update hooks after reconfiguration
- save cache at startup
- config:
- handle composite structs in Module.validate()
- cache:
- (over)write cache file atomically
- testing:
- e2e: fix clearing cri-resmgr cache on uninstall
- e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
- documentation:
- fix static-pools debug logging instructions
- sample-configs: sample configuration fixes
Other Improvements
- policies:
- topology-aware: more regular annotation interpretation for CPU allocation preferences
- resource-manager:
- dump extra data for message disambiguation
- flush logs after every request/event processed
- cache:
- log name on pod/container removal
- cri-resmgr:
- increase allowed service journal log bursts
- logging:
- switch logger to use klog
- testing:
- e2e: add tests for memset expansion in topology-aware policy
- e2e: add vm-put-docker-image to the vm library
- e2e: allow user override for VM_SSH_USER over distro-ssh-user
- e2e: generalize templating any file with instantiate()
- e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
- e2e: set imagePullPolicy on every test pod
- e2e: support namespaced kubectl create from templates
- e2e: unified memory-type and cold-start annotation syntax
- e2e: update dynamic page demotion tests
- e2e: update podpools tests to pass with new cpuallocator
- e2e: update tests on pinning reserved CPUs
- benchmark: add memtier_benchmark for memcached/redis
- documentation:
- improve RDT documentation
- fix static-pools debug logging instructions
List of Merged PRs
- PR #528: build: include only cri-resmgr in binary dist tarballs
- PR #529: docs: fix static-pools debug logging instructions
- PR #530: memtier/c4pmem4/test03-coldstart: don't jump the gun
- PR #536: .github: update issue template for new releases
- PR #537: docs: minor fixes in html template customization
- PR #538: docs: use 'release branch' as the current version in versions menu
- PR #540: e2e: support namespaced kubectl create from templates
- PR #541: e2e: fix clearing cri-resmgr cache on uninstall
- PR #542: e2e: generalize templating any file with instantiate()
- PR #543: memtier: implement reserved CPUs pool
- PR #545: resource-manager: run post-update hooks after reconfiguration
- PR #546: go.mod: update to Kubernetes v1.19.4
- PR #547: scripts: helper for maintaining replace lines in go.mod
- PR #549: benchmark: add memtier_benchmark for memcached/redis
- PR #550: test/functional: prevent read/write data race in klog
- PR #553: docs: quote text containing '<' and '>' using `` in affinity docs
- PR #555: scripts/update-gh-pages: more intelligent http redirect
- PR #556: e2e: allow user override for VM_SSH_USER over distro-ssh-user
- PR #557: Improve CPU prioritization
- PR #560: e2e: add vm-put-docker-image to the vm library
- PR #561: memtier: rework building of pool tree by HW topology
- PR #562: docs: improve rdt documentation
- PR #563: memtier/pool test: fix fd leakage causing test panics with more data
- PR #566: Kata container support
- PR #567: config: handle composite structs in Module.validate()
- PR #568: control/rdt: add option to disable rdt monitoring
- PR #570: page-migrate: add cache-like container.GetPodID()
- PR #571: config: fix typo in log message
- PR #572: control/rdt: fix and simplify handling of implicit disabling
- PR #573: control/rdt: empty class implies root class
- PR #574: control/rdt: implement assignAll()
- PR #575: control/rdt: do forceful rdt (re-)configuration
- PR #576: control/rdt: correct usage of checkIdle() in configNotify()
- PR #577: control/rdt: implement operating modes
- PR #579: memtier: don't imply error by signature for functions that never fail
- PR #580: docs: use an explicit version of recommonmark
- PR #581: rdt: accept missing default classes in Discovery mode
- PR #583: docs: refer to the latest release in the installation instructions
- PR #586: rdt: use root class as a fallback to missing classes
- PR #587: e2e: set imagePullPolicy on every test pod
- PR #588: memtier: unify syntax for annotated preferences
- PR #589: memtier: fix build error introduced by improper, unrebased merging of both #524 and #543
- PR #590: memtier: more regular annotation interpretation for CPU allocation preferences
- PR #591: fix: nil pointer dereference on updateSharedAllocations(nil)
- PR #592: e2e: unified memory-type and cold-start annotation syntax
- PR #594: policy/builtin/*: fix outdated comment about PolicyName
- PR #595: docs: recognize/handle .md-links to element IDs
- PR #596: server,resource-manager: flush logs after every request/event processed
- PR #597: resource-manager: rename 'memtier' policy to 'topology-aware'
- PR #598: podpools: policy for pod-granularity workload placement
- PR #599: rdt: fix order of params passed to GetTasksInContainer()
- PR #600: test: drop stale rdt testdata
- PR #601: topology-aware: improved topology tree/node dump
- PR #602: cpuallocator: add CPU priority levels
- PR #604: e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
- PR #606: Extended detection of Intel Speed Selection Technology (SST)
- PR #607: klog: skip headers for journald by default
- PR #608: cri-resmgr: increase allowed service journal log bursts
- PR #609: fixes: topology-aware policy cpu/memory accounting fixes
- PR #610: resource-manager: force full reallocation when switching policies
- PR #612: topology-aware: force reserved/kube-system containers to the root
- PR #613: e2e: add tests for memset expansion in topology-aware policy
- PR #614: resource-manager,dump: dump extra data for message disambiguation
- PR #615: topology-aware: better and more readable logs
- PR #616: topology-aware: memory accounting and memset expansion fixes
- PR #617: resource-manager: catch containers earlier when they are gone
- PR #618: e2e: update podpools tests to pass with new cpuallocator
- PR #622: topology-aware: use normal as fallback for reserved
- PR #623: e2e: update tests on pinning reserved CPUs
- PR #624: topology-aware: use prettyMem() in log messages
- PR #625: cache: (over)write cache file atomically
- PR #626: resource-manager: save cache at startup
- PR #627: cache: log name on pod/container removal
- PR #628: rdt: support L2 cache allocation
- PR #629: topology-aware: fix filtering out nodes with insufficient memory
- PR #630: topology-aware: fix moving up memory grant
- PR #631: pkg/sysfs: clarifying comment on getCPUMapping()
- PR #632: e2e: update dynamic page demotion tests
- PR #634: sample-configs: make cri-resmgr-configmap.example.yaml usable
- PR #636: podpools: fix reflect JSON tag typo
v0.4.1: Improved documentation, end-to-end testing, bug fixes.
The documentation in this release has been overhauled with significant structural improvements and additional content over previous ones. End-to-end test coverage has been vastly extended and the test framework significantly improved. This release contains a number of important bug fixes and a few other functional improvements. Here is a non-exhaustive list of these.
Bug fixes
- agent:
- refuse to start if the NODE_NAME environment variable is not specified
- memtier policy:
- fix updating containers after shared pool changes
- honor CPU isolation opt-out preference
- honor allowed CPUs in resource discovery
- fix PMEM-only NUMA node assignment for weird topologies
- static-pools policy:
- make dynamic (re-)configuration work properly
- look for cmk isolate when parsing container command line
- re-load legacy config on config update
- only take pools configuration from legacy config
- improved sanity check on pool configuration
- fix node tainting
- cri-resmgr:
- fill in defaults for unspecified values in configuration
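The agent fix above (refusing to start without NODE_NAME) can be illustrated with a fail-fast environment check. This is a minimal sketch under assumptions, not the agent's actual code: the function `nodeName` and its injected lookup parameter are hypothetical, and treating an empty value as missing is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoNodeName is returned when NODE_NAME is missing or empty.
var errNoNodeName = errors.New("NODE_NAME environment variable is not set")

// nodeName (hypothetical helper) reads the node name through the given
// lookup function, which has the same signature as os.LookupEnv so that
// it can be replaced in tests.
func nodeName(lookup func(string) (string, bool)) (string, error) {
	name, ok := lookup("NODE_NAME")
	if !ok || name == "" {
		return "", errNoNodeName
	}
	return name, nil
}

func main() {
	name, err := nodeName(os.LookupEnv)
	if err != nil {
		// A real daemon would exit with a non-zero status here.
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("running on node", name)
}
```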
Other Improvements
- cri-resmgr:
- dump outbound requests if debugging is enabled for the 'cri/relay' source
- resource controllers:
- page-migrate: split out page-migration into a controller of its own
- e2e test framework
- vastly improved test coverage on multiple distros
- builds:
- build binary dist tarballs
Difference wrt. Rolling Master
With the exception of the PRs listed below, all others in the inclusive range #411 - #527 have been cherry-picked or back-ported from the rolling master branch to this release. The omitted PRs have been excluded due to backwards compatibility or other similar reasons:
- #525: cri-resmgr: reuse 'rdt' logger for the split out rdt package
- #490: rdt: use goresctrl
- #497: pkg/log: switch logger to use klog
- #472: e2e: add tests for static-pools
- #489: static-pools: slight refactoring and renaming
- #483: static-pools: lazier node updates
- #475: static-pools: drop all cmdline flags
v0.4.0: Improved support for Memory Tiering, Binary packages
Major changes
- 'topology-aware' policy superseded by 'memtier'
- support for cold start of containers
- support for dynamic demotion of memory
- support for limiting container top tier/DRAM memory usage (requires kernel support)
- support for externally adjusting container resource assignments
- multi-die aware resource allocation
- binary distribution with packages for popular Linux distributions and images at Docker Hub
Detailed changelog
Policies
- 'topology-aware' policy superseded by 'memtier', which
- is a forked and improved version of 'topology-aware'
- has the same basic functionality
- has a number of improvements and extra functionality:
- multi-die topology support
- multi-tier (DRAM/PMEM) memory support
- top tier/DRAM memory limiting
- container 'cold start' support: force containers initially exclusively to PMEM
- experimental dynamic page demotion: periodically move least-used pages from DRAM to PMEM
- experimental support for dynamic external adjustments to container resource assignments
- has a bunch of resource assignment/allocation fixes (which are not backported to 'topology-aware' any more)
- will in the next release replace 'topology-aware' altogether
- static-pools:
- compatibility back-ports from CMK: advertise CPUs in 'shared', 'infra' pools via CMK_CPUS_SHARED, CMK_CPUS_INFRA environment variables
- common:
- support for new Pod annotation controls:
- opt out from automatic topology hint generation:
topologyhints.cri-resource-manager.intel.com/pod: false
topologyhints.cri-resource-manager.intel.com/container.$name: false
- set DRAM/top tier memory limit:
toptierlimit.cri-resource-manager.intel.com/pod: $limit
toptierlimit.cri-resource-manager.intel.com/container.$name: $limit
- make simple container affinities always implicitly symmetric
- limit user-defined container affinity to [-1000,1000]
- re-trigger pod cgroupfs parent directory and QoS class discovery if necessary
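The pod annotation controls listed above follow one key scheme: a control name, the cri-resource-manager.intel.com domain, and either a /pod or a /container.$name suffix. The sketch below builds such keys programmatically. The helper names `podKey` and `containerKey`, the container name `c0`, and the limit value `2G` are illustrative assumptions; the release notes do not specify the format of $limit.

```go
package main

import "fmt"

// annotationDomain is the key domain shown in the release notes.
const annotationDomain = "cri-resource-manager.intel.com"

// podKey (hypothetical helper) returns the pod-wide key for a control
// such as "topologyhints" or "toptierlimit".
func podKey(control string) string {
	return control + "." + annotationDomain + "/pod"
}

// containerKey (hypothetical helper) returns the key that scopes a
// control to a single named container of the pod.
func containerKey(control, container string) string {
	return control + "." + annotationDomain + "/container." + container
}

func main() {
	// Annotations as they would appear in a pod's metadata. The
	// container name "c0" and the value "2G" are placeholders only.
	annotations := map[string]string{
		podKey("topologyhints"):            "false",
		containerKey("toptierlimit", "c0"): "2G",
	}
	for _, key := range []string{
		podKey("topologyhints"),
		containerKey("toptierlimit", "c0"),
	} {
		fmt.Printf("%s: %s\n", key, annotations[key])
	}
}
```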
Resource controllers
- RDT:
- remove controller-level class name mapping
- don't consider assignment to a default class an error if no classes are defined
- fix crash/misplaced logging of group deletion
- Block I/O:
- remove controller-level class name mapping
- don't consider assignment to a default class an error if no classes are defined
- CRI:
- properly send out generated/queued UpdateContainerResources requests
Data collectors
- cgroupstats:
- use/report container IDs
- fix hugetlb size parsing
- avx:
- switch to cilium/ebpf from iovisor/gobpf
cri-resmgr
- new command line options:
- reset cached configuration: --reset-config
- reset cached policy data: --reset-policy
- always set up node agent connection, even when running with --force-config
- allow switching policies during startup, unless started with --disable-policy-switch
Packaging
- install the sample fallback config as a fallback, not as the real config file
- use /etc/default for defaults on debian-based distros
- support Ubuntu 20.04, OpenSUSE 15.2
Documentation
- automatic generation and publishing of documentation to github pages
- a number of documentation fixes and clarifications
Testing
- end-to-end test framework added
v0.3.1: Packaging and build fixes
This v0.3.1 patch release adds packaging and build fixes on top of the v0.3.0 release.
Changes:
- feature: add command line options for resetting the active policy in the cache and allow this to happen automatically during startup if necessary
- fix: NUMA CPU-/memory-attachment detection code to work with older kernels
- fix: move from gobpf to Cilium-based AVX eBPF implementation to address build issues on older kernels
- fix: add targets for containerized cross-builds for distro packages
v0.3.0: Memory management improvements
- added memory-tiering policy: topology-aware policy with support for DRAM, PMEM (Intel Optane DC) and HBM (High Bandwidth Memory) allocation
- added blockio controller: class-based control over block I/O using the cgroupfs blkio controller
- added support for metrics collection:
- collection of raw metrics data, exporting to Prometheus
- AVX512 usage: collect per container AVX512 instruction usage, tag containers accordingly
- rdt controller improvements: disjoint partitioning, L3 and memory bandwidth monitoring, and Intel RDT metrics
- new annotations:
- assign full pod or a container to block I/O or RDT class:
rdtclass.cri-resource-manager.intel.com/container.$container: class-name
rdtclass.cri-resource-manager.intel.com/pod: class-name
blockioclass.cri-resource-manager.intel.com/container.$container: class-name
blockioclass.cri-resource-manager.intel.com/pod: class-name
- memtier policy preference for type of memory allocated to a container:
memory-type.cri-resource-manager.intel.com:
$container: [dram,][pmem,][hbm]
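The memory-type annotation above takes a comma-separated selection of dram, pmem and hbm per container. As a rough illustration of how such a value could be validated, here is a sketch with a hypothetical `parseMemoryTypes` function. It is not taken from the project; tolerating extra whitespace, upper case, trailing commas and duplicates is an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// knownMemoryTypes lists the memory types named in the release notes.
var knownMemoryTypes = map[string]bool{"dram": true, "pmem": true, "hbm": true}

// parseMemoryTypes (hypothetical helper) splits a value such as
// "dram,pmem" into its memory types, rejecting unknown names and
// empty selections. Duplicates are dropped, order is preserved.
func parseMemoryTypes(value string) ([]string, error) {
	var types []string
	seen := map[string]bool{}
	for _, field := range strings.Split(value, ",") {
		t := strings.ToLower(strings.TrimSpace(field))
		if t == "" {
			continue // tolerate stray or trailing commas
		}
		if !knownMemoryTypes[t] {
			return nil, fmt.Errorf("unknown memory type %q", t)
		}
		if !seen[t] {
			seen[t] = true
			types = append(types, t)
		}
	}
	if len(types) == 0 {
		return nil, fmt.Errorf("no memory type given in %q", value)
	}
	return types, nil
}

func main() {
	for _, value := range []string{"dram,pmem", "hbm", "dram,flash"} {
		types, err := parseMemoryTypes(value)
		if err != nil {
			fmt.Printf("%q: error: %v\n", value, err)
			continue
		}
		fmt.Printf("%q: %v\n", value, types)
	}
}
```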
v0.2.0: More generic runtime configuration handling.
Implement a more general, unified mechanism for handling runtime configuration.
v0.1.0: First publicly available release
Initial release for the project with major functionality available in alpha state.
Note: this is a pre-production alpha release. Not for production use!