Releases: ComputeCanada/magic_castle
Magic Castle 7.0
Changed
- Established a distinction in variables between puppetmaster and mgmt1 - allowing puppetmaster role to be assigned to another instance.
- Bumped minimum requirement of terraform to 0.12.24 (issue #77)
- Numerous doc fixes
- Added a section on related projects in README.md
- [Azure] Updated Azure infrastructure.tf to use Azure provider 2.0.0 (issue #62)
- [cloud-init] Set puppet-agent and puppet-server version to 6.13 and 6.9
- [cloud-init] Renamed cloud-init YAML files to `puppetagent.yaml` and `puppetmaster.yaml`
- [OpenStack] Fixed volume size computation regression introduced in commit c09ea17
- [puppet] Defined selinux context for /scratch as home_root_dir
- [puppet] Defined selinux context for /project as home_root_dir
- [puppet] Improved cuda facts to avoid issue when html index is incomplete
- [puppet] Updated package names in gpu module and facts
- [puppet] Generalized gpu module cuda repo link composition
- [puppet] Replaced package by ensure_packages for kernel-devel in gpu
- [puppet] Updated version of puppet-jupyterhub to v3.3.0
- [puppet] Improved FreeIPA client installation waiting conditions to limit failure
- [puppet] Disabled root jobs in slurm.conf
- [puppet] Added nosuid to client nfs mount options
- [puppet] Activated root_squash for all nfs exports
- [puppet] Changed URL for the source of `cc-tmpfs_mount.so`
- [puppet] Updated derdanne/nfs version in Puppetfile
- [puppet] Made profile::base a requirement of profile::nfs::server
- [puppet] Defined servername param for apache in reverse_proxy
Added
- [Azure] Added variable to allow usage of an existing resource group based on its name (issue #72)
- [cloud-init] Enabled puppet agent postrun command in cloud-init
- [puppet] mgmt1 volume formatting is now handled by the `profile::nfs::server` class
- [puppet] Added logic to define, mount and format nfs shared volumes with lvm
- [puppet] Added README.md
- [puppet] Fixed regression introduced in 630a04
- [puppet] Added possibility to manage jail activation and ignore_ip with hieradata
- [puppet] Added profile classes for JupyterHub: `profile::jupyterhub::node` and `profile::jupyterhub::hub`
- [puppet] Added variable to allow definition of lmod default modules with hieradata
- [puppet] Configured lmod default modules to start with gcc and openmpi
- [puppet] Added ability to receive last puppet run output by email through puppet postrun script
- [puppet] Added support for NVIDIA GRID vGPU
- [puppet] Added class `profile::base::azure` for logic specific to Azure
Removed
- [cloud-init] Removed volume formatting, partitioning and mounting from mgmt cloud-init
- [puppet] Removed condition on gpu count in nvidia_driver_vers
- [puppet] Removed mkhomedir from FreeIPA client installation parameters
Magic Castle 6.4
Changed
- [cloud-init] Hardcoded the version of puppet-agent (6.13.0) and puppetserver (6.9.1). This fixes an issue with fetching files from HTTPS source introduced in Puppet 6.14.0.
Magic Castle 6.3
Added
- Added random_uuid to generate a random consul token
- [travis] Added init and validation of dns/gcloud module
- [cloud-init] Added bootstrap installation of consul-server in cloud-init
- [puppet] Added slurmd restart when node is missing from sinfo
- [puppet] Introduced class `profile::workshop::mgmt`. The class unzips an archive in all guest accounts
- [puppet] Added profile::workshop::mgmt to mgmt in site.pp
- [puppet] Defined consul::service for slurmd, slurmctld, slurmdbd, rsyslog, cvmfs client, and squid. This, in conjunction
with consul-template, allows these services to be removed from the config files when the instance that was running the
service is halted. For example, if a compute node is shut down or removed, it will no longer appear in `sinfo` output.
Changed
- [cloud-init] Turned off puppet agent reporting in cloud-init
- [cloud-init] [puppet] Renamed user_hieradata as user_data
- [cloud-init] Volume formatting and mounting is now conditional on the hostname being `mgmt1`
- [OpenStack] Fixed port_node resource name template
- [puppet] Updated puppet-jupyterhub version to v1.8.1
- [puppet] Consul and consul-template version are now defined in hieradata
- [puppet] Changed node_exported consul service name to node-exported to remove warning
Removed
- [puppet] Removed unused key from terraform_data
- [puppet] Removed stage in mgmt site.pp
Magic Castle 6.2
Added
- Added an error message in cloud-init dev avail while loops
- Added gcloud dns module to AWS, Azure and OVH examples.
- [puppet] Added a slurmd restart when node hostname is missing from sinfo output.
- [puppet] Added class profile::workshop::mgmt to deploy files to guest user homes
- [puppet] Added class profile::workshop::mgmt to mgmt1 in site.pp
Changed
- [OpenStack] All resources, including instances, now have a name that starts with the cluster name.
This does not affect the instances' hostname
- [puppet] Updated puppet-jupyterhub version to v1.7
Magic Castle 6.1
Fix travis release procedure. 6.0 release bundles contained the wrong module source in main.tf.
Magic Castle 6.0
Terraform >= 0.12.21 is now required. Usage of the function `subtract` requires at least 0.12.21.
Added
- Added the optional key `prefix` to the `instance["node"]` map (issue #29)
- [cloud-init] Added removal of ifcfg file with no corresponding nic (issue #61)
- [puppet] Added optional prefix to node regex in site.pp
Changed
- `instance["node"]` is now a list. This allows the spawning of compute nodes with various instance types (issue #29)
- release.sh is now the only script for creating a release on any platform.
- [Azure] Renamed azurerm_virtual_machine nodevm to node (issue #55)
- [AWS] Replaced aws volume device name by volume id (issue #60)
- [gcp] Renamed gcp var.project_name to var.project (issue #53)
- [puppet] Upgraded puppet-prometheus to 8.2.1
- [puppet] Removed Name=gpu from gres.conf template (puppet-magic_castle issue #27)
Magic Castle 5.8
Added
- [Azure] Added `root_disk_size=30` in the example (issue #43)
- [Azure] Added ssh_keys to instances as it is mandatory (issue #44)
- [cloud-init] Added volume attachment verification loops in mgmt cloud-init (issue #54)
- [GCP] Added gcloud dns module to gcp example (issue #37)
- [GCP] Added prefix to the name of volumes and ipv4 (issue #49)
- [OpenStack] Added os_int_subnet variable. The variable forces the use of a specific subnet with OpenStack.
Changed
- Changes to image variable are now ignored after cluster is built.
- Fixed release scripts to solve a bug where multiple `version` variables were present (issue #38)
- [Azure] Updated the example to use the most recent OpenLogic CentOS 7 image (issue #42)
- [Azure] Resources names are now prefixed with the cluster name
- [Azure] Azure public_ip now outputs a list of all login ip addresses
- [cloud-init] Replaced timezone in cloud-init to UTC (issue #51)
- [GCP] Zone variable is now optional. The zone is randomly selected in the zones available for the region if left blank.
- [GCP] Instances' internal DNS is now configured to use zonal DNS. The internal DNS hostname is not used, but the change reduced
the DHCP lease time from 24 hours to 1 hour. This helps when debugging DHCP issues
- [puppet] CC CVMFS repo is now configured from latest RPM repo (puppet-magic_castle issue #19)
- [puppet] Increased the squid maximum_object_size (puppet-magic_castle issue #20)
- [puppet] Updated globus rpm repo name
- [puppet] Updated fail2ban config to make it work with 0.10.x (puppet-magic_castle issue #25)
- [puppet] Replaced file by exec to create singularity symlink (puppet-magic_castle issue #24)
- [puppet] Replaced timezone in cloud-init to UTC (puppet-magic_castle issue #26)
- [puppet] Added service network and notify on NetworkManager purge (issue #50)
Removed
- [GCP] Zone variable is no longer in the example as it is now optional.
Magic Castle 5.7
Added
- [AWS] Added `skip_destroy = True` to EBS attachment resources to avoid stalling destroy command.
- [DNS] Added a `dtn` entry for the Globus endpoint.
- [DNS] Added an `ipa` entry that provides access to FreeIPA webpage.
- [puppet] Added a `profile::reverse_proxy` class that configures Apache vhost for JupyterHub, FreeIPA, Globus, etc.
- [puppet] Added service nvidia-persistenced to module gpu.pp.
- [puppet] Added `drain` to the states that spawn an scontrol in the slurm module.
- [main.tf] Added `hieradata` variable that allows the injection of custom values in puppet hieradata from Terraform.
Changed
- [AWS] Changed mgmt and login instances from using `associate_public_ip_address` to using an AWS Elastic IP.
- [AWS] Updated example AMI and minimum instance type for mgmt.
- [AWS] Fixed module's syntax for Terraform 0.12.
- [AWS] Made `availability_zone` optional. If zone is not provided, it will be randomly selected amongst the zones available for the selected region.
- [AWS] Changed root disk type from `standard` to `gp2`.
- [AWS] Enabled ebs_optimized for all instances.
- [AWS] Changed SSH keyname from `slurm-cloud-key` to `${cluster_name}-key`
- [cloud-init] Made the puppet yumrepo install depend on the CentOS major version.
- [cloud-init] Added blacklisting of nouveau driver in kernel cmdline option.
- [DNS] DNS records are now produced by the `record_generator` module instead of listing records in each DNS provider module.
- [puppet] Changed Globus authentication method from MyProxy to OAuth.
- [puppet] Updated puppet-jupyterhub version from v1.1 to v1.6.
- [puppet] Replaced deprecated package name dkms-nvidia by kmod-nvidia-latest-dkms.
- [puppet] Replaced every reference to facts of `eth0` by facts of interface index 0.
- [puppet] Disabled dkms autoinstall timeout in gpu.
Removed
- [puppet] Removed include of jupyterhub::reverse_proxy in site.pp for login.
Magic Castle 5.6
Added
- [DNS] Added support for Google Cloud DNS (PR #24)
- Added a release script compatible with BSD tools: `release.bsd.sh`
Changed
- [DNS] Changed the login A record pattern from `clustername#.domain` to `login#.clustername.domain` where `#` is the login node index.
- [DNS] Moved wildcard certificate creation from cloudflare to an acme module shared by all dns modules.
- [DNS] Replaced usage of 0-index on array by call to `distinct` (issue #26).
- [DNS] A `jupyter.${cluster_name}.${domain_name}` record is now added for each login instead of just login1.
- Changed the management and login node naming scheme to match node naming. `mgmt01` is now `mgmt1` and `login01` is now `login1`.
- [puppet] Pinned puppet-jupyterhub version to v1.1 instead of the master branch.
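
The DNS record and hostname renaming in this release can be illustrated with a short Python sketch. It is not taken from the Magic Castle code base: the function names, the example cluster and domain, and the assumption that the login index starts at 1 are all hypothetical and only serve to show the before and after naming patterns.

```python
import re


def login_record_names(cluster_name: str, domain: str, nb_login: int) -> dict:
    """Build old-style and new-style A record names for each login node.

    Assumption (not from the release notes): the login index starts at 1.
    """
    indices = range(1, nb_login + 1)
    return {
        # Pattern used before this release: clustername#.domain
        "old": [f"{cluster_name}{i}.{domain}" for i in indices],
        # Pattern used from this release on: login#.clustername.domain
        "new": [f"login{i}.{cluster_name}.{domain}" for i in indices],
    }


def drop_zero_padding(hostname: str) -> str:
    """Map a zero-padded hostname such as mgmt01 to its new form mgmt1."""
    match = re.fullmatch(r"([a-z]+)(\d+)", hostname)
    if match is None:
        # Leave hostnames that do not follow the <prefix><index> scheme untouched.
        return hostname
    prefix, digits = match.groups()
    return f"{prefix}{int(digits)}"


if __name__ == "__main__":
    print(login_record_names("phoenix", "example.org", 2))
    print(drop_zero_padding("mgmt01"), drop_zero_padding("login01"))
```

Anything that referenced the old record names or the zero-padded hostnames (SSH configs, scripts) would likely need the same mapping applied.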
Removed
- [DNS] Removed creation of SSHFP SHA1 records (issue #22)
Magic Castle 5.5
Added
- [docs] Added details on how to use CloudFlare API Token in README.md
- [docs] Added details on which Open RC File to download when using OpenStack
- [DNS] Added an email variable to the dns module.
- [puppet] Added logic to set RSNT_ARCH variable based on the common CPU
instruction extensions available amongst all CVMFS clients using consul.
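
The idea behind the RSNT_ARCH logic, picking the best architecture that every CVMFS client supports, can be sketched in Python as a set intersection followed by a preference lookup. This is only an illustration under stated assumptions: the actual implementation lives in the Puppet code and gathers the information through consul, which is not modelled here. The function name, the flag-to-architecture mapping, and the preference order are hypothetical choices made for the example.

```python
# Hypothetical mapping from a CPU flag (as listed in /proc/cpuinfo) to an
# architecture name, ordered from most to least capable. Both the flags and
# the ordering are assumptions made for this sketch.
ARCH_PREFERENCE = [
    ("avx512f", "avx512"),
    ("avx2", "avx2"),
    ("avx", "avx"),
    ("pni", "sse3"),  # Linux reports SSE3 support as "pni"
]


def common_arch(client_flags):
    """Return the best architecture supported by every client, or None.

    client_flags is a list with one iterable of CPU flags per client.
    """
    if not client_flags:
        return None
    # Keep only the instruction extensions that all clients have in common.
    common = set.intersection(*(set(flags) for flags in client_flags))
    for flag, arch in ARCH_PREFERENCE:
        if flag in common:
            return arch
    return None


if __name__ == "__main__":
    clients = [
        ["avx512f", "avx2", "avx", "pni"],  # newer node
        ["avx2", "avx", "pni"],             # older node without AVX-512
    ]
    print(common_arch(clients))
```

Taking the intersection rather than the union keeps the selected software stack runnable on the least capable client in the cluster.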