DCL task list #118

Open
@ranguard

Description

This is a list, in priority order, of what the MetaCPAN NOC would like help with from DCL.

Action

  • ?: Implement Loki/Grafana/Prometheus

Planned

  • Review DCL findings and recommendations
    • Cluster node setup (what should we expand it to, in order to reduce memory issues?)
  • Collecting log output from containers, maybe via an ingress logging option? N.B. forwarding IPs!
  • Best-practice recommendations, YAML lint, etc.
  • Better container/node monitoring (how much memory does container X need, what is using all the processes in the cluster)
  • Review and update all app configs: set up best practices (affinity, limits, etc.)
  • Support K8s access for multiple users/roles/projects in one cluster, e.g. if we want to give project X access how do we partition both access and resources (Rancher?)
  • Discuss storage options for moving the CPAN store (see the previous, out-of-date discussion)
  • Simplest way to back up DO PG (ideally to Backblaze, S3 storage); currently using https://app.snapshooter.com/ (the free account should be enough)?

Completed

  • Basic k8s monitoring https://opsview.dcmanaged.com/ - MetaCPAN NOC has access
  • Slack channel for discussion + honeycomb.io error alert integration
  • LL: Increase cluster memory, use 3 x 16G instances as we are running at ~90%
  • LL: Start using https://k8slens.dev/ for viewing cluster information; this will improve once Grafana/Prometheus/Loki are implemented

Metadata

Assignees

No one assigned

    Labels

    Who: DCL (DigitalCraftsMen working on)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
