Skip to content

Conversation

@matelakat
Copy link

Implement having multiple corosync rings. See commit message for more details

For increased stability, people want to have multiple corosync rings, so
if one network fails, node still won't be fenced.

This change adds the `ring_mode` and `second_ring_network` attributes to
the proposal. `ring_mode` can be either `single_ring` or `dual_ring`:
 - `single_ring`: only one ring used by corosync, and it is using the
 admin network
 - `dual_ring`: two rings used, one is on the admin network, the other
 network can be specified by the user by providing a value for
 `second_ring_network`

Notes:
 - double ring configuration only implemented for UDPU, if you use a
 different mode, the propsal will fail to validate
 - after moving to double ring configuration, user has to manually
 re-start the corosync cluster. He can do it by running the file:

/opt/dell/chef/cookbooks/pacemaker/files/default/reload_corosync_configuration.sh
Copy link

@houndci-bot houndci-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files could not be reviewed due to errors:

.rubocop.yml: Style/AlignHash has the wrong namespace - should be Layout
.rubocop.yml: Style/AlignHash has the wrong namespace - should be Layout
.rubocop.yml: Style/AlignParameters has the wrong namespace - should be Layout
.rubocop.yml: Style/EmptyLinesAroundBlockBody has the wrong namespace - should be Layout
.rubocop.yml: Style/MultilineOperationIndentation has the wrong namespace - should be Layout
Error: obsolete parameter RunRailsCops (for AllCops) found in .rubocop.yml
Use the following configuration instead:
Rails:
  Enabled: true
obsolete parameter AlignWith (for Lint/EndAlignment) found in .rubocop.yml
`AlignWith` has been renamed to `EnforcedStyleAlignWith`


= select_field %w(corosync ring_mode), :collection => :ring_modes_for_corosync, "data-showit" => "dual_ring", "data-showit-target" => "#dual_ring_container", "data-showit-direct" => "true"

%div{ :id => "dual_ring_container" }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id attribute must be in lisp-case

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hound is probably right here too: dual-ring-container.

.panel-body
= select_field %w(corosync transport), :collection => :transport_for_pacemaker

= select_field %w(corosync ring_mode), :collection => :ring_modes_for_corosync, "data-showit" => "dual_ring", "data-showit-target" => "#dual_ring_container", "data-showit-direct" => "true"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is too long. [192/150]

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the dog.

@matelakat matelakat requested a review from aspiers August 8, 2017 09:02
Copy link

@aspiers aspiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks great! Just a few nits and some hardening to the upgrade script.

default[:corosync][:transport] = "udp"
default[:corosync][:ring_mode] = "single_ring"

default[:corosync][:second_ring_used] = false
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need both :ring_mode and :second_ring_used? Is there a circumstance in which :ring_mode would be "double_ring" and :second_ring_used would be false?

# dual corosync ring.
# After you applied the modified proposal successfully, run this script on
# crowbar
# Please stop all chef-client runs while running this script.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be automated, or at least checked before proceeding?

# Please stop all chef-client runs while running this script.
set -eux

CLUSTER_PROPOSAL_NAME="$1"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do some input validation on this.

# if this command fails, the script exits
crowbarctl proposal show pacemaker "$CLUSTER_PROPOSAL_NAME"

NODES=$(crowbarctl \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit (take it or leave it): stylistically, with multi-line $( ... ) I prefer a line-break after the $( so that the arguments to crowbarctl are indented more than crowbarctl, not less, which also means that the commands in the pipeline are vertically aligned, and the $( ) clearly delimit the whole pipeline:

NODES=$(
    crowbarctl proposal show pacemaker "$CLUSTER_PROPOSAL_NAME" \
        --format=plain \
        --filter "deployment.pacemaker.elements.pacemaker-cluster-member" |
    cut -d " " -f 2
)

done

# Print out initial configuration
ssh "$FIRST_NODE" corosync-cfgtool -s
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, shouldn't it bail?

role.default_attributes["corosync"]["members"] = member_nodes.map{ |n| n.get_network_by_type("admin")["address"] }
role.default_attributes["corosync"]["transport"] = role.default_attributes["pacemaker"]["corosync"]["transport"]

# Set up the second ring if dual_ring mode configured
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you add this large chunk of code as a separate method? Long methods really harm readability and maintainability.

role.default_attributes["corosync"]["second_ring_bind_addr"] = second_ring_net["network"]["subnet"]
role.default_attributes["corosync"]["second_ring_mcast_addr"] = role.default_attributes["pacemaker"]["corosync"]["mcast_addr"]
role.default_attributes["corosync"]["second_ring_mcast_port"] = role.default_attributes["pacemaker"]["corosync"]["mcast_port"]
role.default_attributes["corosync"]["second_ring_members"] = second_ring_members
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just me or do you now have both [:corosync][:second_ring_used] and [:pacemaker][:corosync][:second_ring_used]? And I don't see why either are needed - they seem to be duplicating the purpose of ring_mode.

.panel-body
= select_field %w(corosync transport), :collection => :transport_for_pacemaker

= select_field %w(corosync ring_mode), :collection => :ring_modes_for_corosync, "data-showit" => "dual_ring", "data-showit-target" => "#dual_ring_container", "data-showit-direct" => "true"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the dog.


= select_field %w(corosync ring_mode), :collection => :ring_modes_for_corosync, "data-showit" => "dual_ring", "data-showit-target" => "#dual_ring_container", "data-showit-direct" => "true"

%div{ :id => "dual_ring_container" }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hound is probably right here too: dual-ring-container.

hae_repo: 'The HAE repositories have not been setup.'
transport_value: 'Invalid transport value: %{transport}.'
ring_mode_value: 'Invalid ring mode: %{ring_mode}.'
second_ring_network_value: 'Network not found for setting up second corosync ring: %{second_ring_network}.'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I prefer Network "%{second_ring_network}" not found for setting up second corosync ring.' - I think it's very slightly clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants