@vuntz vuntz commented Feb 21, 2017

The only changes compared to the current PTF are DRBD-related (so not relevant for us) and a small bug fix in the Rails app.

Adam Spiers and others added 18 commits October 26, 2016 15:47
Our corosync.conf.v2 template was based on a version from SLE HA which
had some tunings accidentally dropped in the migration from SLE11.
In particular, some of the timeouts were too aggressive.

https://bugzilla.suse.com/show_bug.cgi?id=1001164

(cherry picked from commit c1d9c8f)
…able

[stable/3.0] fix corosync.conf values (bsc#1001164)
It seems that new versions require a newer version of rack, which in turn
requires Ruby 2.2.2. See https://travis-ci.org/crowbar/crowbar-ha/jobs/142499249

(cherry picked from commit 5191cfd)
[3.0] pacemaker: Force use of old rack version
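
A version pin of this kind can be illustrated with RubyGems' own version-matching classes. This is only a minimal sketch, not the change made in the commit: the `"< 2.0"` constraint, the `rack_allowed?` helper, and the sample version numbers are assumptions chosen for illustration; the commit message only says that an old rack version is forced.

```ruby
require "rubygems"

# Hypothetical constraint (an assumption): accept only rack releases older than 2.0.
RACK_CONSTRAINT = Gem::Requirement.new("< 2.0")

# Hypothetical helper: true when the given rack version string satisfies the pin.
def rack_allowed?(version)
  RACK_CONSTRAINT.satisfied_by?(Gem::Version.new(version))
end

puts rack_allowed?("1.6.4")
puts rack_allowed?("2.0.1")
```

In a Gemfile, the same constraint string would typically be passed as the second argument to `gem "rack"`; the exact line used in this repository is not shown in the source.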
Rake 12.0.0 was released on Dec 6, but our rspec version still uses code
that it removed.

(cherry picked from commit d7dc20f)
This happened after reinstalling one of the compute nodes (which acts as pacemaker-remote).
The pacemaker barclamp could no longer be applied and threw the error:

"Failed to apply the proposal: exception before calling chef (undefined method `sort' for nil:NilClass)"

(cherry picked from commit 4c82aa6)
[3.0] Fix undefined method `sort' for nil:NilClass
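
The error above is what Ruby raises when `sort` is called on a missing (nil) value. The following is a minimal sketch of one common way to guard against it, assuming the value is either nil or an array of node names; the `sorted_members` helper and the sample data are hypothetical and are not taken from the actual fix.

```ruby
# Hypothetical helper (an assumption, not the code from the commit):
# return a sorted list even when the input is nil, instead of raising
# NoMethodError. Array(nil) evaluates to [], and Array(array) returns
# the array unchanged.
def sorted_members(members)
  Array(members).sort
end

p sorted_members(nil)
p sorted_members(["node2", "node1"])
```

Note that `Array()` turns a Hash into a list of key/value pairs, so this guard only suits attributes that are expected to be arrays.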
If the config parameters are changed, it's too risky to just restart the
cluster - this could happen on all cluster nodes at a similar time and
cause a significant outage.  Fortunately it's possible to instead reload
the config whilst keeping corosync running, via corosync-cfgtool -R.

https://bugzilla.suse.com/show_bug.cgi?id=1001164
(cherry picked from commit a446fe9)
…kport

[3.0] don't restart corosync when corosync.conf changes (bsc#1001164)
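
The reload-instead-of-restart idea can be sketched as follows. This is an illustrative sketch only: the `reload_command_for` helper and the sample config strings are assumptions, and the only detail taken from the commit message is the `corosync-cfgtool -R` command itself. The sketch deliberately returns the command rather than running it.

```ruby
# Command named in the commit message: asks the running corosync to
# reload its configuration instead of restarting the service.
RELOAD_COMMAND = ["corosync-cfgtool", "-R"].freeze

# Hypothetical helper: decide what to run when corosync.conf is rewritten.
# Returns nil when the rendered config is unchanged, so nothing is touched;
# otherwise returns the reload command (never a restart).
def reload_command_for(old_conf, new_conf)
  return nil if old_conf == new_conf
  RELOAD_COMMAND
end

old_conf = "totem {\n  token: 5000\n}\n"   # sample content, an assumption
new_conf = "totem {\n  token: 10000\n}\n"  # sample content, an assumption

p reload_command_for(old_conf, old_conf)
p reload_command_for(old_conf, new_conf)
```

A caller could then pass a non-nil result to `system(*command)` on each node; how the actual recipe triggers the reload is not shown in the source.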
  https://bugzilla.suse.com/show_bug.cgi?id=971771

We only want DRBD started by Pacemaker, with the possible exception of
during initial DRBD setup.  If it gets started by systemd, systemd will
believe it owns the service, in which case during system shutdown it
will prematurely shut down DRBD without regard for any of the other
services and resources depending on it.  Instead we want Pacemaker to
stop things in the correct order, even taking care of inter-node
dependencies.

This fix will not be sufficient by itself, since we still need to
address:

  https://bugzilla.suse.com/show_bug.cgi?id=980341

Thanks to Adam Spiers (for the fix & for writing the commit
message)

(cherry picked from commit 52d11b7)
It should only be started by pacemaker (to avoid ownership conflict with
systemd), and therefore we should not require drbd to be running to do
the setup.

This is a follow-up to 52d11b7 and is required to properly fix
https://bugzilla.suse.com/show_bug.cgi?id=971771

(cherry picked from commit fbd12df)
drbd-overview has changed in SP2 and no longer gives us the
previous "Unconfigured" status.

We cherry-pick this not because of SP2, but because we don't want to
depend on drbd-overview at this point in time (since it requires drbd
running).

(cherry picked from commit a2a08dc)
Partial cherry-pick from:

Adapt to the latest (SP2-based) drbd-overview output

(cherry picked from commit c8fa28c)
Due to fbd12df, the drbd daemon may not be running (since we let pacemaker
start it, it will not yet be running for the first resource), so
drbd-overview will not work.

We were using drbd-overview to make sure that all steps are working;
without it, we will just fail a bit later.

If there's a need to wait for drbd to be ready, there's a new :wait
action for the resource.

(cherry picked from commit 220caa4)
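
A wait step like the one mentioned above usually amounts to polling a readiness check until it succeeds or a timeout expires. The sketch below shows that general pattern only; the `wait_until` helper, its parameters, and the simulated check are all hypothetical and do not reflect how the resource's `:wait` action is actually implemented.

```ruby
# Hypothetical helper (an assumption): poll the given block until it
# returns a truthy value or the timeout (in seconds) expires.
# Returns true on success and false on timeout.
def wait_until(timeout:, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Simulated readiness check: reports ready on the third poll.
polls = 0
ready = wait_until(timeout: 5, interval: 0.01) { (polls += 1) >= 3 }
puts "ready=#{ready} after #{polls} polls"
```

A monotonic clock is used for the deadline so that wall-clock adjustments during the wait do not shorten or extend it.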
It's unclear how things could have worked before without this, as it's a
mandatory step; see
https://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_drbd_configure.html

I can only assume that the drbd service restart was triggering this
somehow, and we were lucky to not hit any issue until now.

(cherry picked from commit 0525631)
(cherry picked from commit ef5963d)
crowbar-pacemaker: Do not start or restart drbd service (bsc#971771)
@vuntz vuntz force-pushed the staging-merge-3.0 branch from 17ee8a3 to 0cf2e65 Compare February 21, 2017 09:50
@rsalevsky rsalevsky changed the base branch from staging to stable/sap/3.0 March 9, 2017 12:35
@tpatzig tpatzig left a comment

lgtm

8 participants