Management Module failover overview
Controlled failover: The user triggers this type of failover by rebooting the Active MM or by running the redundancy switchover command.
Uncontrolled failover: This type of failover is triggered by an unexpected event, such as a crash on the Active MM or hot removal of the Active MM.
A failover condition is detected by one of the following methods:
- A mailbox interrupt is received from the Active MM to indicate takeover. This interrupt can be sent for a controlled or an uncontrolled failover (except for a hot removal).
- Hot removal of the Active MM is detected.
- Heartbeat loss is detected on the Standby MM for more than 10 seconds.
NOTE: If the Active MM is not responding and the condition is not detected by the first two methods, it is caught by this method.
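The three detection methods can be illustrated with a minimal Python sketch. It is not the hpe-rdntmgmtd implementation: the function name, the argument names, and the order in which the checks run are assumptions made for illustration. Only the three methods and the 10-second heartbeat limit come from the text above.

```python
from enum import Enum, auto
from typing import Optional


class Trigger(Enum):
    MAILBOX_INTERRUPT = auto()  # takeover interrupt sent by the Active MM
    HOT_REMOVAL = auto()        # Active MM physically removed
    HEARTBEAT_LOSS = auto()     # fallback: Active MM silent for too long


# From the text: heartbeat loss must last for more than 10 seconds.
HEARTBEAT_LOSS_LIMIT_S = 10


def detect_failover(mailbox_interrupt: bool,
                    active_removed: bool,
                    heartbeat_loss_s: float) -> Optional[Trigger]:
    """Return the method that detected a failover condition, or None.

    Hypothetical helper; the check order is an assumption.
    """
    if mailbox_interrupt:
        return Trigger.MAILBOX_INTERRUPT
    if active_removed:
        return Trigger.HOT_REMOVAL
    # Catches a non-responding Active MM that the first two methods missed.
    if heartbeat_loss_s > HEARTBEAT_LOSS_LIMIT_S:
        return Trigger.HEARTBEAT_LOSS
    return None
```

In this sketch a heartbeat gap of exactly 10 seconds does not trigger a failover, because the text requires the loss to last for more than 10 seconds.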
The Standby MM must be present to trigger a failover. An Unassigned MM will never trigger a failover.
The Redundant Management Daemon (hpe-rdntmgmtd) is responsible for triggering failover from the Standby MM.
When a failover is triggered, the Standby MM becomes the Active MM while the previously Active MM is rebooted.
The Active MM must be present to trigger a recovery.
The Redundant Management Daemon (hpe-rdntmgmtd) is responsible for triggering recovery from the Active MM.
When a recovery is triggered, the Active MM reboots the nonresponsive Standby MM. This action occurs for any of the following conditions:
- The failover monitor thread on the Standby MM will increment the heartbeat fail count.
- The hpe-rdntmgmtd daemon on the Standby MM will:
  - Detect the failover condition when the heartbeat fail count exceeds the maximum of 10, and trigger failover.
  - Initiate reboot of the Active MM.
- Active MM will join as a standby after reboot.
- The recover monitor thread on the Active MM will increment the heartbeat fail count.
- The hpe-rdntmgmtd daemon on the Active MM will:
  - Detect the recovery condition when the heartbeat fail count exceeds the maximum of 7, and trigger recovery.
  - Initiate reboot of the Standby MM.
- Standby MM will join as a standby after reboot.
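The two heartbeat scenarios above differ only in where the monitor runs and in the maximum fail count (10 for failover, 7 for recovery). The following Python sketch models that counting logic. The class and method names are hypothetical, and resetting the count after a successful heartbeat is an assumption that the text does not state.

```python
from dataclasses import dataclass

# Maximum fail counts taken from the text.
FAILOVER_MAX_FAILS = 10  # failover monitor on the Standby MM
RECOVER_MAX_FAILS = 7    # recover monitor on the Active MM


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of a heartbeat monitor thread's counter."""
    max_fails: int
    fail_count: int = 0

    def record(self, heartbeat_ok: bool) -> bool:
        """Record one heartbeat interval.

        Returns True when the fail count has gone past the maximum,
        which is the point where the peer MM would be rebooted.
        """
        if heartbeat_ok:
            # Assumption: a successful heartbeat clears the count.
            self.fail_count = 0
            return False
        self.fail_count += 1
        return self.fail_count > self.max_fails


# Usage: the Standby MM misses heartbeats from the Active MM.
failover_monitor = HeartbeatMonitor(max_fails=FAILOVER_MAX_FAILS)
failover_results = [failover_monitor.record(False) for _ in range(11)]
```

Because the condition is "past the maximum", the failover monitor in this sketch reports the condition on the eleventh consecutive missed heartbeat, and a recover monitor built with `RECOVER_MAX_FAILS` reports it on the eighth.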
- A planned reboot on the Active MM will send a failover command to the Standby MM.
- The hpe-rdntmgmtd daemon on the Standby MM will:
  - Process this command and perform a failover immediately instead of waiting for the failover monitor to detect it using heartbeats.
  - Initiate reboot of the Active MM.
- Active MM will join as a standby after reboot.
- Removal of the Active MM from Slot 1 triggers the hpe-rdntmgmtd daemon on the Standby MM to initiate failover immediately instead of waiting for the failover monitor to detect it using heartbeats.
- Active MM will join as a standby after reboot.
- A crash on the Active MM is handled by the crash handler, which sends a failover command to the Standby MM.
- The hpe-rdntmgmtd daemon on the Standby MM will:
  - Process this command and perform a failover immediately instead of waiting for the failover monitor to detect it using heartbeats.
  - Initiate reboot of the Active MM.
- Active MM will join as a standby after reboot.
redundancy switchover command:
- User executes the redundancy switchover command on the Active MM. This action will send a takeover signal to the Standby MM and reboot the Active MM.
- The hpe-rdntmgmtd daemon on the Standby MM will process this takeover signal and perform failover immediately.
- Active MM will join as a standby after reboot.
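The planned reboot, crash, and redundancy switchover scenarios all end in the same role changes: the Standby MM takes over, the previously Active MM reboots, and it then rejoins as a standby. The sketch below models those role changes in Python. The function name, the role strings, and the module names mm1 and mm2 are hypothetical and used only for illustration.

```python
def perform_failover(mm_roles: dict) -> list:
    """Apply the failover role changes in place and return an event log.

    mm_roles maps a hypothetical module name to "active" or "standby".
    """
    active = next(name for name, role in mm_roles.items() if role == "active")
    standby = next(name for name, role in mm_roles.items() if role == "standby")
    events = []

    # The Standby MM processes the takeover signal and becomes Active.
    mm_roles[standby] = "active"
    events.append(f"{standby}: takeover, now active")

    # The previously Active MM is rebooted.
    mm_roles[active] = "rebooting"
    events.append(f"{active}: rebooting")

    # After the reboot it rejoins as a standby.
    mm_roles[active] = "standby"
    events.append(f"{active}: joined as standby")
    return events


# Usage with two hypothetical module names.
roles = {"mm1": "active", "mm2": "standby"}
log = perform_failover(roles)
```

After the call, the roles are swapped: the module that was active is now the standby, and the former standby is active.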
Why did my second MM not take over after the Active MM failed?
The second MM must be elected as Standby and in a ready state before failover. If not, a double fault occurs and the second MM will not take over.
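The precondition can be expressed as a simple check. This is a minimal sketch: the function name and the role strings are assumptions, and the real daemon may evaluate readiness differently.

```python
def can_take_over(role: str, ready: bool) -> bool:
    """Return True only if the second MM is an elected, ready Standby.

    Hypothetical helper. An Unassigned MM, or a Standby MM that is not
    yet ready, cannot take over; losing the Active MM in that state is
    the double fault described above.
    """
    return role == "standby" and ready
```

For example, `can_take_over("unassigned", True)` and `can_take_over("standby", False)` both return False, which corresponds to the case where the second MM does not take over.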