Objective
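This article describes how to handle the VOLUME_ALARM_DATA_UNDER_REPLICATED alarm when it is raised in an HPE Ezmeral Data Fabric cluster. The log excerpts below show the alarm being raised for several volumes at 03:02:34, each with the list of under-replicated containers, followed by replication messages logged about a minute earlier in which the replicas on 10.10.75.180:5660 are removed for containers 1 and 2320 after connection resets.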
2018-08-02 03:02:34,515 WARN Alarms [RScan]: composeEmailMessage: Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:1:VOLUME_ALARM; Cluster: my.cluster.com; Volume: mapr.cldb.internal; Message: Volume desired replication is 3, current replication is 2
2018-08-02 03:02:34,515 INFO VolumeInfoInMemory [RScan]: Volume: mapr.cldb.internal, under-replicated containers: 1
2018-08-02 03:02:34,516 WARN Alarms [RScan]: composeEmailMessage: Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:199560036:VOLUME_ALARM; Cluster: my.cluster.com; Volume: users; Message: Volume desired replication is 3, current replication is 2
2018-08-02 03:02:34,516 INFO VolumeInfoInMemory [RScan]: Volume: users, under-replicated containers: 2064 2391 2392 2393 2394 2395 2506 2507 2508 2509 2510
2018-08-02 03:02:34,517 WARN Alarms [RScan]: composeEmailMessage: Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:13930177:VOLUME_ALARM; Cluster: my.cluster.com; Volume: mapr.monitoring; Message: Volume desired replication is 3, current replication is 2
2018-08-02 03:02:34,517 INFO VolumeInfoInMemory [RScan]: Volume: mapr.monitoring, under-replicated containers: 2066 2240 2241 2242 2243 2244 2370 2371 2372 2373 2374
2018-08-02 03:02:34,517 WARN Alarms [RScan]: composeEmailMessage: Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:228927984:VOLUME_ALARM; Cluster: my.cluster.com; Volume: mapr.monitoring.streams; Message: Volume desired replication is 3, current replication is 2
2018-08-02 03:02:34,517 INFO VolumeInfoInMemory [RScan]: Volume: mapr.monitoring.streams, under-replicated containers: 2067 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313
2018-08-02 03:02:34,518 WARN Alarms [RScan]: composeEmailMessage: Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:252002128:VOLUME_ALARM; Cluster: my.cluster.com; Volume: mapr.vm75-180.support.mapr.com.local.logs; Message: Volume desired replication is 2, current replication is 1
2018-08-02 03:02:34,518 INFO VolumeInfoInMemory [RScan]: Volume: mapr.vm75-180.support.mapr.com.local.logs, under-replicated containers: 2058
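When many volumes raise the alarm at once, it can help to condense the entries above into one record per volume. The following is a minimal sketch, not part of any HPE tooling: the function name summarize_under_replication and both regular expressions are assumptions inferred only from the lines shown in this excerpt, so the patterns may need adjusting for other releases.

```python
import re

# Assumed pattern for the "Alarm raised" lines, inferred from this excerpt only.
ALARM_RE = re.compile(
    r"Alarm raised: VOLUME_ALARM_DATA_UNDER_REPLICATED:\d+:VOLUME_ALARM; "
    r"Cluster: (?P<cluster>[^;\s]+); Volume: (?P<volume>[^;\s]+); "
    r"Message: Volume desired replication is (?P<desired>\d+), "
    r"current replication is (?P<current>\d+)"
)

# Assumed pattern for the "under-replicated containers" lines.
CONTAINERS_RE = re.compile(
    r"Volume: (?P<volume>[^,\s]+), under-replicated containers: (?P<cids>\d+(?: \d+)*)"
)


def summarize_under_replication(lines):
    """Return {volume: {"desired": int, "current": int, "containers": [int, ...]}}."""
    summary = {}
    for line in lines:
        alarm = ALARM_RE.search(line)
        if alarm:
            entry = summary.setdefault(alarm["volume"], {"containers": []})
            entry["desired"] = int(alarm["desired"])
            entry["current"] = int(alarm["current"])
            continue
        listed = CONTAINERS_RE.search(line)
        if listed:
            entry = summary.setdefault(listed["volume"], {"containers": []})
            entry["containers"] = [int(cid) for cid in listed["cids"].split()]
    return summary
```

Passing the log lines above to summarize_under_replication gives, for each volume, the desired and current replication factor together with the container IDs that still need another replica.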
2018-08-02 03:01:09,0319 INFO Replication replicate.cc:4236 Replica health update: replica 10.10.75.179:5660 CID:2118 Master Min VN:4231987, ReplicaVN on Disk:4231988 Pending RPCs:0
2018-08-02 03:01:20,8271 ERROR Replication nodefailure.cc:412 Op failed with Connection reset by peer (104) on replica FSID 1128005550106435436 10.10.75.180:5660 for operation of type 39 and version 33121799 on container 2320
2018-08-02 03:01:20,8286 ERROR Replication nodefailure.cc:412 Op failed with Connection reset by peer (104) on replica FSID 1128005550106435436 10.10.75.180:5660 for operation of type 15 and version 22015178 on container 1
2018-08-02 03:01:20,8333 INFO Replication nodefailure.cc:568 Removing replica FSID 1128005550106435436 10.10.75.180:5660 for container (1).
2018-08-02 03:01:20,8375 INFO Replication nodefailure.cc:568 Removing replica FSID 1128005550106435436 10.10.75.180:5660 for container (2320).
2018-08-02 03:01:49,9258 INFO Replication nodefailure.cc:1279 Container 2055, CLDB asked to become master BM, ifClean=1
2018-08-02 03:01:49,9258 INFO Replication nodefailure.cc:1282 FSBecomeMaster has voltype: 0, volume type from cldb: 0
2018-08-02 03:01:49,9441 INFO Replication nodefailure.cc:1652 BM Become master completed successfully for container 2055 at txn:12583149-12583149, write:12583134-12583134, snap:0-0
2018-08-02 03:01:50,9279 INFO Replication nodefailure.cc:1279 Container 2058, CLDB asked to become master BM, ifClean=1
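The replication messages above indicate which node lost its replicas. As a rough sketch under the same caveats, the "Removing replica" lines can be grouped by replica address to see which node is behind the under-replication. The function name removed_replicas_by_node and the regular expression are hypothetical and based only on the message format visible in this excerpt.

```python
import re
from collections import defaultdict

# Assumed pattern for the "Removing replica" lines, inferred from this excerpt only.
REMOVED_RE = re.compile(
    r"Removing replica FSID (?P<fsid>\d+) (?P<addr>\S+) "
    r"for container \((?P<cid>\d+)\)"
)


def removed_replicas_by_node(lines):
    """Return {replica address: sorted list of container IDs whose replica was removed}."""
    removed = defaultdict(set)
    for line in lines:
        match = REMOVED_RE.search(line)
        if match:
            removed[match["addr"]].add(int(match["cid"]))
    return {addr: sorted(cids) for addr, cids in removed.items()}
```

For the excerpt above, every removed replica maps to the single address 10.10.75.180:5660, which points at that node as the first place to check.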