Monitoring software RAID status (mdstat)
Mdadm is a Linux utility to manage and monitor software RAID. The mdadm monitoring plugin keeps track of failed disks in a RAID setup.
Mdadm metrics
- Faulty
- Active
- Resync
- Read only
Mdadm dependencies
This plugin needs the mdstat Python package, which you can install by running pip install mdstat.
To fetch the status, the plugin also needs sudo access. Edit /etc/sudoers
and add the following line at the end of the file.
nixstats ALL=(ALL) NOPASSWD: /usr/local/bin/mdjson
This assumes mdjson is located at /usr/local/bin/mdjson.
To verify its location, run whereis mdjson
Testing the mdadm plugin
Run sudo -u nixstats nixstatsagent test mdstat
to check if it's returning any data.
root@nixstats:~# sudo -u nixstats nixstatsagent test mdstat
mdstat:
{
"md0": {
"active": 1,
"faulty": 0,
"read_only": 0,
"resync": 0
},
"md1": {
"active": 1,
"faulty": 0,
"read_only": 0,
"resync": 0
},
"md2": {
"active": 1,
"faulty": 0,
"read_only": 0,
"resync": 0
}
}
Enable the plugin
Open /etc/nixstats.ini
and append the following lines at the end of the file.
[mdstat]
enabled = yes
Restart the agent so it loads the plugin by running service nixstatsagent restart
Creating charts
Click the Metrics link in the top menu, then select "mdstat" as the metric type and one of active, faulty, read_only or resync as the metric. Choose the servers you would like to graph and save the chart to your dashboard.
You can create an alert that triggers when the faulty
metric is higher than zero, indicating that a server has a bad drive.
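The alert condition can also be checked by hand against the JSON that the plugin test prints. The following is a minimal sketch, assuming the payload has the same shape as the test output shown earlier; faulty_arrays is a hypothetical helper written for illustration and is not part of the Nixstats agent.

```python
import json

# Sample payload copied from the plugin test output shown earlier.
SAMPLE = """
{
  "md0": {"active": 1, "faulty": 0, "read_only": 0, "resync": 0},
  "md1": {"active": 1, "faulty": 0, "read_only": 0, "resync": 0},
  "md2": {"active": 1, "faulty": 0, "read_only": 0, "resync": 0}
}
"""


def faulty_arrays(payload):
    """Return the sorted names of arrays whose faulty metric is above zero.

    Hypothetical helper: it mirrors the "faulty higher than zero" alert rule
    and assumes each array maps to a dict of integer metrics.
    """
    data = json.loads(payload)
    return sorted(
        name for name, metrics in data.items() if metrics.get("faulty", 0) > 0
    )


if __name__ == "__main__":
    bad = faulty_arrays(SAMPLE)
    print("faulty arrays:", ", ".join(bad) if bad else "none")
```

An empty result means no array reports a faulty disk. Any name in the list corresponds to an array that would trigger the alert.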
Updated on: 15/03/2018