Graphing NVMe status #munin #nvme #backupfriday

Recently at work we had some SSDs fail - a colleague checked some of them, and the SSDs reported that they had passed 200% of their lifetime!
My home server and my laptop do not see anything near the usage that those production SSDs do, but I would like to have an idea of what the status is and how it is developing - #backupfriday is all well and good, but prevention is nice too.
I started looking at the nvme-cli package and some of what it can output, but wasn't sure what to graph, so I asked on the fediverse, and got a nice reply with a link to the NVMe specification.
So today I installed Munin on my home server, found the nvme plugin and modified it to graph the percentage_used value that I am interested in.
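For illustration, here is a rough sketch of the idea, not the actual plugin or my modification of it: pull percentage_used out of the human-readable output of `nvme smart-log` and print it in the `field.value N` form that Munin plugins emit. The function names are made up for this example, and it assumes the output contains a line of the form `percentage_used : N%`. The value is allowed to exceed 100, which is how a drive can report 200%.

```python
import re
import subprocess

# Matches a line such as "percentage_used    : 3%" in the text output of
# `nvme smart-log`. Assumption: the line has this "name : N%" shape.
_PERCENTAGE_USED = re.compile(r"^\s*percentage_used\s*:\s*(\d+)\s*%", re.MULTILINE)


def parse_percentage_used(smart_log_text):
    """Return percentage_used as an int, or None if the line is missing."""
    match = _PERCENTAGE_USED.search(smart_log_text)
    return int(match.group(1)) if match else None


def read_percentage_used(device):
    """Run `nvme smart-log <device>` (usually needs root) and parse the result."""
    result = subprocess.run(
        ["nvme", "smart-log", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_percentage_used(result.stdout)


def munin_field_name(device):
    """Turn a device path into a Munin-safe field name (hypothetical helper)."""
    name = device.rsplit("/", 1)[-1]
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def munin_value_line(device, value):
    """Format one Munin value line; "U" is Munin's marker for unknown."""
    shown = "U" if value is None else str(value)
    return f"{munin_field_name(device)}.value {shown}"
```

Calling `print(munin_value_line("/dev/nvme0", read_percentage_used("/dev/nvme0")))` as root would then print one value line for that drive. A real plugin also has to answer the `config` argument with graph titles and labels, which is left out here.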
For some reason my three NVMe drives report "NVME Namespace Usage" as 100%, which is not what the plugin expects, so I had to configure it not to warn me about that:
# cat /etc/munin/plugin-conf.d/nvme
[nvme]
user root
env.nvme_usage_warning 101
env.nvme_usage_critical 101
Now I have Munin graphing percentage_used, and many more things, sweet.