<!--
SPDX-FileCopyrightText: 2024 Agathe Porte

SPDX-License-Identifier: GPL-3.0-only
-->

<p align="center"><img src="docs/logo.gif" alt="logo" width="75%" /></p>

# prometheus-diskspin-exporter

*Note: this project was renamed from prometheus-hdparm-exporter to avoid a
[possible name conflict](https://github.com/nuz014/hdparm_prometheus_exporter).*

prometheus-diskspin-exporter is a zero-dependency Python program that reads the
power state of the SATA disks found on the host and exposes it on an HTTP
endpoint for consumption by [Prometheus](https://prometheus.io/).

It monitors whether the disks spin down as expected, either after the standby
timeout configured on the disk or after a manual call to `hdparm -y`. On every
request, it calls `hdparm -C` and parses the status of each SATA disk
`/dev/sd*` reported by `lsblk`. The program only queries SATA disks because
NVMe SSDs and other disk types often respond to `hdparm -C` with an `unknown`
status.
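
The parsing step can be sketched as follows. This is a minimal illustration,
not the code from `main.py`: the function names `sata_disks` and
`parse_hdparm_states` are hypothetical, and the sample text assumes the usual
`hdparm -C` layout (a device line followed by a `drive state is:` line).

```python
# Sample text in the shape usually printed by `hdparm -C /dev/sda /dev/sdb`
# (assumed layout, hard-coded here so the sketch runs without root or disks).
SAMPLE_OUTPUT = """
/dev/sda:
 drive state is:  standby

/dev/sdb:
 drive state is:  active/idle
"""


def sata_disks(names: list[str]) -> list[str]:
    """Keep only `sd*` device names and turn them into /dev paths."""
    return [f"/dev/{name}" for name in names if name.startswith("sd")]


def parse_hdparm_states(output: str) -> dict[str, str]:
    """Map each device path to the drive state reported right after it."""
    states: dict[str, str] = {}
    current_disk = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            # Device header line, e.g. "/dev/sda:"
            current_disk = line[:-1]
        elif line.startswith("drive state is:") and current_disk is not None:
            states[current_disk] = line.split(":", 1)[1].strip()
            current_disk = None
    return states


print(sata_disks(["sda", "nvme0n1", "sdb"]))
# ['/dev/sda', '/dev/sdb']
print(parse_hdparm_states(SAMPLE_OUTPUT))
# {'/dev/sda': 'standby', '/dev/sdb': 'active/idle'}
```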

This makes it possible to check that the spin-down policy works as intended:
idle disks stop spinning, which saves energy and reduces noise, yet they do not
spin up and down so often that it causes excess wear.

<p align="center"><img src="docs/granafa_hdparm.png" alt="Example grafana output graph" /></p>

The graph above shows disks in a ZFS pool spinning up after the filesystem is
accessed.

## Usage

```text
prometheus-diskspin-exporter HOST PORT
```

## Example

```console
$ sudo prometheus-diskspin-exporter localhost 28101 &
$ curl localhost:28101/metrics
hdparm_disk_power_status{disk="/dev/sda",status="standby"} 1 1731616722000
hdparm_disk_power_status{disk="/dev/sdb",status="standby"} 1 1731616722000
hdparm_disk_power_status{disk="/dev/sdc",status="standby"} 1 1731616722000
hdparm_disk_power_status{disk="/dev/sdd",status="active/idle"} 1 1731616722000
127.0.0.1 - - [14/Nov/2024 21:38:42] "GET /metrics HTTP/1.1" 200 236
```

The last line is the log line coming from the server’s stdout in the
background; it is not present in the HTTP response.

Status values:

- `active/idle` means the disk is spinning
- `standby` means the disk is not spinning
- `unknown` means the disk did not report anything useful
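
Each line of the response above follows the Prometheus text exposition format:
metric name, labels, value, then a timestamp in milliseconds since the Unix
epoch. A minimal sketch of how such a line could be built is shown here; the
helper name `format_metric` is hypothetical and not taken from `main.py`.

```python
import time


def format_metric(disk: str, status: str, timestamp_ms: int) -> str:
    """Build one exposition line for the hdparm_disk_power_status metric."""
    # Doubled braces are literal braces inside an f-string.
    return (
        f'hdparm_disk_power_status{{disk="{disk}",status="{status}"}} '
        f"1 {timestamp_ms}"
    )


# Reproduce the first line of the example response.
print(format_metric("/dev/sda", "standby", 1731616722000))

# With the current time, converted from seconds to milliseconds.
print(format_metric("/dev/sdd", "active/idle", int(time.time() * 1000)))
```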

## Install

The code fits in a single file, `main.py`, and can easily be distributed to
hosts manually:

```sh
curl https://raw.githubusercontent.com/gagath/prometheus-diskspin-exporter/refs/heads/main/src/prometheus_diskspin_exporter/main.py \
    -o /opt/prometheus_diskspin_exporter.py
chmod +x /opt/prometheus_diskspin_exporter.py
```

If you want to install the systemd service:

```sh
curl https://raw.githubusercontent.com/gagath/prometheus-diskspin-exporter/refs/heads/main/systemd/prometheus-diskspin-exporter.service \
    -o /etc/systemd/system/prometheus-diskspin-exporter.service
```

If you want to install the AppArmor configuration:

```sh
curl https://raw.githubusercontent.com/gagath/prometheus-diskspin-exporter/refs/heads/main/apparmor.d/opt.prometheus_diskspin_exporter \
    -o /etc/apparmor.d/opt.prometheus_diskspin_exporter
```

## Grafana configuration

The `hdparm_disk_power_status` metric must be queried as a range query with the
*Table* format.

![Grafana query configuration: choose the "Table" format](./docs/grafana_query.png)

You will need to apply a few transformations to the `hdparm_disk_power_status`
field in Grafana to obtain the *State timeline* output shown in the
introduction:

![Grafana transform data configuration: first grouping to matrix on disk
column, then convert field type Time\\disk to Time](./docs/granafa_transform.png)

Recommended value mappings:

![Grafana value mappings: Map the expected hdparm outputs to display appropriately.](./docs/grafana_value_mappings.png)

## Known limitations

The server replies to any URL, not only `/metrics`. Serving metrics is the only
thing the server does, so there is no point in parsing the request URL to
match it.
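
The behaviour described above can be illustrated with the standard library's
`http.server` module. This is only a sketch under assumptions: the names
`METRICS_BODY`, `build_response` and `AnyPathHandler` are hypothetical, the
body is hard-coded, and the real `main.py` may be structured differently.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hard-coded body for illustration; the real program builds it from hdparm.
METRICS_BODY = (
    b'hdparm_disk_power_status{disk="/dev/sda",status="standby"} 1 1731616722000\n'
)


def build_response(path: str) -> tuple[int, bytes]:
    """Return the same status and body whatever path was requested."""
    return 200, METRICS_BODY


class AnyPathHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is passed along but never used to pick a route.
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str, port: int) -> HTTPServer:
    """Create the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), AnyPathHandler)
```

With `make_server("localhost", 28101).serve_forever()`, a request to
`/metrics` and a request to `/anything-else` would receive identical responses.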

The server runs `lsblk` and `hdparm -C` on every request. An attacker who
floods the server with requests could therefore overload your machine (a
denial of service). Always make sure that only authorized users can reach this
HTTP server.
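
Restricting access remains the recommended protection. As a complementary idea
that this README does not describe as implemented, the result of the commands
could be cached for a few seconds so that a burst of requests triggers a single
run. The sketch below is hypothetical: `CachedCollector` and its parameters are
invented for illustration, and it is not thread-safe.

```python
import time
from typing import Callable, Optional


class CachedCollector:
    """Reuse the last collected text until it is older than ttl_seconds."""

    def __init__(
        self,
        collect: Callable[[], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._collect = collect      # expensive call, e.g. running hdparm
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Cache is empty or stale: run the expensive collection once.
            self._value = self._collect()
            self._expires_at = now + self._ttl
        return self._value
```

The trade-off is that a cached response can lag behind the real disk state by
up to `ttl_seconds`.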

## License

GPL-3.0-only.
