There is a recent regression, to be resolved in an upcoming release, which causes NBD connections to be opened with the non-optimal cfq I/O scheduler. In some circumstances cfq has been known to cause NBD timeouts and, in turn, degraded disks.
We can check which scheduler the existing devices on each HV are currently using:
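A minimal way to perform this check is to read the scheduler files directly from sysfs (the entry in brackets is the active scheduler). The guard skips the glob if no NBD devices are present:

```shell
# Show the active scheduler (in brackets) for every NBD device on this HV.
for d in /sys/block/nbd*/queue/scheduler; do
    [ -e "$d" ] && echo "$d: $(cat "$d")"
done
```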
This is correct (deadline is the active scheduler, shown in brackets):
noop anticipatory [deadline] cfq
This is wrong (cfq is active):
noop anticipatory deadline [cfq]
We can switch all NBD devices on a HV (and, optionally, the local sd* disks as well) to the deadline scheduler on the fly with the following:
for d in /sys/block/nbd*/queue/scheduler; do echo deadline > "$d"; done
for d in /sys/block/sd[a-z]/queue/scheduler; do echo deadline > "$d"; done
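Note that writes to sysfs do not survive a reboot, and devices created after the loop runs will still get the default scheduler. One common way to make the setting persistent is a udev rule; the file name and priority below are examples, not something mandated by the source:

```
# /etc/udev/rules.d/60-nbd-scheduler.rules (example path)
# Apply the deadline scheduler whenever an NBD device appears or changes.
ACTION=="add|change", KERNEL=="nbd[0-9]*", ATTR{queue/scheduler}="deadline"
```

After adding the rule, `udevadm control --reload` picks it up for subsequently created devices.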
We suggest applying this change as a precaution even if you are not currently seeing any of the issues described above.