Well… All three of them
Well… All three of them
I never said I didn’t use software RAID, I just wanted to add information about hardware RAID controllers. Maybe I’m blind, but I’ve never seen a good implementation of software RAID for the EFI partition or boot sector. During boot, most systems I’ve seen will try to always access one partition directly and a second in order, which is bypassing the concept of a RAID, so the two would need to be kept manually in sync during updates.
Because of that, there’s one notable place where I won’t - I always use hardware RAID for at minimum the boot disk because Dell firmware natively understands everything about it from a detect/boot/replace perspective. Or doesn’t see anything at all in a good way. All four of my primary servers have a boot disk on either a Startech RAID card similar to a Dell BOSS or have an array to boot off of directly on the PERC. It’s only enough space to store the core OS.
Other than that, at home all my other physical devices are hypervisors (VMware ESXi for now until I can plot a migration), dedicated appliance devices (Synology DSM uses mdadm), or don’t have a redundant disks (my firewall - backed up to git, and my NUC Proxmox box, both firewalls and the PVE are all running ZFS for features).
Three of my four ESXi servers run vSAN, which is like Ceph and replaces RAID. Like Ceph and ZFS, it requires using an HBA or passthrough disks for full performance. The last one is my standalone server. Notably, ESXi does not support any software RAID natively that isn’t vSAN, so both of the standalone server’s arrays are hardware RAID.
When it comes time to replace that Synology it’s going to be on TrueNAS
For recovering hardware RAID: most guaranteed success is going to be a compatible controller with a similar enough firmware version. You might be able to find software that can stitch images back together, but that’s a long shot and requires a ton of disk space (which you might not have if it’s your biggest server)
I’ve used dozens of LSI-based RAID controllers in Dell servers (of both PERC and LSI name brand) for both work and homelab, and they usually recover the old array to the new controller pretty well, and also generally have a much lower failure rate than the drives themselves (I find myself replacing the cache battery more often than the controller itself)
Only twice out of the handful of times I went to a RAID controller from a different generation
As others have pointed out, this is where backups come into play. If you have to replace the server with one from a different generation, you run the risk that the drives won’t import. At that point, you’d have to sanitize the super block of the array and re-initialize it as a new array, then restore from backup. Now, the array might be just fine and you never notice a difference (like my users that had to replace a failed R815 with an 820), but the result pattern is really to the extremes of work or fault with no in between.
Standalone RAID controllers are usually pretty resilient and fail less often than disks, but they are very much NOT infallible as you are correct to assess. The advantage to software systems like mdadm, ZFS, and Ceph is that it removed the precise hardware compatibility requirements, but by no means does it remove the software compatible requirements - you’ll still have to do your research and make sure the new version is compatible with the old format, or make sure it’s the same version.
All that’s said, I don’t trust embedded motherboard RAIDs to the same degree that I trust standalone controllers. A friend of mine about 8-10 years ago ran a RAID-0 on a laptop that got it’s super block borked when we tried to firmware update the SSDs - stopped detecting the array at all. We did manage to recover data, but it needed multiple times the raw amount of storage to do so.
Was not expecting a direct ref to Volo’s Guide. Love it.
There’s probably specific ticket queues and wiki/doc spaces for each support team.
Problem with an app? Send it to the internal dev/support team. Then if needed it gets routed.
Or (insert MMO of choice)