Intro
Almost a year ago I started a discussion about ZFS on YunoHost, so this is in a way a continuation of that thinking from my point of view; but since others finding this thread would get different value out of it, I decided to make it a separate topic.
If this topic comes to a good conclusion, I’d be happy to turn it into a Wiki or some other form of documentation later on. But I am far from a file system expert, and at the time of this first post I am a bit in panic mode.
Available hardware
TL;DR:
- 1 × M.2
- 4 × SATA
- 32 GB RAM
- capable amd64 CPU
The system I have is still the same as in the linked topic above (AMD Ryzen 5 4600G with 32 GB of RAM in a Mini ITX case), with one notable difference.
I accidentally managed to grind my system SSD (NVMe M.2) down in less than a year to the point where it is now failing me, which is why I’m now rather urgently considering a new disk/partition set-up.
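(For reference, the report below is smartctl output; the command would have been something along these lines, with the device path depending on your system:)

```
sudo smartctl -a /dev/nvme0
```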
```
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-29-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 990 PRO 1TB
Serial Number: S6Z1NJ0W732771H
Firmware Version: 3B2QJXD7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 2.0
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization: 988,697,219,072 [988 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 4731411541
Local Time is: Sun Nov 24 13:20:15 2024 CET
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0055): Comp DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x2f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg *Other*
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 9.39W - - 0 0 0 0 0 0
1 + 9.39W - - 1 1 1 1 0 200
2 + 9.39W - - 2 2 2 2 0 1000
3 - 0.0400W - - 3 3 3 3 2000 1200
4 - 0.0050W - - 4 4 4 4 500 9500
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
- NVM subsystem reliability has been degraded
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x04
Temperature: 61 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 106%
Data Units Read: 4,227,717 [2.16 TB]
Data Units Written: 3,845,028,947 [1.96 PB]
Host Read Commands: 96,669,116
Host Write Commands: 7,514,585,410
Controller Busy Time: 17,898
Power Cycles: 24
Power On Hours: 4,100
Unsafe Shutdowns: 20
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 61 Celsius
Temperature Sensor 2: 67 Celsius
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
```
In addition to that dying SSD on 1 × M.2 NVMe, I have 4 × SATA slots (6 according to dmesg, but the motherboard manual disagrees). There are also several PCIe slots available, so if needed I could buy an expansion card (e.g. for more M.2 NVMe). I am somewhat limited by the format of my Mini ITX case though.
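(For anyone wanting to check their own port count, the kernel’s view can be had with something like this; the grep pattern is just my guess at the relevant lines:)

```
# SATA links the kernel detected at boot
sudo dmesg | grep -i 'sata link'
# or count the ATA ports directly
ls /sys/class/ata_port/
```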
For the 4 × SATA slots, I (currently plan to) have:
- 2 × 1 TB HDD (WD Red 2.5") on SATA – these I already have, and although they are 5 and 7 years old, respectively, they are in much better condition than the NVMe SSD.
- planning to buy 2 × 4 TB HDD (1 × WD Red Plus, 1 × Seagate IronWolf) to put into the remaining SATA slots
- (just in case: there is also 1 × 500 GB SATA SSD (Crucial) lying around)
- (another M.2 NVMe SSD eventually)
Disk layout and file system
Idea A: 2 × SSD in Btrfs RAID1 + 2 × HDD in Btrfs RAID1
Put two SSDs into Btrfs RAID1 so that a dead or dying drive can be replaced easily:
- 1 × SSD on M.2 NVMe
- 1 × SSD on SATA
and then put the two HDDs into Btrfs RAID1 too, for the same reason:
- 2 × HDD on SATA
The idea here is that what needs to be fast would run on the SSD (pair), while what is fine to run on an HDD would be moved there.
In addition I would put my laptops’ (Borg) backups on those 2 × HDD too.
(Ideally, sometime down the line I would then also get a disk (pair) at a different location and send Btrfs snapshots there; but I’d need to think what makes sense to send via internet.)
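If I go this way, my understanding is that creating the two mirrors would look roughly like this (device names are made up, so check yours with lsblk first):

```
# SSD pair: mirror both data and metadata
sudo mkfs.btrfs -L fast -m raid1 -d raid1 /dev/nvme0n1 /dev/sda

# HDD pair: same profile
sudo mkfs.btrfs -L slow -m raid1 -d raid1 /dev/sdb /dev/sdc

# replacing a failed or failing member later:
# sudo btrfs replace start /dev/sda /dev/sdd /mnt/fast
```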
Idea B: (1 × SSD backed up with Btrfs snapshots on 1 × SSD/HDD) + 2 × HDD in Btrfs RAID1
This one is a bit different, as it relies on snapshot “backups” instead of RAID for quick recovery from a failed drive.
- put Btrfs on the M.2 NVMe SSD and mount that as /
- put Btrfs on the other SSD (or HDD) connected via SATA
- make regular (hourly?) Btrfs snapshots on the M.2 and send them to the SATA SSD (snapshots can’t leave their file system on their own, so in practice this means btrfs send/receive)
If at some point the M.2 SSD dies, simply mount the SATA SSD instead.
For the other 2 × HDD the idea is exactly the same as in Idea A: what is fine to run on an HDD goes there, along with my laptops’ (Borg) backups, and ideally, later on, Btrfs snapshots sent to a disk (pair) at a different location.
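A rough sketch of the hourly snapshot-and-send step (subvolume names and mount points are invented; a real setup would script the parent-snapshot bookkeeping):

```
# read-only snapshot of the root subvolume on the NVMe
sudo btrfs subvolume snapshot -r / /.snapshots/root-2024-11-24T13

# full send of that snapshot to the SATA SSD mounted at /mnt/mirror
sudo btrfs send /.snapshots/root-2024-11-24T13 | sudo btrfs receive /mnt/mirror/snapshots

# later runs only need to send the delta against a common parent snapshot:
# sudo btrfs send -p /.snapshots/root-OLD /.snapshots/root-NEW | sudo btrfs receive /mnt/mirror/snapshots
```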
Idea C: 1 × SSD + 3 × HDD in RAID10 (or 2 × SSD + 2 × HDD in RAID10)
Very simply, the idea is to put everything into a single Btrfs RAID10 and get the best of both worlds, as (hopefully) Btrfs would:
- first write to and read from the SSD, so that’s the speed benefit
- when the SSD fails, automatically fall back to the HDDs
I have a suspicion this is asking Btrfs a bit too much (as far as I understand, Btrfs RAID10 simply stripes and mirrors across all devices, allocating by free space, with no notion of a fast tier), but if it isn’t, this might be a cool solution.
I also suspect that this would still burn through the SSD just as fast.
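For completeness, creating the four-device pool would look roughly like this (device names invented again; note that Btrfs RAID10 needs at least four devices, which both variants satisfy):

```
sudo mkfs.btrfs -L pool -m raid10 -d raid10 /dev/nvme0n1 /dev/sda /dev/sdb /dev/sdc
```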
Sub-Idea Z: Same as something above, but with hardware RAID
According to the manual, my motherboard supports (hardware) RAID0, RAID1 and RAID10, which I can set up through the BIOS.
On one hand that sounds very easy to set up; on the other, I am wary of how things work in hardware-RAID land: how do I deal with a dead drive, and what happens when I eventually, but inevitably, have to move these drives to a new motherboard? From what I gather, motherboard RAID is usually firmware (“fake”) RAID, so the on-disk format is tied to that vendor’s controller and driver, which is exactly the portability problem I’m worried about.
SSD or HDD as the main drive?
With the physical drives out of the way, the question is how exactly the mount points should be divided between these drives.
YunoHost already has great documentation as a start, but I’d like to discuss this further here (and eventually update the documentation if applicable).
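To ground that division in actual numbers, I have been looking at which directories are big and which processes write a lot; something along these lines (the listed paths are just the usual suspects):

```
# size of the candidate directories
sudo du -xsh /opt /var/www /var/lib/postgresql /var/lib/mysql /home/yunohost.backup

# rough live view of which processes write the most (accumulated, only active ones)
sudo iotop -aoP
```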
Option 1: SSD as main drive
The idea here is simply to have the SSD (pair) as the main drive, mounted to /, and put on the HDD (pair) only the mount points that would hurt the SSD.
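As a minimal sketch of what that could look like in /etc/fstab (the UUIDs are placeholders, and the choice of /home/yunohost.backup for the HDD pair is just my assumption of an obvious write-heavy candidate, since that is where YunoHost keeps its backups):

```
# SSD pair (Btrfs RAID1) as the main drive
UUID=aaaaaaaa-placeholder  /                      btrfs  defaults,noatime  0  0
# HDD pair (Btrfs RAID1) for bulky, write-heavy paths
UUID=bbbbbbbb-placeholder  /home/yunohost.backup  btrfs  defaults,noatime  0  0
```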
Option 2: HDD as main drive
Essentially similar, but approached from the other direction.
Put everything on the HDD (pair), mounted to /, and put on the SSD (pair) only the mount points that would specifically benefit from being on a faster drive.
(Silly idea: mount overlaying)
I don’t know if this is a thing, but since you can (accidentally) “overlay a mount”, I thought maybe that would be something interesting to take advantage of.
Here’s what I mean:
- have everything on an HDD, mounted to /
- copy parts of it (e.g. stuff that needs to go fast) to the SSD, and mount those at the relevant mount points – e.g. /opt, /var/www, /var/lib/postgresql, /var/lib/mysql, …
- “back up” (e.g. with Btrfs snapshots or similar) the things on the SSD to the HDD regularly
- if the SSD dies, just unmount/unplug it, as everything is already on the HDD too anyway
I have a strong hunch this is a stupid idea, but in case it isn’t, I’d love to hear more about it.
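To make the discussion concrete, here is roughly what I imagine, as mount commands plus an hourly sync. Everything in it is hypothetical (device names, paths, the rsync step), and databases would need to be stopped or snapshotted for the copy to be consistent:

```
# / itself lives on the HDD (mounted at boot); the SSD shadows the hot directories
sudo mount /dev/nvme0n1p1 /var/lib/postgresql   # hides the HDD copy underneath

# hourly "backup": copy the SSD contents down to the shadowed HDD copy
sudo systemctl stop postgresql                  # quiesce the database first
sudo mkdir -p /mnt/bare
sudo mount --bind / /mnt/bare                   # a bind mount of / exposes the hidden directory
sudo rsync -a --delete /var/lib/postgresql/ /mnt/bare/var/lib/postgresql/
sudo umount /mnt/bare
sudo systemctl start postgresql
```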