Urgent YunoHost Server Migration: Replacing a Failing NVMe SSD

lordmat · May 21, 2025, 11:40am

What type of hardware are you using: Old laptop or computer
What YunoHost version are you running: 12.1.6.1
How are you able to access your server: The webadmin, SSH, direct access via physical keyboard/screen

Describe your issue

Hey everyone,

My YunoHost server has been experiencing issues, which I’ve traced back to a failing NVMe SSD (/dev/nvme0). My logs show a continuous increase in smartd error log entries, indicating a critical disk health issue. I’ve already confirmed this with smartctl output, which shows a very high number of Error Information Log Entries.

I’ve decided to replace the failing drive with a new Samsung 970 EVO Plus 1 TB NVMe M.2 SSD (PCIe 3.0), which I plan to install next week. This new SSD should provide excellent performance and reliability.

Now, my main concern is how to safely and efficiently migrate my existing YunoHost system to the new drive. Given that the current SSD is in a critical state, a smooth migration with minimal data loss is paramount.

My key questions are:

1. What is the recommended best practice for backing up a running YunoHost instance when the underlying drive is failing? Should I use YunoHost’s built-in backup system, or are there more robust low-level disk imaging tools that might be better suited for a failing drive?
2. What is the step-by-step process for migrating a YunoHost installation to a new, larger drive? Are there specific YunoHost procedures or best practices for this scenario (e.g., re-installing YunoHost first, then restoring, or cloning the old disk)?

I’m looking for a concrete path, ideally with specific commands or YunoHost-centric guidance, to ensure a successful transition.

My Proposed (High-Level) Plan:

1. Back up YunoHost: Perform a full system backup using YunoHost’s built-in tools. I’d ideally like to know if there are any specific options or considerations when backing up from a failing drive.
2. Physical Replacement: Shut down the server, replace the old /dev/nvme0 with the new Samsung 970 EVO Plus 1TB.
3. OS Installation: Install a fresh copy of Debian (the base OS for YunoHost) on the new 1TB SSD.
4. YunoHost Installation: Install YunoHost on the fresh Debian system.
5. Restore Backup: Restore the YunoHost backup onto the newly installed YunoHost instance.

Specific Concerns:

- Integrity of the backup: How to ensure the backup itself isn’t corrupted by the failing drive?
- Disk partitioning: Are there recommended partitioning schemes for YunoHost on a new, larger NVMe drive?
- Downtime: I expect some downtime, but would like to minimize it.

Thanks in advance for your help!

Share relevant logs or error messages

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-35-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: LENSE30512GMSP34MEAT3TA
Serial Number: FBFB180914R0007305
Firmware Version: 2.5.0412
PCI Vendor/Subsystem ID: 0x17aa
IEEE OUI Identifier: 0xa03299
Controller ID: 1
NVMe Version: 1.2
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512.110.190.592 [512 GB]
Namespace 1 Utilization: 0
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: a03299 b6f8165000
Local Time is: Wed May 21 13:16:17 2025 CEST
Firmware Updates (0x02): 1 Slot
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0016): Wr_Unc DS_Mngmt Sav/Sel_Feat
Maximum Data Transfer Size: 32 Pages
Warning Comp. Temp. Threshold: 70 Celsius
Critical Comp. Temp. Threshold: 80 Celsius

Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.50W - - 0 0 0 0 0 0
1 + 4.60W - - 1 1 1 1 5 5
2 + 3.90W - - 2 2 2 2 5 5
3 - 0.1000W - - 3 3 3 3 60 1000
4 - 0.0100W - - 4 4 4 4 30000 1500

Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 - 512 0 0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 33 Celsius
Available Spare: 100%
Available Spare Threshold: 3%
Percentage Used: 37%
Data Units Read: 60.682.988 [31,0 TB]
Data Units Written: 105.184.538 [53,8 TB]
Host Read Commands: 901.632.237
Host Write Commands: 2.462.369.213
Controller Busy Time: 18.529
Power Cycles: 1.198
Power On Hours: 18.255
Unsafe Shutdowns: 180
Media and Data Integrity Errors: 0
Error Information Log Entries: 13.618
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 41 Celsius
Temperature Sensor 2: 33 Celsius

Error Information (NVMe Log 0x01, 4 of 4 entries)
Num ErrCount SQId CmdId Status PELoc LBA NSID VS
0 13618 0 0xb013 0xc005 0x000 0 0 -
1 13617 0 0x9010 0xc004 0x000 0 0 -
2 13616 0 0xa01b 0xc004 0x000 0 0 -
3 13615 0 0xa01a 0xc005 0x000 0 0 -
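
For anyone who wants to track these counters over time, here is a minimal sketch that pulls the health fields shown above out of `smartctl -a` output. The function name `nvme_health_summary` and the output key names are made up for illustration. It only extracts the values and leaves their interpretation to the reader.

```shell
#!/bin/sh
# Extract a few NVMe health fields from `smartctl -a` output read on stdin.
# Reading stdin means it works on a live drive or on a saved log file.
# The key names on the left of "=" are arbitrary labels, not smartctl terms.
nvme_health_summary() {
    # Split each line on the colon plus any following spaces.
    awk -F': *' '
        /^Critical Warning:/                { print "critical_warning=" $2 }
        /^Available Spare:/                 { print "available_spare=" $2 }
        /^Percentage Used:/                 { print "percentage_used=" $2 }
        /^Media and Data Integrity Errors:/ { print "media_errors=" $2 }
        /^Error Information Log Entries:/   { print "error_log_entries=" $2 }
    '
}
```

Typical use would be `smartctl -a /dev/nvme0 | nvme_health_summary` as root, or `nvme_health_summary < saved_smartctl.log` on a saved copy (the file name is only an example).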

jarod5001 · May 21, 2025, 9:06pm

Your plan is correct.
However, try to mount /home/yunohost.backup on an external drive so your backup is not at risk of corruption, and launch the backup as soon as possible: forcing a failing drive to keep working increases the failures and may leave the device unusable. Then install the new drive, install YunoHost, and do not run the post-install. Mount the backup drive, or copy the backup to the /home/yunohost.backup folder, and run a restore from the command line. That’s it.
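
The sequence described above could be sketched roughly as follows. This is an illustration under assumptions, not an official procedure: the helper names (`run`, `backup_to_external`, `restore_from_external`) and the `DRY_RUN` switch are invented here, and the partition and archive name are placeholders you pass in yourself. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 and run as
# root to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the OLD system: mount the external partition over the backup directory,
# then create a full backup so the archive lands on the external drive.
# $1 = external partition (placeholder, e.g. /dev/sdX1)
backup_to_external() {
    run mkdir -p /home/yunohost.backup &&
    run mount "$1" /home/yunohost.backup &&
    run yunohost backup create &&
    run yunohost backup list &&
    run umount /home/yunohost.backup
}

# On the NEW system: YunoHost installed, post-install NOT run.
# $1 = external partition, $2 = archive name as printed by `yunohost backup list`
restore_from_external() {
    run mkdir -p /home/yunohost.backup &&
    run mount "$1" /home/yunohost.backup &&
    run yunohost backup restore "$2"
}
```

Calling `backup_to_external /dev/sdX1` with the default `DRY_RUN=1` prints the commands for review. Check the device name with `lsblk` before setting `DRY_RUN=0`.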
SSDs tend to fail more abruptly than HDDs: my SSD stopped responding hours after SMART started reporting problems.
Also, make regular backups to different locations, preferably on HDDs.
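
When copying an archive to a second location, one way to check that the copy is faithful is to compare checksums and confirm the archive can still be listed. The sketch below assumes the archive is a tar file and that GNU coreutils `sha256sum` is available. The function name `copy_and_verify` is hypothetical. Note that matching checksums only show the copy equals what was read from the source, not that the source itself was healthy.

```shell
#!/bin/sh
# Copy a backup archive to another directory and verify the copy.
# $1 = path to the archive, $2 = destination directory (e.g. a mounted HDD)
copy_and_verify() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/$(basename "$src")"

    cp -- "$src" "$dest" || return 1
    sync  # ask the kernel to flush pending writes to the destination

    # Compare SHA-256 digests of source and copy.
    src_sum=$(sha256sum < "$src") || return 1
    dest_sum=$(sha256sum < "$dest") || return 1
    if [ "$src_sum" != "$dest_sum" ]; then
        echo "MISMATCH: $dest differs from $src" >&2
        return 1
    fi

    # Make sure the copied archive can be read end to end as a tar file.
    if ! tar -tf "$dest" > /dev/null; then
        echo "UNREADABLE: $dest is not a valid tar archive" >&2
        return 1
    fi

    echo "OK: $dest matches $src"
}
```

For example, `copy_and_verify /home/yunohost.backup/archives/NAME.tar /mnt/hdd`, where `NAME` and `/mnt/hdd` are placeholders for your own archive and mount point.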
