Btrfs fix: partition not recognized after power loss

Problem Background

An unexpected power outage during a write operation left my Btrfs partition unrecognizable, and the system could no longer mount it. Strangely, fdisk -l reported a completely wrong partition size, while parted could still identify the partition as Btrfs. It was a nerve-wracking data recovery experience, so here is my complete record of the repair process for future reference.

Repair Methods Attempted

Initial Diagnosis

Before starting repairs, I first confirmed the symptoms. My working assumption was that the partition table was corrupted, so the system was seeing the wrong partition boundaries: fdisk -l /dev/sda1 reported 907.56 MiB instead of the actual 10.9 TiB, while parted, which probes the filesystem signature itself, still identified the partition type correctly.

Additionally, ls --color /dev/sd* listed /dev/sda and /dev/sda1, but unlike the usual colored display, /dev/sda1 was not shown in color. After some research, I concluded that this was a characteristic of an unformatted partition (ls picks its colors by file type, so strictly speaking the missing color means the entry was not being reported as a normal block device). At this point I couldn't rashly reformat it, since even the partition size was wrong. While searching, I also found reports from users who had recovered data after accidentally deleting partitions by recreating them.
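
Since ls --color chooses colors by file type, one quick way to double-check what actually sits at a /dev path is to test the file type directly. The following is a minimal sketch only; node_type is a hypothetical helper name and was not part of the original session.

```shell
# Hypothetical helper: print what kind of object a path is.
# A healthy partition node such as /dev/sda1 should be a block device.
node_type() {
    if [ -b "$1" ]; then
        echo "block device"
    elif [ -f "$1" ]; then
        echo "regular file"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}
```

Running node_type /dev/sda1 on a healthy system should print "block device"; any other answer suggests the device node itself needs a closer look before touching the filesystem.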

Since the partition couldn't even be recognized, I was quite frustrated. After some searching, I found that testdisk can repair partition tables, so I ran it and it successfully found the partition. After confirming that the partition information looked roughly correct, I wrote it to disk. I forgot to take screenshots of this step.

Using `btrfs check` for Initial Inspection

After testdisk repaired the partition table, I checked the filesystem status with btrfs check /dev/sda1. Note that this process takes a long time:

USER@DietPi:~# btrfs check /dev/sda1
Opening filesystem to check...
Checking filesystem on /dev/sda1
UUID: aaaa-bbbf7-4dd37f-b54a-aaaaaaaaa
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
[4/8] checking free space tree
[5/8] checking fs roots
[6/8] checking only csums items (without verifying data)
[7/8] checking root refs
[8/8] checking quota groups skipped (not enabled on this FS)
found 3720699846656 bytes used, no error found
total csum bytes: 3627674452
total tree bytes: 4493639680
total fs tree bytes: 113491968
total extent tree bytes: 62062592
btree space waste bytes: 666159899
file data blocks allocated: 94225876987904
 referenced 3714605076480

Warning About the Dangerous `btrfs check --repair` Command

I then ran btrfs check --repair /dev/sda1, which immediately printed a serious warning that the command is dangerous. A quick search confirmed that btrfs check --repair is indeed extremely risky and can cause further data corruption. By then the repair had already been running for a while, so I interrupted it with Ctrl-C and decided not to use this command.

USER@DietPi:~# btrfs check --repair /dev/sda1
enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. E.g.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda1
UUID: a023795c-b8f7-437f-b54a-ea2f19c79391
[1/8] checking log skipped (none written)
[2/8] checking root items
Fixed 0 roots.
[3/8] checking extents
super bytes used 3720700157952 mismatches actual used 3720699846656
^C

Trying the `btrfs rescue` Suite of Commands

Next, I tried several btrfs rescue subcommands. super-recover and fix-device-size found nothing to fix, and chunk-recover took so long that I interrupted it both times:

USER@DietPi:~# btrfs rescue super-recover /dev/sda1
All supers are valid, no need to recover
USER@DietPi:~# btrfs rescue chunk-recover /dev/sda1
Scanning: 46471950336 in dev0^C
USER@DietPi:~# btrfs rescue fix-device-size /dev/sda1
No device size related problem found
USER@DietPi:~# btrfs rescue chunk-recover /dev/sda1
Scanning: 6281363456 in dev0^C
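
To get a feel for why chunk-recover seemed endless, the scan offset can be compared with the partition size. This is a rough sketch that assumes the number after "Scanning:" is a byte offset into the device; scan_progress is a hypothetical helper, and the partition size is the 12000136527872 bytes that fdisk reports for /dev/sda1 after the repair.

```shell
# Hypothetical helper: percentage of a device covered by a scan offset.
# Usage: scan_progress <offset_bytes> <device_size_bytes>
scan_progress() {
    awk -v off="$1" -v total="$2" 'BEGIN { printf "%.2f%%\n", off / total * 100 }'
}

# Offset from the first interrupted run, partition size from fdisk.
scan_progress 46471950336 12000136527872
```

Under that assumption the first run had covered about 0.39% of the partition when it was interrupted.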

Using `btrfs-find-root` to Explore Root Nodes

I also ran btrfs-find-root to list the tree root candidates of the Btrfs filesystem:

USER@DietPi:~# btrfs-find-root /dev/sda1
Superblock thinks the generation is 6205
Superblock thinks the level is 0
Found tree root at 3784970452992 gen 6205 level 0
Well block 3784970338304(gen: 6204 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 2994557911040(gen: 6131 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 2994302287872(gen: 6130 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 2994024235008(gen: 6129 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 2993854611456(gen: 6128 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 1865050963968(gen: 6127 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
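
The "Well block" lines are older tree root candidates. If one of them is ever needed (for example, btrfs restore accepts a tree root byte number via -t), it helps to have them as a sorted list. The sketch below is an illustration, not something I ran at the time; list_candidate_roots is a hypothetical helper that parses the output format shown above.

```shell
# Hypothetical helper: read btrfs-find-root output on stdin and print
# "generation bytenr" pairs for the older candidate roots, newest first.
list_candidate_roots() {
    sed -n 's/^Well block \([0-9][0-9]*\)(gen: \([0-9][0-9]*\) level: [0-9][0-9]*).*/\2 \1/p' |
        sort -rn
}

# Two lines copied from the output above, deliberately out of order.
list_candidate_roots <<'EOF'
Well block 2994557911040(gen: 6131 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
Well block 3784970338304(gen: 6204 level: 0) seems good, but generation/level doesn't match, want gen: 6205 level: 0
EOF
```

In real use the helper would be fed directly, as in btrfs-find-root /dev/sda1 | list_candidate_roots.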

Final Solution: Using `btrfsck`

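After all of the attempts above, I ran btrfsck, which finally completed and reported no errors. btrfsck is the legacy name for btrfs check, and without --repair it only inspects the filesystem, so this run confirmed that the filesystem was consistent rather than changing anything. Note that this process takes a very long time: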

USER@DietPi:~# btrfsck /dev/sda1
Opening filesystem to check...
Checking filesystem on /dev/sda1
UUID: aaaa-bbbf7-4dd37f-b54a-aaaaaaaaa
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
[4/8] checking free space tree
[5/8] checking fs roots
[6/8] checking only csums items (without verifying data)
[7/8] checking root refs
[8/8] checking quota groups skipped (not enabled on this FS)
found 3720700157952 bytes used, no error found
total csum bytes: 3627674452
total tree bytes: 4493950976
total fs tree bytes: 113491968
total extent tree bytes: 62078976
btree space waste bytes: 666243990
file data blocks allocated: 94225876987904
 referenced 3714605076480
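
The two check runs report slightly different "bytes used" totals, and the second one matches the "super bytes used" figure printed during the interrupted --repair run. A small sketch for extracting and comparing the two figures follows; bytes_used is a hypothetical helper, and the sample lines are copied from the outputs in this post.

```shell
# Hypothetical helper: pull the "found N bytes used" figure out of
# `btrfs check` output supplied on stdin.
bytes_used() {
    sed -n 's/^found \([0-9][0-9]*\) bytes used.*/\1/p'
}

first=$(echo "found 3720699846656 bytes used, no error found" | bytes_used)
second=$(echo "found 3720700157952 bytes used, no error found" | bytes_used)
echo "difference: $((second - first)) bytes"
```

The difference is 311296 bytes. I did not investigate why the total changed, so treat this only as a way to compare runs.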

Pre-Repair Phenomenon

Before the repair, the most obvious symptom was that fdisk -l /dev/sda1 reported the wrong size, while fdisk -l /dev/sda and parted still showed the partition correctly, which helped me narrow down the issue:

USER@DietPi:~# fdisk -l /dev/sda
Disk /dev/sda: 10.91 TiB, 12000138625024 bytes, 23437770752 sectors
Disk model: 8618sus3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 48aaaa5C-61EE-4Aa14-8b7FA-2DCdddd158FA

Device     Start         End     Sectors  Size Type
/dev/sda1   2048 23437768703 23437766656 10.9T Linux filesystem
USER@DietPi:~# fdisk -l /dev/sda1
Disk /dev/sda1: 907.56 MiB, 951648256 bytes, 1858688 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
USER@DietPi:~# parted /dev/sda
GNU Parted 3.6
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: orico 8618sus3 (scsi)
Disk /dev/sda: 12.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  12.0TB  12.0TB  btrfs        primary

As you can see, fdisk -l /dev/sda1 showed a partition size of only 907.56 MiB, far smaller than the 10.9T listed in the partition table, while parted showed 12.0TB, which is the same size expressed in decimal units. At the time I took this as further confirmation of the partition table damage hypothesis.
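
fdisk prints binary units (MiB, TiB) and parted prints decimal ones (TB), which makes the numbers harder to compare at a glance. The sketch below converts a byte count into both; human_size is a hypothetical helper, and the byte counts are taken from the fdisk output above.

```shell
# Hypothetical helper: show a byte count in both binary and decimal units.
human_size() {
    awk -v b="$1" 'BEGIN {
        printf "%.2f MiB | %.2f TiB | %.1f TB\n", b / 2^20, b / 2^40, b / 1e12
    }'
}

human_size 12000138625024   # whole disk, from fdisk -l /dev/sda
human_size 951648256        # the bogus size fdisk showed for /dev/sda1
```

The first call shows that 10.91 TiB and 12.0 TB describe the same disk, and the second reproduces the 907.56 MiB figure.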

Post-Repair Verification

After the repair operations, fdisk reported the correct size for both the disk and the partition:

USER@DietPi:/tmp# fdisk -l /dev/sda
Disk /dev/sda: 10.91 TiB, 12000138625024 bytes, 23437770752 sectors
Disk model: 8618sus3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 48aaaa5C-61EE-4Aa14-8b7FA-2DCdddd158FA

Device     Start         End     Sectors  Size Type
/dev/sda1   2048 23437768703 23437766656 10.9T Linux filesystem
USER@DietPi:/tmp# fdisk -l /dev/sda1
Disk /dev/sda1: 10.91 TiB, 12000136527872 bytes, 23437766656 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
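
The same verification can be expressed as a small check: the sector count in the partition table, multiplied by the logical sector size, should equal the byte size reported for the partition. This is a sketch under the assumption of 512-byte logical sectors (as in the fdisk output); size_matches is a hypothetical helper.

```shell
# Hypothetical helper: does a partition's reported byte size match the
# sector count in the partition table? Assumes 512-byte logical sectors.
# Usage: size_matches <sectors_from_table> <bytes_reported>
size_matches() {
    [ "$(( $1 * 512 ))" = "$2" ]
}

if size_matches 23437766656 12000136527872; then
    echo "partition size is consistent with the table"
fi
```

With the pre-repair value of 951648256 bytes the check fails. On a live system the byte size could be read with blockdev --getsize64 /dev/sda1 instead of being typed in.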

Repair Insights

This repair gave me a deeper understanding of Btrfs, and the filesystem damaged by the unexpected power loss is usable again. The whole process was tense but rewarding.

The key steps in the repair process were:

First, I used testdisk to repair the partition table, which resolved the unrecognized-partition problem
Then I ran btrfsck to check the filesystem, which reported no errors
Finally, the partition size was displayed correctly again and the partition could be mounted normally

Important Lessons Learned

This experience taught me several important lessons:

Data Backup: Before any filesystem repair operation, back up important data first if at all possible
Avoid Dangerous Commands: As mentioned above, btrfs check --repair is very dangerous and should be avoided as much as possible
Tool Selection: For Btrfs filesystems, prefer the officially recommended rescue tools, such as the btrfs rescue subcommands
Verify Results: After a repair, be sure to verify both the partition size and the filesystem integrity
Preventive Measures: To reduce the chance of similar issues, consider a UPS to protect against unexpected power outages

Tool Usage Experience

I used several tools during this repair, each with its own role:

testdisk: A powerful partition table repair tool that supports multiple partition formats; very useful when a partition table is damaged
btrfsck: The legacy name for btrfs check; inspects a Btrfs filesystem for errors and only attempts repairs when run with --repair
parted: A partition editor; in this case it still reported the partition's size and filesystem type correctly when fdisk -l /dev/sda1 did not
btrfs rescue: The dedicated Btrfs rescue toolkit, containing several specialized repair subcommands

Root Cause Analysis

Looking back, the most likely root cause was metadata left inconsistent by the unexpected power outage. Btrfs is a Copy-on-Write (COW) filesystem, and every data write also updates metadata. If power is lost partway through, metadata such as superblocks, chunk information, or checksums can end up inconsistent, which in turn affects the accessibility of the entire filesystem. This is also a reminder to pay attention to power stability when handling important data.

mzhboy/btrfs fix after power loss.md
