btrfs fi show -d
(/dev/mapper/vg1000-lv)
syno_poweroff_task -d
(or: umount /volume1)
(or: umount -f -k /volume1)
Check that everything is unmounted:
df -h
mdadm --stop /dev/vg1000/lv
btrfsck /dev/vg1000/lv
btrfs check --repair /dev/vg1000/lv
btrfs rescue super-recover -v /dev/vg1000/lv
vgchange -ay
e2fsck -nvf -C 0 /dev/vg1000/lv
fsck.ext4 -pvf -C 0 /dev/vg1000/lv
(or: e2fsck -pvf -C 0 /dev/vg1000/lv)
(do not use: -C fd)
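Before running any of the destructive commands above, it is worth confirming that nothing is still mounted. A minimal sketch, assuming /volume1 is the usual Synology mount point (substitute your own path):

```shell
# Verify a path no longer appears in the kernel mount table before
# running repair tools. /volume1 is the typical Synology mount point
# (an assumption here; adjust for your system).
is_unmounted() {
  # /proc/mounts fields are space-separated, so pad with spaces to
  # avoid matching /volume10 when checking /volume1
  if grep -q " $1 " /proc/mounts; then
    echo "still mounted"
    return 1
  else
    echo "not mounted"
    return 0
  fi
}

is_unmounted /volume1   # prints "not mounted" once the umount succeeded
```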
Hey, I cannot do the btrfs repair because of this:
root@Mediaserver:~# btrfs check --clear-space-cache v2 /dev/vg2/volume_2
Syno caseless feature on.
Clear free space cache v2
parent transid verify failed on 1456623927296 wanted 43243 found 41147
parent transid verify failed on 1456623927296 wanted 43243 found 41147
parent transid verify failed on 1456623927296 wanted 43243 found 41147
parent transid verify failed on 1456623927296 wanted 43243 found 41147
Ignoring transid failure
parent transid verify failed on 2113744257024 wanted 41691 found 38109
parent transid verify failed on 2113744257024 wanted 41691 found 38109
parent transid verify failed on 3351185145856 wanted 42474 found 10785
parent transid verify failed on 3351185145856 wanted 42474 found 10785
leaf parent key incorrect 1456561242112
ERROR: failed to clear free space cache v2: -1
btrfs: transaction.h:41: btrfs_start_transaction: Assertion `!(root->commit_root)' failed.
Aborted (core dumped)
What should I do? Can you help me?
If btrfs check --repair errors out with "couldn't open RDWR because of unsupported option features", you can try clearing the space cache with btrfs check --clear-space-cache v2 or btrfs check --clear-space-cache v1, then retry btrfs check --repair. YMMV. If you break it, you own the pieces, and by the time you need --repair, chances are it is broken beyond repair already :-)
👍 ah!!!!! thanks a lot !
This is an excellent gist. Thank you!
syno_poweroff_task -d
is replaced by this in DSM 7:
sudo synostgvolume --unmount -p volume1
But that shuts down ssh so… working on that.
Hi everyone!
I found this interesting guide for EXT4.
This part could be useful for unmounting:
synosetkeyvalue /etc/synoinfo.conf disable_volumes volume1
Just a quick guide for most Synology units running DSM 7.x with a single btrfs volume; run the commands below:

# forcefully unmount the read-only partition
umount -f -k /volume1
# clear the btrfs space cache so the repair can run
btrfs check --clear-space-cache v2 /dev/mapper/cachedev_0
# run the repair and wait for it to complete
btrfs check --repair /dev/mapper/cachedev_0
# reboot the machine; it should be back in about 2 minutes
reboot

After it has rebooted, immediately run a data scrub on the volume!
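For the cautious, the sequence above can first be exercised as a dry run that only echoes what it would do. DEVICE and VOLUME below are assumed values; check `df -h` and `btrfs fi show` for the actual paths on your box:

```shell
# Dry-run sketch of the repair sequence above. Every command is echoed
# rather than executed; remove the `run` wrapper only once you have a
# backup and accept the risk. DEVICE/VOLUME are assumptions.
DEVICE=/dev/mapper/cachedev_0
VOLUME=/volume1

run() { echo "would run: $*"; }

run umount -f -k "$VOLUME"
run btrfs check --clear-space-cache v2 "$DEVICE"
run btrfs check --repair "$DEVICE"
run reboot
```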
I've been battling this issue for literally weeks. Clearing the space caches and doing a repair fixed my issue with no data loss!
Thank you SO much!
Greetings from 2024 with a DS1819+ running DSM 7.2.1-69057 Update 3.
Had some corrupted leaves and managed to get btrfs check /dev/mapper/cachedev_0 running by unmounting via synostgvolume --unmount -p /volume1, as mentioned in this xpenology thread and above. It did not shut down SSH for me.
Additionally, I couldn't run btrfs on /dev/vg1000/lv or /dev/md2; only /dev/mapper/cachedev_0 worked properly, for unknown reasons. Good luck to you all.
Hello guys,
Hope you can help with a problem I am having on my NAS. I have the feeling it is related to this gist's theme.
TL;DR: when I try to do a manual data scrubbing, after several hours it aborts. I don't know the reason or how to solve it!!
First, a little bit of context. I am running xpenology with DSM 7.2.2 (latest version) on the SA6400 platform (i5-12400, ASRock H610M-ITX/ac board, 32 GB DDR4) with Arc Loader 2.4.2. I have a RAID 6 of 8 × 8 TB drives at 62% capacity. I have been running xpenology for many years with no problems, starting from a RAID 5 with 5 × 8 TB; I have had to replace faulty drives several times, rebuild the RAID, etc. Always successfully.
Now, when I try to do a manual data scrubbing, after several hours it aborts.
The message in Notifications is:
The system was unable to run data scrubbing on Storage Pool 1. Please go to Storage Manager and check if the volumes belonging to this storage pool are in a healthy status.
But the volume health status is healthy!! No errors whatsoever. I ran SMART tests (quick): healthy status. I even have 3 IronWolf disks and ran the IronWolf tests with no errors either, all of them showing a healthy condition.
In Notifications, the system even indicated:
Files with checksum mismatch have been detected on a volume. Please go to Log Center and check the file paths of the files with errors and try to restore the files with backed up files.
This happened while performing the data scrubbing; 2 files had errors: one was a metadata file of a database inside a Plex Docker container, and the other was an old video file.
As there was no other apparent reason why the data scrubbing aborted, I typed these commands over SSH:
> btrfs scrub status -d /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Wed May 28 21:02:50 2025 and was aborted after 03:50:45
total bytes scrubbed: 13.32TiB with 2 errors
error details: csum=2
corrected errors: 0, uncorrectable errors: 2, unverified errors: 0
> btrfs scrub status -d -R /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Wed May 28 21:02:50 2025 and was aborted after 03:50:45
data_extents_scrubbed: 223376488
tree_extents_scrubbed: 3407534
data_bytes_scrubbed: 14586949533696
tree_bytes_scrubbed: 55829037056
read_errors: 0
csum_errors: 2
verify_errors: 0
no_csum: 2449
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 2
unverified_errors: 0
corrected_errors: 0
last_physical: 15662894481408
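For anyone squinting at that wall of counters, the interesting error fields can be pulled out with a small awk sketch. A here-doc stands in for the real output below; on the NAS you would pipe `btrfs scrub status -d -R /volume1` into the same function:

```shell
# Extract the error counters from `btrfs scrub status -R` style output.
# The here-doc below is sample data standing in for the real command.
scrub_errors() {
  awk -F': ' '/read_errors|csum_errors|uncorrectable_errors/ {
    gsub(/^[ \t]+/, "", $1)   # strip the indentation from the key
    print $1 "=" $2
  }'
}

scrub_errors <<'EOF'
	read_errors: 0
	csum_errors: 2
	verify_errors: 0
	no_csum: 2449
	uncorrectable_errors: 2
EOF
# prints: read_errors=0, csum_errors=2, uncorrectable_errors=2 (one per line)
```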
It looks like it aborted after almost 4 hours and 13.32 TiB of scrubbing (of a total of 25.8 TiB used on the volume).
Given the checksum errors, I ran a memtest. I have 2 × 16 GB of DDR4 memory, and it found errors. I removed one of the sticks, kept the other, and ran memtest again; it didn't error out, so I now have just 16 GB of RAM, but allegedly with no errors.
Then I removed the 2 files that were corrupted (I don't care about them), just in case the scrubbing was aborting because of them, as a kind reddit user suggested could be the case.
And I ran data scrubbing again, getting exactly the same message in Notifications (DSM is so bad at showing the cause of anything). Now there are no messages at all about any checksum mismatch.
The results of the commands are pretty similar:
> btrfs scrub status -d /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Thu May 29 02:41:33 2025 and was aborted after 03:50:40
total bytes scrubbed: 13.32TiB with 1 errors
error details: csum=1
corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
> btrfs scrub status -d -R /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Thu May 29 02:41:33 2025 and was aborted after 03:50:40
data_extents_scrubbed: 223374923
tree_extents_scrubbed: 3407378
data_bytes_scrubbed: 14586854449152
tree_bytes_scrubbed: 55826481152
read_errors: 0
csum_errors: 1
verify_errors: 0
no_csum: 2449
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 1
unverified_errors: 0
corrected_errors: 0
last_physical: 15662894481408
Before, it ran for 3:50:45, and now 3:50:40, which is quite similar: almost 4 hours.
Now it reports 1 error, despite the fact that I deleted the 2 files, and it is not reporting any file checksum error in Notifications or the Log Center.
I have no clue why it is aborting. I would expect the data scrubbing process to finish the whole volume and report any problematic files, if there are any.
I am very concerned because, in the case of a hard drive failure, rebuilding the RAID 6 (I have two-drive tolerance) performs a data scrub, and if I cannot run the scrub, I will lose the data.
Curiously, the system is otherwise working flawlessly. I was not having any problems, except for this data scrubbing not working right now.
I have to leave home until next week and will not be able to perform more tests for a week, but I just wanted to share this ASAP and try to get this working again, as I am freaking out, to be honest.
Thanks guys in advance.
Hi, delete the 2 metadata files in the Plex directory and try again; it should work 👍. When it's finished, update the Plex library.
Jeez, replying to my message via email made your response pretty hard to read in the thread. Please read and reply in the gist instead; doing it from email does not work properly.
Coming back to your response: I think you haven't read my post properly. See what I wrote:
Then I removed the 2 files that were corrupted (I don't care about them), just in case it was aborting the scrubbing because of them, as a kind reddit user told me it could be the case.
As you can see, I already did that, and it didn't work: when I repeated the scrubbing, it failed the same way…
@eduarcor do you have snapshots enabled on those two files? Might be that the snapshots are interfering?
I don't have any snapshots. I am not using snapshots.
Tried again:
> btrfs scrub status -d /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Tue Jun 10 23:36:50 2025 and was aborted after 03:57:18
total bytes scrubbed: 13.32TiB with 1 errors
error details: csum=1
corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
> btrfs scrub status -d -R /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Tue Jun 10 23:36:50 2025 and was aborted after 03:57:18
data_extents_scrubbed: 223444115
tree_extents_scrubbed: 3395356
data_bytes_scrubbed: 14590691647488
tree_bytes_scrubbed: 55629512704
read_errors: 0
csum_errors: 1
verify_errors: 0
no_csum: 2450
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 1
unverified_errors: 0
corrected_errors: 0
last_physical: 15662894481408
And again,
> btrfs scrub status -d /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Wed Jun 11 03:56:46 2025 and was aborted after 03:40:22
total bytes scrubbed: 13.32TiB with 1 errors
error details: csum=1
corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
> btrfs scrub status -d -R /volume1
scrub status for 98dcebd8-a24e-4d16-b7d1-90917471e437
scrub device /dev/mapper/cachedev_0 (id 1) history
scrub started at Wed Jun 11 03:56:46 2025 and was aborted after 03:40:22
data_extents_scrubbed: 223443880
tree_extents_scrubbed: 3395228
data_bytes_scrubbed: 14590688092160
tree_bytes_scrubbed: 55627415552
read_errors: 0
csum_errors: 1
verify_errors: 0
no_csum: 2450
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 1
unverified_errors: 0
corrected_errors: 0
last_physical: 15662894481408
Sadly, same thing. No logs in the log center. No info. No help. Nothing.