https://www.reddit.com/r/zfs/comments/1nkty6f/vestigial_pool_with_real_pools_device_as_a_member/
I've solved this and thought I'd leave my working here in case it's of use to anyone. ChatGPT helped with a crucial realisation along the way. (That in itself was a learning experience - it appeared to show real understanding of zfs.)
As it turned out, both disks had stale zfs labels on them from previous experiments I'd done using the whole disk as the device. The reason they were left around is that the active pool is on the first (and only) partition, which is aligned differently:
- The partition starts at sector 2048 (1MB into the device) - as is standard.
- It ends at the 12TB boundary, ~130MB before the end of these particular devices (WD Red). I sized it that way so that other 12TB disks would be attachable to the vdev in the future even if their exact size varied slightly.
Here's the partition layout:
root@nas # sgdisk -p /dev/sdc
Disk /dev/sdc: 23437770752 sectors, 10.9 TiB
Model: WDC WD120EFAX-68
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      23437498368   10.9 TiB   8300
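For the record, an equivalent layout could be created with something like this (a sketch, with /dev/sdX standing in for the disk; the end sector and the 8300 type code are the ones shown above):
# hypothetical: one partition from sector 2048 to a fixed end sector just
# short of the disk's end, so a slightly smaller "12TB" disk could still
# be attached to the mirror later
sgdisk -n 1:2048:23437498368 -t 1:8300 /dev/sdX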
Since zfs stores its labels in the first and last 512kB of the device (two 256kB labels at each end), the ones from the old whole-disk pools sit outside the partition, and 3 of the 4 were still intact (the first was overwritten by the partition table). Here's truncated zdb output showing labels from three separate old pools, from experiments across Linux and FreeBSD TrueNAS:
root@nas # zdb -l /dev/sdc
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
    name: 'sas'
    pool_guid: 11932599429703228684
    vdev_tree:
        type: 'disk'
        path: '/dev/da1'
    labels = 1 3
------------------------------------
LABEL 2 (Bad label cksum)
------------------------------------
    name: 'sas'
    pool_guid: 3011242590297065095
    vdev_tree:
        type: 'mirror'
        children[0]:
            type: 'disk'
            path: '/dev/sdc1'
        children[1]:
            type: 'disk'
            path: '/dev/sdd1'
    labels = 2
root@nas # zdb -l /dev/sdd
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
    name: 'sas'
    pool_guid: 2926227388212695137
    vdev_tree:
        type: 'mirror'
        children[0]:
            type: 'disk'
            path: '/dev/da0'
        children[1]:
            type: 'disk'
            path: '/dev/da1'
    labels = 1 3
------------------------------------
LABEL 2 (Bad label cksum)
------------------------------------
    name: 'sas'
    pool_guid: 3011242590297065095
    vdev_tree:
        type: 'mirror'
        children[0]:
            type: 'disk'
            path: '/dev/sdc1'
        children[1]:
            type: 'disk'
            path: '/dev/sdd1'
    labels = 2
And for comparison, the labels for the active pool:
root@nas # zdb -l /dev/sdc1
------------------------------------
LABEL 0
------------------------------------
    name: 'sas'
    pool_guid: 10286991352931977429
    vdev_tree:
        type: 'mirror'
        children[0]:
            type: 'disk'
            path: '/dev/sdc1'
        children[1]:
            type: 'disk'
            path: '/dev/sdd1'
    labels = 0 1 2 3
root@nas # zdb -l /dev/sdd1
------------------------------------
LABEL 0
------------------------------------
    name: 'sas'
    pool_guid: 10286991352931977429
    vdev_tree:
        type: 'mirror'
        children[0]:
            type: 'disk'
            path: '/dev/sdc1'
        children[1]:
            type: 'disk'
            path: '/dev/sdd1'
    labels = 0 1 2 3
Here are the stale labels as detected by wipefs (along with the partition table and boot record). This was a good cross-check since the UUIDs match the zdb output above.
root@nas # wipefs -n /dev/sdc
DEVICE OFFSET TYPE UUID LABEL
sdc 0x44000 zfs_member 11932599429703228684 sas
sdc 0xae9fff84000 zfs_member 3011242590297065095 sas
sdc 0xae9fffc4000 zfs_member 11932599429703228684 sas
sdc 0x200 gpt
sdc 0xae9fffffe00 gpt
sdc 0x1fe PMBR
root@nas # wipefs -n /dev/sdd
DEVICE OFFSET TYPE UUID LABEL
sdd 0x44000 zfs_member 2926227388212695137 sas
sdd 0xae9fff84000 zfs_member 3011242590297065095 sas
sdd 0xae9fffc4000 zfs_member 2926227388212695137 sas
sdd 0x200 gpt
sdd 0xae9fffffe00 gpt
sdd 0x1fe PMBR
And the active pool's labels, which are completely separate:
root@nas # wipefs -n /dev/sdc1
DEVICE OFFSET TYPE UUID LABEL
sdc1 0x4000 zfs_member 10286991352931977429 sas
sdc1 0x44000 zfs_member 10286991352931977429 sas
sdc1 0xae9f7984000 zfs_member 10286991352931977429 sas
sdc1 0xae9f79c4000 zfs_member 10286991352931977429 sas
root@nas # wipefs -n /dev/sdd1
DEVICE OFFSET TYPE UUID LABEL
sdd1 0x4000 zfs_member 10286991352931977429 sas
sdd1 0x44000 zfs_member 10286991352931977429 sas
sdd1 0xae9f7984000 zfs_member 10286991352931977429 sas
sdd1 0xae9f79c4000 zfs_member 10286991352931977429 sas
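The offsets are a good cross-check too: zfs writes four 256kB labels per device, two at the very start and two in the last 512kB, and the zfs_member signature that wipefs keys on sits 16kB into each label (hence the 0x4000 and 0x44000 entries). A quick sketch of the arithmetic for the bare disk - zfs rounds the size down to a 256kB multiple before placing the end labels, which is a no-op for these disks:
# expected signature offsets for the four whole-disk labels (L0/L1 at the
# start, L2/L3 in the last 512kB, signature 16kB into each label)
size=$(blockdev --getsize64 /dev/sdc)
asize=$((size / 262144 * 262144))   # round down to a 256kB multiple
printf 'L0 0x%x\nL1 0x%x\nL2 0x%x\nL3 0x%x\n' \
    $((0x4000)) $((0x40000 + 0x4000)) \
    $((asize - 0x80000 + 0x4000)) $((asize - 0x40000 + 0x4000))
L1, L2 and L3 come out at exactly the three offsets wipefs reports on each bare disk; L0's slot is where the GPT now lives, which is why that label failed to unpack.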
At that point I had enough data to wipe the old labels safely. First I backed up the first and last 1MB of the disks just in case:
root@nas # dd if=/dev/sdc bs=1M count=1 of=./sdc-start
root@nas # dd if=/dev/sdd bs=1M count=1 of=./sdd-start
root@nas # dd if=/dev/sdc bs=1M count=1 of=./sdc-end skip=$(($(blockdev --getsize64 /dev/sdc) / 1048576 - 1))
root@nas # dd if=/dev/sdd bs=1M count=1 of=./sdd-end skip=$(($(blockdev --getsize64 /dev/sdd) / 1048576 - 1))
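Had anything gone wrong, those images could have been written back the same way, e.g. for sdc (the sdd commands are analogous):
# hypothetical restore of the saved regions
dd if=./sdc-start of=/dev/sdc bs=1M conv=notrunc
dd if=./sdc-end of=/dev/sdc bs=1M conv=notrunc seek=$(($(blockdev --getsize64 /dev/sdc) / 1048576 - 1))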
Then I dry-ran a removal:
root@nas # wipefs -o 0xae9fff84000 -n /dev/sdc
/dev/sdc: 16 bytes were erased at offset 0xae9fff84000 (zfs_member): 01 01 00 00 00 00 00 00 00 00 00 01 00 00 00 24
And then wiped each label for real, by offset (the same commands without -n):
root@nas # wipefs -o 0x44000 /dev/sdc
root@nas # wipefs -o 0xae9fff84000 /dev/sdc
root@nas # wipefs -o 0xae9fffc4000 /dev/sdc
After wiping one disk I confirmed the pool was still importable and healthy before continuing:
root@nas # zpool import sas
root@nas # zpool status
root@nas # zpool export sas
And then repeated for the other disk:
root@nas # wipefs -o 0x44000 /dev/sdd
root@nas # wipefs -o 0xae9fff84000 /dev/sdd
root@nas # wipefs -o 0xae9fffc4000 /dev/sdd
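At this point the earlier read-only checks can be re-run to confirm nothing is left - zdb should now fail to unpack every label on the bare devices, and wipefs should only report the GPT and PMBR entries:
wipefs -n /dev/sdc
wipefs -n /dev/sdd
zdb -l /dev/sdc
zdb -l /dev/sdd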
Once that was all done I re-ran the original zpool import -d scan that had led me to discover one of the stale labels, to confirm it only saw the expected pools. It also runs quickly now, without the 20 second wait I was seeing earlier.
root@nas # zpool import -d /dev
Finally I re-imported the active pool and scrubbed it for good measure, which succeeded. I figure that if the pool scrubs without any data errors, my fix hasn't interfered with it.
root@nas # zpool import sas
root@nas # zpool scrub sas
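The scrub runs in the background; its progress and the final error counts show up in zpool status:
zpool status sas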