Hello from ZFS Town
Friday, 28 February 2025

Hello from ZFS Town! It’s Friday, 28 February 2025 and I’m your host, Rob Norris.

#17042 zpool: allow relative vdev paths

@tonyhutter 2025-02-10-2025-02-25 closed

Allow relative paths in zpool create.

#17054 suspend_resume_single: clear pool errors on fail

@robn 2025-02-14-2025-02-23 closed

While working on something I kept hitting this test failing. And yes, I had bugs, but it sucked that rather than failing the test the entire kernel would just hang up.

The upshot is that if the timing is unfortunate, the pool can suspend just as the test is failing because the pool hadn’t suspended yet. If we don’t then resume the pool, we hang trying to destroy it.

#17064 vdev_file: make FLUSH and TRIM asynchronous

@robn 2025-02-18-2025-02-22 closed

zfs_file_fsync() and zfs_file_deallocate() are both blocking ops, so the zio_taskq thread is active and blocked both while waiting for the IO call and then while calling zio_execute() for the next stage. This is a particular issue for FLUSH, as the z_flush_iss queue typically only has one thread; multiple flushes arriving at once can cause long delays if the underlying fsync() response is particularly slow.
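The head-of-line blocking described above can be shown with a small simulation. This is only a sketch under stated assumptions: `slow_fsync`, `run_flushes`, `issue_q` and `file_q` are hypothetical names, Python thread pools stand in for ZFS taskqs, and handing the blocking call to a second pool is one plausible way to make the operation asynchronous, not necessarily what the PR does.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fsync(delay=0.05):
    """Stand-in for a blocking fsync() on the backing file (hypothetical)."""
    time.sleep(delay)
    return 0  # success


def run_flushes(n, offload):
    """Issue n flushes through a one-thread issue queue; return elapsed seconds."""
    issue_q = ThreadPoolExecutor(max_workers=1)  # models a one-thread z_flush_iss
    file_q = ThreadPoolExecutor(max_workers=n)   # hypothetical pool for blocking file ops
    done = threading.Semaphore(0)

    def next_stage(_result):
        # Stand-in for continuing the zio pipeline.
        done.release()

    def issue():
        if offload:
            # Hand the blocking call elsewhere and return at once,
            # so the issue thread is free for the next flush.
            fut = file_q.submit(slow_fsync)
            fut.add_done_callback(lambda f: next_stage(f.result()))
        else:
            # The issue thread itself sits in fsync(); later flushes queue behind it.
            next_stage(slow_fsync())

    start = time.monotonic()
    for _ in range(n):
        issue_q.submit(issue)
    for _ in range(n):
        done.acquire()
    elapsed = time.monotonic() - start
    issue_q.shutdown()
    file_q.shutdown()
    return elapsed


if __name__ == "__main__":
    print("blocking :", round(run_flushes(4, offload=False), 3))
    print("offloaded:", round(run_flushes(4, offload=True), 3))
```

In the blocking case the four flushes are serialised behind the single issue thread; in the offloaded case they overlap, so the total time is close to that of one flush.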

#17066 include: move zio_priority_t into zfs.h

@robn 2025-02-18-2025-02-22 closed

I was asked about how to use libzfs_core.h directly in a new program. Turns out it’s a bit fiddly, which I’m trying to address. This is part of it.

#17071 arc: avoid possible deadlock in arc_read

@ixhamza 2025-02-19-2025-02-25 closed

In l2arc_evict(), the locks may be taken in the opposite order to arc_read() in scenarios like L2ARC device removal: first the config lock (as writer), then a hash lock. To avoid deadlocks, if the attempt to acquire the config lock (as reader) fails in arc_read(), release the hash lock, wait for the config lock, and retry from the beginning.
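The try-lock, back-off and retry pattern described here can be sketched in a few lines. This is an illustration only: `config_lock`, `hash_lock` and `read_with_retry` are hypothetical stand-ins, plain mutexes replace the reader/writer config lock, and the real arc_read() may hold or re-check things differently after waiting.

```python
import threading

config_lock = threading.Lock()  # stand-in for the pool config lock
hash_lock = threading.Lock()    # stand-in for an ARC hash lock


def read_with_retry(do_read, max_attempts=100):
    """Take hash lock, then try the config lock; on failure back out and retry.

    Returns (result, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        hash_lock.acquire()
        if config_lock.acquire(blocking=False):
            # Got both locks in hash -> config order without blocking.
            try:
                return do_read(), attempt
            finally:
                config_lock.release()
                hash_lock.release()
        # Someone holds the config lock and may want the hash lock next.
        # Drop the hash lock first so they can make progress.
        hash_lock.release()
        # Wait until the config lock holder is done, then start over.
        with config_lock:
            pass
    raise RuntimeError("could not take both locks")


if __name__ == "__main__":
    print(read_with_retry(lambda: "data"))
```

The point of the design is that the reader never blocks on the config lock while still holding the hash lock, so the two paths cannot wait on each other in a cycle.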

#17078 Fix a crash when attempting to read a block pointer with no valid DVAs

@asomers 2025-02-20- open

Somehow ZFS wrote to my pool a block pointer that is neither embedded nor a hole, yet contains no DVAs with a non-zero asize. That should be impossible, but it happened. This PR will cause ZFS to return ECKSUM when attempting to read from that block pointer, rather than crashing. This PR will also fix zdb so it can display such block pointers. Finally, I believe that this PR will prevent such block pointers from being written to disk in the first place, by triggering a panic in zfs_blkptr_verify in the write path.
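As a rough illustration of the check being described, here is a toy model of a block pointer and a read path that refuses one with no usable DVAs. Everything in it is an assumption made for the sketch: `BlockPointer`, `has_valid_dva`, `read_block` and `ChecksumError` are hypothetical names, and the real blkptr_t and zfs_blkptr_verify are considerably more involved.

```python
from dataclasses import dataclass


class ChecksumError(Exception):
    """Stand-in for ZFS returning ECKSUM to the caller."""


@dataclass
class BlockPointer:
    embedded: bool = False
    hole: bool = False
    dva_asizes: tuple = (0, 0, 0)  # allocated size of each DVA; 0 means unused


def has_valid_dva(bp):
    """True if at least one DVA has a non-zero allocated size."""
    return any(asize > 0 for asize in bp.dva_asizes)


def read_block(bp):
    """Return placeholder data, or raise ChecksumError for an impossible bp."""
    if bp.hole:
        return b""                   # simplified: holes carry no data
    if bp.embedded:
        return b"<embedded payload>"  # data lives inside the pointer itself
    if not has_valid_dva(bp):
        # Neither embedded nor a hole, yet nothing to read from:
        # fail the read instead of crashing.
        raise ChecksumError("block pointer has no valid DVAs")
    return b"<data read from disk>"


if __name__ == "__main__":
    try:
        read_block(BlockPointer())
    except ChecksumError as err:
        print("read failed:", err)
```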

#17079 Fix wrong free function in arc_hdr_decrypt

@tuxoko 2025-02-21-2025-02-22 closed

Found this when making some other changes. Not sure if it’s possible to hit or not.

#17080 Don’t try to get mg of hole vdev in removal

@pcd1193182 2025-02-21-2025-02-25 closed

When doing device removal, we verify that the vdev being removed isn’t the last vdev in the normal class. To do that, we iterate over all the disks, checking their metaslab class until we find one in the normal group. Right now, if you have a hole vdev before the first allocatable normal class vdev, we kernel panic.
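The shape of the fix can be sketched as a loop that skips hole vdevs before looking at their metaslab class. This is a simplified model, not the actual code: `Vdev` and `other_normal_vdev_exists` are hypothetical, and a missing metaslab class (`None`) stands in for a hole vdev having no metaslab group to dereference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vdev:
    name: str
    is_hole: bool = False
    metaslab_class: Optional[str] = None  # None: no metaslab group (e.g. a hole)


def other_normal_vdev_exists(vdevs, removing):
    """True if some vdev other than `removing` is in the normal class."""
    for vd in vdevs:
        if vd is removing:
            continue
        if vd.is_hole or vd.metaslab_class is None:
            # Nothing to inspect here; touching the metaslab group of a
            # hole is the kind of access that would blow up.
            continue
        if vd.metaslab_class == "normal":
            return True
    return False


if __name__ == "__main__":
    hole = Vdev("hole-0", is_hole=True)
    a = Vdev("disk-a", metaslab_class="normal")
    b = Vdev("disk-b", metaslab_class="normal")
    print(other_normal_vdev_exists([hole, a, b], removing=a))
```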

#17081 Better fill empty metaslabs

@amotin 2025-02-21-2025-02-25 closed

Before this change, the zfs_metaslab_switch_threshold tunable switched metaslabs each time one’s index dropped by two (which means its biggest contiguous chunk shrank to 1/4). That is a good idea for balancing metaslab fragmentation. But for empty metaslabs (which have power-of-2 sizes) this means switching when they get just below half of their capacity. Inspection with zdb after filling a new pool to half capacity showed most of its metaslabs filled to half capacity. I consider this sub-optimal for pool fragmentation in the long run.
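The arithmetic behind the "just below half" observation can be worked through directly. This sketch assumes the index is the power-of-two bucket of the largest free segment, floor(log2(size)), and that a switch happens once the index has dropped by the threshold since activation. `segment_index` and `should_switch` are hypothetical helpers, not functions from the ZFS source.

```python
def segment_index(size):
    """Power-of-two bucket of a free segment: floor(log2(size))."""
    return size.bit_length() - 1


def should_switch(size_at_activation, largest_free_now, threshold=2):
    """True once the index has dropped by at least `threshold`."""
    drop = segment_index(size_at_activation) - segment_index(largest_free_now)
    return drop >= threshold


if __name__ == "__main__":
    empty = 1 << 34  # an empty 16 GiB metaslab: one free segment, index 34

    # Exactly half free is still index 33, a drop of only one.
    print(should_switch(empty, 1 << 33))

    # One byte less than half falls to index 32, a drop of two.
    print(should_switch(empty, (1 << 33) - 1))

    # A segment just under a power of two starts one bucket lower,
    # so it has to shrink to about a quarter before the switch.
    almost = (1 << 34) - 1  # index 33
    print(should_switch(almost, 1 << 32), should_switch(almost, (1 << 32) - 1))
```

Because an empty metaslab sits exactly on a bucket boundary, the first allocation already costs it one index step, and the second step arrives at the halfway mark rather than at a quarter.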

#17088 spa: fix signature mismatch for spa_boot_init as eventhandler required

@aokblast 2025-02-23-2025-02-25 closed

I am working on KCFI for FreeBSD. KCFI requires all function signatures to match, so we need this patch to change the signature.

#17089 ICP encryption tests

@robn 2025-02-24-2025-02-26 closed

I wanted to do a good review of #17058, but really had no idea how to without some way to compare the output with what we already have.

And then I went mad and ended up writing a whole correctness and performance test driver for the ICP. And then it found a real (though inconsequential) bug. Good times all round!

Thanks @lowjoel for motivation and rubber-ducking 👍

#17094 Add zfs_recover_ms parameter

@ihoro 2025-02-25- open

There are production cases where loading a metaslab leads to a ZFS panic, presumably due to unexpected entries in its spacemap. The assertions in zfs_range_tree_add_impl() and zfs_range_tree_remove_impl() fail due to overlapping or missing segments, etc. A business would like to go ahead with such pools while the root cause is being investigated.
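To make the failure mode concrete, here is a toy range tree whose add and remove either assert on bad input or, with a recovery flag set, record the problem and carry on. It is a sketch only: `RangeTree` and its `recover` flag are hypothetical, the exact semantics of the proposed zfs_recover_ms parameter are not described here, and unlike the real range tree this one does not merge adjacent segments.

```python
class RangeTree:
    """Toy set of non-overlapping half-open segments [start, end)."""

    def __init__(self, recover=False):
        self.segs = []          # kept sorted by start
        self.recover = recover  # hypothetical analogue of a recovery tunable
        self.errors = []        # problems skipped while in recovery mode

    def _problem(self, msg):
        if not self.recover:
            raise AssertionError(msg)  # models the panic
        self.errors.append(msg)        # models "log it and keep going"

    def add(self, start, end):
        for s, e in self.segs:
            if start < e and s < end:
                self._problem(f"add [{start},{end}) overlaps [{s},{e})")
                return
        self.segs.append((start, end))
        self.segs.sort()

    def remove(self, start, end):
        for i, (s, e) in enumerate(self.segs):
            if s <= start and end <= e:
                del self.segs[i]
                # Keep whatever is left on either side of the removed span.
                if s < start:
                    self.segs.append((s, start))
                if end < e:
                    self.segs.append((end, e))
                self.segs.sort()
                return
        self._problem(f"remove [{start},{end}) is not fully present")


if __name__ == "__main__":
    rt = RangeTree(recover=True)
    rt.add(0, 10)
    rt.add(5, 15)      # overlapping entry: skipped and recorded
    rt.remove(20, 30)  # missing entry: skipped and recorded
    print(rt.segs, rt.errors)
```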

#17097 zfs-2.3.1 patchset

@ixhamza 2025-02-26- open

Initial proposed patchset for zfs-2.3.1.

#17098 Linux 6.13 compat: META

@tonyhutter 2025-02-27-2025-02-27 closed

Update the META file to reflect compatibility with the 6.13 kernel.

#17099 convert_wycheproof: two small fixes

@robn 2025-02-27- open

Fixing two bugs in convert_wycheproof.pl that snuck through in #17089.