| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
net: ibm: emac: Fix use-after-free during device removal
The driver was using devm_register_netdev() which causes unregister_netdev()
to be deferred until the devres cleanup phase, which runs after emac_remove()
returns. This creates a use-after-free window where:
1. emac_remove() is called, which tears down hardware (cancels work, detaches
modules, unregisters from MAL)
2. emac_remove() returns
3. devres cleanup runs and finally calls unregister_netdev()
During step 3, the network stack might still process packets, triggering
emac_irq(), emac_poll(), or other handlers that access now-freed hardware
resources (dev->emacp, dev->mal, etc.).
Fix this by replacing devm_register_netdev() with manual register_netdev()
and calling unregister_netdev() at the beginning of emac_remove(), before
any hardware teardown. This ensures the network device is fully stopped and
unregistered before hardware resources are released.
The change is safe because:
- dev->ndev is assigned very early in probe (before any error paths that
could bypass emac_remove)
- platform_set_drvdata() is only called after successful registration, so
emac_remove() only runs for fully registered devices
- unregister_netdev() is idempotent and safe to call on any registered device |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: avoid reading already updated pages during GC
We found the following issue during fuzz testing:
page: refcount:3 mapcount:0 mapping:00000000b6e89c65 index:0x18b2dc pfn:0x161ba9
memcg:f8ffff800e269c00
aops:f2fs_meta_aops ino:2
flags: 0x52880000000080a9(locked|waiters|uptodate|lru|private|zone=1|kasantag=0x4a)
raw: 52880000000080a9 fffffffec6e17588 fffffffec0ccc088 a7ffff8067063618
raw: 000000000018b2dc 0000000000000009 00000003ffffffff f8ffff800e269c00
page dumped because: VM_BUG_ON_FOLIO(folio_test_uptodate(folio))
page_owner tracks the page as allocated
post_alloc_hook+0x58c/0x5ec
prep_new_page+0x34/0x284
get_page_from_freelist+0x2dcc/0x2e8c
__alloc_pages_noprof+0x280/0x76c
__folio_alloc_noprof+0x18/0xac
__filemap_get_folio+0x6bc/0xdc4
pagecache_get_page+0x3c/0x104
do_garbage_collect+0x5c78/0x77a4
f2fs_gc+0xd74/0x25f0
gc_thread_func+0xb28/0x2930
kthread+0x464/0x5d8
ret_from_fork+0x10/0x20
------------[ cut here ]------------
kernel BUG at mm/filemap.c:1563!
folio_end_read+0x140/0x168
f2fs_finish_read_bio+0x5c4/0xb80
f2fs_read_end_io+0x64c/0x708
bio_endio+0x85c/0x8c0
blk_update_request+0x690/0x127c
scsi_end_request+0x9c/0xb8c
scsi_io_completion+0xf0/0x250
scsi_finish_command+0x430/0x45c
scsi_complete+0x178/0x6d4
blk_mq_complete_request+0xcc/0x104
scsi_done_internal+0x214/0x454
scsi_done+0x24/0x34
which is similar to the problem reported by syzbot:
https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
This case is consistent with the description in commit 9bf1a3f
("f2fs: avoid GC causing encrypted file corrupted"):
Page 1 is moved from blkaddr A to blkaddr B by move_data_block, and after
being written it is marked as uptodate. Then, Page 1 is moved from blkaddr
B to blkaddr C, VM_BUG_ON_FOLIO was triggered in the endio initiated by
ra_data_block.
There is no need to read Page 1 again from blkaddr B, since it has already
been updated. Therefore, avoid initiating I/O in this case. |
| In the Linux kernel, the following vulnerability has been resolved:
dm cache: fix dirty mapping checking in passthrough mode switching
As mentioned in commit 9b1cc9f251af ("dm cache: share cache-metadata
object across inactive and active DM tables"), dm-cache assumed table
reload occurs after suspension, while LVM's table preload breaks this
assumption. The dirty mapping check for passthrough mode was designed
around this assumption and is performed during table creation, causing
the check to fail with preload while metadata updates are ongoing. This
risks loading dirty mappings into passthrough mode, resulting in data
loss.
Reproduce steps:
1. Create a writeback cache with zero migration_threshold to produce
dirty mappings
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writeback smq \
2 migration_threshold 0"
2. Preload a table in passthrough mode
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
3. Write to the first cache block to make it dirty
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
4. Resume the inactive table. Now it's possible to load the dirty block
into passthrough mode.
dmsetup resume cache
Fix by moving the checks to the preresume phase to support table
preloading. Also remove the unused function dm_cache_metadata_all_clean. |
| Dell Display and Peripheral Manager (DDPM Mac), versions prior to 2.3, contain a Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition') vulnerability. A low privileged attacker with local access could potentially exploit this vulnerability, leading to Elevation of Privileges. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: synproxy: add mutex to guard hook reference counting
As the synproxy infrastructure register netfilter hooks on-demand when a
user adds the first iptables target or nftables expression, if done
concurrently they can race each other.
Introduce a mutex to serialize the refcount control blocks access from
both frontends. While a per namespace mutex might be more efficient, it
is not needed for target/expression like SYNPROXY. |
| Ghost is a Node.js content management system. From 6.0.9 until 6.21.1, Ghost’s private-IP check for outbound HTTP requests could be bypassed via DNS rebinding, allowing an attacker to coerce the Ghost server into reaching hosts on internal networks through features that issue external fetches. This vulnerability is fixed in 6.21.1. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7996: fix use-after-free bugs in mt7996_mac_dump_work()
When the mt7996 pci chip is detaching, the mt7996_crash_data is
released in mt7996_coredump_unregister(). However, the work item
dump_work may still be running or pending, leading to UAF bugs
when the already freed crash_data is dereferenced again in
mt7996_mac_dump_work().
The race condition can occur as follows:
CPU 0 (removal path) | CPU 1 (workqueue)
mt7996_pci_remove() | mt7996_sys_recovery_set()
mt7996_unregister_device() | mt7996_reset()
mt7996_coredump_unregister() | queue_work()
vfree(dev->coredump.crash_data) | mt7996_mac_dump_work()
| crash_data-> // UAF
Fix this by ensuring dump_work is properly canceled before
the crash_data is deallocated. Add cancel_work_sync() in
mt7996_unregister_device() to synchronize with any pending
or executing dump work. |
| A vulnerability was found in systemd-coredump. This flaw allows an attacker to force a SUID process to crash and replace it with a non-SUID binary to access the original's privileged process coredump, allowing the attacker to read sensitive data, such as /etc/shadow content, loaded by the original process.
A SUID binary or process has a special type of permission, which allows the process to run with the file owner's permissions, regardless of the user executing the binary. This allows the process to access more restricted data than unprivileged users or processes would be able to. An attacker can leverage this flaw by forcing a SUID process to crash and force the Linux kernel to recycle the process PID before systemd-coredump can analyze the /proc/pid/auxv file. If the attacker wins the race condition, they gain access to the original's SUID process coredump file. They can read sensitive content loaded into memory by the original binary, affecting data confidentiality. |
| In the Linux kernel, the following vulnerability has been resolved:
s390/ap: use generic driver_override infrastructure
When the AP masks are updated via apmask_store() or aqmask_store(),
ap_bus_revise_bindings() is called after ap_attr_mutex has been
released.
This calls __ap_revise_reserved(), which accesses the driver_override
field without holding any lock, racing against a concurrent
driver_override_store() that may free the old string, resulting in a
potential UAF.
Fix this by using the driver-core driver_override infrastructure, which
protects all accesses with an internal spinlock.
Note that unlike most other buses, the AP bus does not check
driver_override in its match() callback; the override is checked in
ap_device_probe() and __ap_revise_reserved() instead.
Also note that we do not enable the driver_override feature of struct
bus_type, as AP - in contrast to most other buses - passes "" to
sysfs_emit() when the driver_override pointer is NULL. Thus, printing
"\n" instead of "(null)\n".
Additionally, AP has a custom counter that is modified in the
corresponding custom driver_override_store(). |
| A flaw was found in rsync. This vulnerability arises from a race condition during rsync's handling of symbolic links. Rsync's default behavior when encountering symbolic links is to skip them. If an attacker replaced a regular file with a symbolic link at the right time, it was possible to bypass the default behavior and traverse symbolic links. Depending on the privileges of the rsync process, an attacker could leak sensitive information, potentially leading to privilege escalation. |
| In the Linux kernel, the following vulnerability has been resolved:
s390/cio: use generic driver_override infrastructure
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1] |
| In the Linux kernel, the following vulnerability has been resolved:
PCI: use generic driver_override infrastructure
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1] |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: return VMA snapshot from task_vma iterator
Holding the per-VMA lock across the BPF program body creates a lock
ordering problem when helpers acquire locks that depend on mmap_lock:
vm_lock -> i_rwsem -> mmap_lock -> vm_lock
Snapshot the VMA under the per-VMA lock in _next() via memcpy(), then
drop the lock before returning. The BPF program accesses only the
snapshot.
The verifier only trusts vm_mm and vm_file pointers (see
BTF_TYPE_SAFE_TRUSTED_OR_NULL in verifier.c). vm_file is reference-
counted with get_file() under the lock and released via fput() on the
next iteration or in _destroy(). vm_mm is already correct because
lock_vma_under_rcu() verifies vma->vm_mm == mm. All other pointers
are left as-is by memcpy() since the verifier treats them as untrusted. |
| In the Linux kernel, the following vulnerability has been resolved:
ipc/shm: serialize orphan cleanup with shm_nattch updates
shm_destroy_orphaned() walks the shm idr under shm_ids(ns).rwsem, but that
does not serialize all fields tested by shm_may_destroy(). In particular,
shm_nattch is updated while holding shm_perm.lock, and attach paths can do
that without holding the rwsem.
Do not decide that an orphaned segment is unused before taking the object
lock. Move the shm_may_destroy() check under shm_perm.lock, matching the
other destroy paths, and unlock the segment when it no longer qualifies
for removal. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf, sockmap: Fix af_unix iter deadlock
bpf_iter_unix_seq_show() may deadlock when lock_sock_fast() takes the fast
path and the iter prog attempts to update a sockmap. Which ends up spinning
at sock_map_update_elem()'s bh_lock_sock():
WARNING: possible recursive locking detected
test_progs/1393 is trying to acquire lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: sock_map_update_elem+0xdb/0x1f0
but task is already holding lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(slock-AF_UNIX);
lock(slock-AF_UNIX);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by test_progs/1393:
#0: ffff88814b59c790 (&p->lock){+.+.}-{4:4}, at: bpf_seq_read+0x59/0x10d0
#1: ffff88811ec25fd8 (sk_lock-AF_UNIX){+.+.}-{0:0}, at: bpf_seq_read+0x42c/0x10d0
#2: ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
#3: ffffffff85a6a7c0 (rcu_read_lock){....}-{1:3}, at: bpf_iter_run_prog+0x51d/0xb00
Call Trace:
dump_stack_lvl+0x5d/0x80
print_deadlock_bug.cold+0xc0/0xce
__lock_acquire+0x130f/0x2590
lock_acquire+0x14e/0x2b0
_raw_spin_lock+0x30/0x40
sock_map_update_elem+0xdb/0x1f0
bpf_prog_2d0075e5d9b721cd_dump_unix+0x55/0x4f4
bpf_iter_run_prog+0x5b9/0xb00
bpf_iter_unix_seq_show+0x1f7/0x2e0
bpf_seq_read+0x42c/0x10d0
vfs_read+0x171/0xb20
ksys_read+0xff/0x200
do_syscall_64+0x6b/0x3a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e |
| In the Linux kernel, the following vulnerability has been resolved:
dm cache: fix null-deref with concurrent writes in passthrough mode
In passthrough mode, when dm-cache starts to invalidate a cache
entry and bio prison cell lock fails due to concurrent write to
the same cached block, mg->cell remains NULL. The error path in
invalidate_complete() attempts to unlock and free the cell
unconditionally, causing a NULL pointer dereference:
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 134 Comm: fio Not tainted 6.19.0-rc7 #3 PREEMPT
RIP: 0010:dm_cell_unlock_v2+0x3f/0x210
<snip>
Call Trace:
invalidate_complete+0xef/0x430
map_bio+0x130f/0x1a10
cache_map+0x320/0x6b0
__map_bio+0x458/0x510
dm_submit_bio+0x40e/0x16d0
__submit_bio+0x419/0x870
<snip>
Reproduce steps:
1. Create a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Promote the first data block into cache
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the first cached block concurrently
fio --filename=/dev/mapper/cache --name test --rw=randwrite --bs=4k \
--randrepeat=0 --direct=1 --numjobs=2 --size 64k
Fix by checking if mg->cell is valid before attempting to unlock it. |
| In the Linux kernel, the following vulnerability has been resolved:
futex: Prevent lockup in requeue-PI during signal/ timeout wakeup
During wait-requeue-pi (task A) and requeue-PI (task B) the following
race can happen:
Task A Task B
futex_wait_requeue_pi()
futex_setup_timer()
futex_do_wait()
futex_requeue()
CLASS(hb, hb1)(&key1);
CLASS(hb, hb2)(&key2);
*timeout*
futex_requeue_pi_wakeup_sync()
requeue_state = Q_REQUEUE_PI_IGNORE
*blocks on hb->lock*
futex_proxy_trylock_atomic()
futex_requeue_pi_prepare()
Q_REQUEUE_PI_IGNORE => -EAGAIN
double_unlock_hb(hb1, hb2)
*retry*
Task B acquires both hb locks and attempts to acquire the PI-lock of the
top most waiter (task B). Task A is leaving early due to a signal/
timeout and started removing itself from the queue. It updates its
requeue_state but can not remove it from the list because this requires
the hb lock which is owned by task B.
Usually task A is able to swoop the lock after task B unlocked it.
However if task B is of higher priority then task A may not be able to
wake up in time and acquire the lock before task B gets it again.
Especially on a UP system where A is never scheduled.
As a result task A blocks on the lock and task B busy loops, trying to
make progress but live locks the system instead. Tragic.
This can be fixed by removing the top most waiter from the list in this
case. This allows task B to grab the next top waiter (if any) in the
next iteration and make progress.
Remove the top most waiter if futex_requeue_pi_prepare() fails.
Let the waiter conditionally remove itself from the list in
handle_early_requeue_pi_wakeup(). |
| In the Linux kernel, the following vulnerability has been resolved:
drbd: Balance RCU calls in drbd_adm_dump_devices()
Make drbd_adm_dump_devices() call rcu_read_lock() before
rcu_read_unlock() is called. This has been detected by the Clang
thread-safety analyzer. |
| In the Linux kernel, the following vulnerability has been resolved:
ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size
The generic/642 test-case can reproduce the kernel crash:
[40243.605254] ------------[ cut here ]------------
[40243.605956] kernel BUG at fs/ceph/xattr.c:918!
[40243.607142] Oops: invalid opcode: 0000 [#1] SMP PTI
[40243.608067] CPU: 7 UID: 0 PID: 498762 Comm: kworker/7:1 Not tainted 7.0.0-rc7+ #3 PREEMPT(full)
[40243.609700] Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[40243.611820] Workqueue: ceph-msgr ceph_con_workfn
[40243.612715] RIP: 0010:__ceph_build_xattrs_blob+0x1b8/0x1e0
[40243.613731] Code: 0f 84 82 fe ff ff e9 cf 8e 56 ff 48 8d 65 e8 31 c0 5b 41 5c 41 5d 5d 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 c3 cc cc cc cc <0f> 0b 4c 8b 62 08 41 8b 85 24 07 00 00 49 83 c4 04 41 89 44 24 fc
[40243.616888] RSP: 0018:ffffcc80c4d4b688 EFLAGS: 00010287
[40243.617773] RAX: 0000000000010026 RBX: 0000000000000001 RCX: 0000000000000000
[40243.618928] RDX: ffff8a773798dee0 RSI: 0000000000000000 RDI: 0000000000000000
[40243.620158] RBP: ffffcc80c4d4b6a0 R08: 0000000000000000 R09: 0000000000000000
[40243.621573] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a75f3b58000
[40243.622907] R13: ffff8a75f3b58000 R14: 0000000000000080 R15: 000000000000bffd
[40243.624054] FS: 0000000000000000(0000) GS:ffff8a787d1b4000(0000) knlGS:0000000000000000
[40243.625331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40243.626269] CR2: 000072f390b623c0 CR3: 000000011c02a003 CR4: 0000000000372ef0
[40243.627408] Call Trace:
[40243.627839] <TASK>
[40243.628188] __prep_cap+0x3fd/0x4a0
[40243.628789] ? do_raw_spin_unlock+0x4e/0xe0
[40243.629474] ceph_check_caps+0x46a/0xc80
[40243.630094] ? __lock_acquire+0x4a2/0x2650
[40243.630773] ? find_held_lock+0x31/0x90
[40243.631347] ? handle_cap_grant+0x79f/0x1060
[40243.632068] ? lock_release+0xd9/0x300
[40243.632696] ? __mutex_unlock_slowpath+0x3e/0x340
[40243.633429] ? lock_release+0xd9/0x300
[40243.634052] handle_cap_grant+0xcf6/0x1060
[40243.634745] ceph_handle_caps+0x122b/0x2110
[40243.635415] mds_dispatch+0x5bd/0x2160
[40243.636034] ? ceph_con_process_message+0x65/0x190
[40243.636828] ? lock_release+0xd9/0x300
[40243.637431] ceph_con_process_message+0x7a/0x190
[40243.638184] ? kfree+0x311/0x4f0
[40243.638749] ? kfree+0x311/0x4f0
[40243.639268] process_message+0x16/0x1a0
[40243.639915] ? sg_free_table+0x39/0x90
[40243.640572] ceph_con_v2_try_read+0xf58/0x2120
[40243.641255] ? lock_acquire+0xc8/0x300
[40243.641863] ceph_con_workfn+0x151/0x820
[40243.642493] process_one_work+0x22f/0x630
[40243.643093] ? process_one_work+0x254/0x630
[40243.643770] worker_thread+0x1e2/0x400
[40243.644332] ? __pfx_worker_thread+0x10/0x10
[40243.645020] kthread+0x109/0x140
[40243.645560] ? __pfx_kthread+0x10/0x10
[40243.646125] ret_from_fork+0x3f8/0x480
[40243.646752] ? __pfx_kthread+0x10/0x10
[40243.647316] ? __pfx_kthread+0x10/0x10
[40243.647919] ret_from_fork_asm+0x1a/0x30
[40243.648556] </TASK>
[40243.648902] Modules linked in: overlay hctr2 libpolyval chacha libchacha adiantum libnh libpoly1305 essiv intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit kvm_intel kvm irqbypass joydev ghash_clmulni_intel aesni_intel rapl input_leds mac_hid psmouse vga16fb serio_raw vgastate floppy i2c_piix4 pata_acpi bochs qemu_fw_cfg i2c_smbus sch_fq_codel rbd dm_crypt msr parport_pc ppdev lp parport efi_pstore
[40243.654766] ---[ end trace 0000000000000000 ]---
Commit d93231a6bc8a ("ceph: prevent a client from exceeding the MDS
maximum xattr size") moved the required_blob_size computation to before
the __build_xattrs() call, introducing a race.
__build_xattrs() releases and reacquires i_ceph_lock during execution.
In that window, handle_cap_grant() may update i_xattrs.blob with a
newer MDS-provided blob and bump i_xattrs.version. When
__bui
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock between reflink and transaction commit when using flushoncommit
When using the flushoncommit mount option, we can have a deadlock between
a transaction commit and a reflink operation that copied an inline extent
to an offset beyond the current i_size of the destination node.
The deadlock happens like this:
1) Task A clones an inline extent from inode X to an offset of inode Y
that is beyond Y's current i_size. This means we copied the inline
extent's data to a folio of inode Y that is beyond its EOF, using a
call to copy_inline_to_page();
2) Task B starts a transaction commit and calls
btrfs_start_delalloc_flush() to flush delalloc;
3) The delalloc flushing sees the new dirty folio of inode Y and when it
attempts to flush it, it ends up at extent_writepage() and sees that
the offset of the folio is beyond the i_size of inode Y, so it attempts
to invalidate the folio by calling folio_invalidate(), which ends up at
btrfs' folio invalidate callback - btrfs_invalidate_folio(). There it
tries to lock the folio's range in inode Y's extent io tree, but it
blocks since it's currently locked by task A - during a reflink we lock
the inodes and the source and destination ranges after flushing all
delalloc and waiting for ordered extent completion - after that we
don't expect to have dirty folios in the ranges, the exception is if
we have to copy an inline extent's data (because the destination offset
is not zero);
4) Task A then attempts to start a transaction to update the inode item,
and then it's blocked since the current transaction is in the
TRANS_STATE_COMMIT_START state. Therefore task A has to wait for the
current transaction to become unblocked (its state >=
TRANS_STATE_UNBLOCKED).
So task A is waiting for the transaction commit done by task B, and
the later waiting on the extent lock of inode Y that is currently
held by task A.
Syzbot recently reported this with the following stack traces:
INFO: task kworker/u8:7:1053 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u8:7 state:D stack:23520 pid:1053 tgid:1053 ppid:2 task_flags:0x4208060 flags:0x00080000
Workqueue: writeback wb_workfn (flush-btrfs-46)
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wait_extent_bit fs/btrfs/extent-io-tree.c:811 [inline]
btrfs_lock_extent_bits+0x59c/0x700 fs/btrfs/extent-io-tree.c:1914
btrfs_lock_extent fs/btrfs/extent-io-tree.h:152 [inline]
btrfs_invalidate_folio+0x43d/0xc40 fs/btrfs/inode.c:7704
extent_writepage fs/btrfs/extent_io.c:1852 [inline]
extent_write_cache_pages fs/btrfs/extent_io.c:2580 [inline]
btrfs_writepages+0x12ff/0x2440 fs/btrfs/extent_io.c:2713
do_writepages+0x32e/0x550 mm/page-writeback.c:2554
__writeback_single_inode+0x133/0x11a0 fs/fs-writeback.c:1750
writeback_sb_inodes+0x995/0x19d0 fs/fs-writeback.c:2042
wb_writeback+0x456/0xb70 fs/fs-writeback.c:2227
wb_do_writeback fs/fs-writeback.c:2374 [inline]
wb_workfn+0x41a/0xf60 fs/fs-writeback.c:2414
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
INFO: task syz.4.64:6910 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.4.64 state:D stack:22752 pid:6910 tgid:
---truncated--- |