| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
drm/gem: Try to fix change_handle ioctl, attempt 4
[airlied: just added some comments on how to reenable]
On-list because the cat is out of the bag and we're clearly not good
enough to figure this out in private. The story thus far:
5e28b7b94408 ("drm: Set old handle to NULL before prime swap in
change_handle") tried to fix a race condition between the gem_close and
gem_change_handle ioctls, but got a few things wrong:
- There's a confusion with the local variable handle, which is actually
the new handle, and so the two-stage trick was actually applied to the
wrong idr slot. 7164d78559b0 ("drm/gem: fix race between
change_handle and handle_delete") tried to fix that by adding yet
another code block, but forgot to add the error handling. Which meant
we now have two paths, both kinda wrong.
- dc366607c41c ("drm: Replace old pointer to new idr") tried to apply
another fix, but inconsistently, again because of the handle confusion
- this would be the right fix (kinda, somewhat, it's a mess) if we'd
do the two-stage approach for the new handle. Except that wasn't the
intent of the original fix.
We also didn't have an igt merged for the original ioctl, which is a big
no-go. This was attempted to address off-list in the original bugfix,
and amd QA people claimed the bug was fixed now. Very clearly that's not
the case. Here's my attempt to sort this out:
- Rename the local variable to new_handle, the old aliasing with
args->handle is just too dangerously confusing.
- Merge the gem obj lookup with the two-stage idr_replace so that we
avoid getting ourselves confused there.
- This means we don't have a surplus temporary reference anymore, only
an inherited from the idr. A concurrent gem_close on the new_handle
could steal that. Fix that with the same two-stage approach
create_tail uses. This is a bit overkill as documented in the comment,
but I also don't trust my ability to understand this all correctly, so
go with the established pattern we have from other ioctls instead for
maximum paranoia.
- Adjust error paths. I've tried to make the error and success paths
common, because they are identical except for which handle is removed
and on which we call idr_replace to (re)install the object again. But
that made things messier to read, so I've left it at the more verbose
version, which unfortunately hides the symmetry in the entire code
flow a bit.
- While at it, also replace the 7 space indent with 1 tab.
And finally, because I flat out don't trust my abilities here at all
anymore:
- Disable the ioctl until we have the igt situation and everything else
sorted out on-list and with full consensus.
v2:
Sashiko noticed that I didn't handle the error path for idr_replace
correctly, it must be checked with IS_ERR_OR_NULL like in
gem_handle_delete. So yeah, definitely should just the existing paths
1:1 because this is endless amounts of tricky.
Also add the Fixes: line for the original ioctl, I forgot that too. |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/umem: Fix truncation for block sizes >= 4G
When the iommu is used the linearization of the mapping can give a single
block that is very large split across multiple SG entries.
When __rdma_block_iter_next() reassembles the split SG entries it is
overflowing the 32 bit stack values and computed the wrong DMA addresses
for blocks after the truncation.
Use the right types to hold DMA addresses. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: require Ethernet MAC header before using eth_hdr()
`ip6t_eui64`, `xt_mac`, the `bitmap:ip,mac`, `hash:ip,mac`, and
`hash:mac` ipset types, and `nf_log_syslog` access `eth_hdr(skb)`
after either assuming that the skb is associated with an Ethernet
device or checking only that the `ETH_HLEN` bytes at
`skb_mac_header(skb)` lie between `skb->head` and `skb->data`.
Make these paths first verify that the skb is associated with an
Ethernet device, that the MAC header was set, and that it spans at
least a full Ethernet header before accessing `eth_hdr(skb)`. |
| In the Linux kernel, the following vulnerability has been resolved:
s390/bpf: Zero-extend bpf prog return values and kfunc arguments
s390x ABI requires callers to zero-extend unsigned arguments and
sign-extend signed arguments, and callees to zero-extend unsigned
return values and sign-extend signed return values.
s390 BPF JIT currently implements only sign extension. Fix this
omission and implement zero extension too. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Use RCU-safe iteration in dev_map_redirect_multi() SKB path
The DEVMAP_HASH branch in dev_map_redirect_multi() uses
hlist_for_each_entry_safe() to iterate hash buckets, but this function
runs under RCU protection (called from xdp_do_generic_redirect_map()
in softirq context). Concurrent writers (__dev_map_hash_update_elem,
dev_map_hash_delete_elem) modify the list using RCU primitives
(hlist_add_head_rcu, hlist_del_rcu).
hlist_for_each_entry_safe() performs plain pointer dereferences without
rcu_dereference(), missing the acquire barrier needed to pair with
writers' rcu_assign_pointer(). On weakly-ordered architectures (ARM64,
POWER), a reader can observe a partially-constructed node. It also
defeats CONFIG_PROVE_RCU lockdep validation and KCSAN data-race
detection.
Replace with hlist_for_each_entry_rcu() using rcu_read_lock_bh_held()
as the lockdep condition, consistent with the rcu_dereference_check()
used in the DEVMAP (non-hash) branch of the same functions. Also fix
the same incorrect lockdep_is_held(&dtab->index_lock) condition in
dev_map_enqueue_multi(), where the lock is not held either. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix stale offload->prog pointer after constant blinding
When a dev-bound-only BPF program (BPF_F_XDP_DEV_BOUND_ONLY) undergoes
JIT compilation with constant blinding enabled (bpf_jit_harden >= 2),
bpf_jit_blind_constants() clones the program. The original prog is then
freed in bpf_jit_prog_release_other(), which updates aux->prog to point
to the surviving clone, but fails to update offload->prog.
This leaves offload->prog pointing to the freed original program. When
the network namespace is subsequently destroyed, cleanup_net() triggers
bpf_dev_bound_netdev_unregister(), which iterates ondev->progs and calls
__bpf_prog_offload_destroy(offload->prog). Accessing the freed prog
causes a page fault:
BUG: unable to handle page fault for address: ffffc900085f1038
Workqueue: netns cleanup_net
RIP: 0010:__bpf_prog_offload_destroy+0xc/0x80
Call Trace:
__bpf_offload_dev_netdev_unregister+0x257/0x350
bpf_dev_bound_netdev_unregister+0x4a/0x90
unregister_netdevice_many_notify+0x2a2/0x660
...
cleanup_net+0x21a/0x320
The test sequence that triggers this reliably is:
1. Set net.core.bpf_jit_harden=2 (echo 2 > /proc/sys/net/core/bpf_jit_harden)
2. Run xdp_metadata selftest, which creates a dev-bound-only XDP
program on a veth inside a netns (./test_progs -t xdp_metadata)
3. cleanup_net -> page fault in __bpf_prog_offload_destroy
Dev-bound-only programs are unique in that they have an offload structure
but go through the normal JIT path instead of bpf_prog_offload_compile().
This means they are subject to constant blinding's prog clone-and-replace,
while also having offload->prog that must stay in sync.
Fix this by updating offload->prog in bpf_jit_prog_release_other(),
alongside the existing aux->prog update. Both are back-pointers to
the prog that must be kept in sync when the prog is replaced. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix linked reg delta tracking when src_reg == dst_reg
Consider the case of rX += rX where src_reg and dst_reg are pointers to
the same bpf_reg_state in adjust_reg_min_max_vals(). The latter first
modifies the dst_reg in-place, and later in the delta tracking, the
subsequent is_reg_const(src_reg)/reg_const_value(src_reg) reads the
post-{add,sub} value instead of the original source.
This is problematic since it sets an incorrect delta, which sync_linked_regs()
then propagates to linked registers, thus creating a verifier-vs-runtime
mismatch. Fix it by just skipping this corner case. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix ld_{abs,ind} failure path analysis in subprogs
Usage of ld_{abs,ind} instructions got extended into subprogs some time
ago via commit 09b28d76eac4 ("bpf: Add abnormal return checks."). These
are only allowed in subprograms when the latter are BTF annotated and
have scalar return types.
The code generator in bpf_gen_ld_abs() has an abnormal exit path (r0=0 +
exit) from legacy cBPF times. While the enforcement is on scalar return
types, the verifier must also simulate the path of abnormal exit if the
packet data load via ld_{abs,ind} failed.
This is currently not the case. Fix it by having the verifier simulate
both success and failure paths, and extend it in similar ways as we do
for tail calls. The success path (r0=unknown, continue to next insn) is
pushed onto stack for later validation and the r0=0 and return to the
caller is done on the fall-through side. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bcmgenet: fix leaking free_bds
While reclaiming the tx queue we fast forward the write pointer to
drop any data in flight. These dropped frames are not added back
to the pool of free bds. We also need to tell the netdev that we
are dropping said data. |
| In the Linux kernel, the following vulnerability has been resolved:
net: bcmgenet: fix racing timeout handler
The bcmgenet_timeout handler tries to take down all tx queues when
a single queue times out. This is over zealous and causes many race
conditions with queues that are still chugging along. Instead lets
only restart the timed out queue. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: fix mm lifecycle in open-coded task_vma iterator
The open-coded task_vma iterator reads task->mm locklessly and acquires
mmap_read_trylock() but never calls mmget(). If the task exits
concurrently, the mm_struct can be freed as it is not
SLAB_TYPESAFE_BY_RCU, resulting in a use-after-free.
Safely read task->mm with a trylock on alloc_lock and acquire an mm
reference. Drop the reference via bpf_iter_mmput_async() in _destroy()
and error paths. bpf_iter_mmput_async() is a local wrapper around
mmput_async() with a fallback to mmput() on !CONFIG_MMU.
Reject irqs-disabled contexts (including NMI) up front. Operations used
by _next() and _destroy() (mmap_read_unlock, bpf_iter_mmput_async)
take spinlocks with IRQs disabled (pool->lock, pi_lock). Running from
NMI or from a tracepoint that fires with those locks held could
deadlock.
A trylock on alloc_lock is used instead of the blocking task_lock()
(get_task_mm) to avoid a deadlock when a softirq BPF program iterates
a task that already holds its alloc_lock on the same CPU. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars
When regsafe() compares two scalar registers that both carry
BPF_ADD_CONST, check_scalar_ids() maps their full compound id
(aka base | BPF_ADD_CONST flag) as one idmap entry. However,
it never verifies that the underlying base ids, that is, with
the flag stripped are consistent with existing idmap mappings.
This allows construction of two verifier states where the old
state has R3 = R2 + 10 (both sharing base id A) while the current
state has R3 = R4 + 10 (base id C, unrelated to R2). The idmap
creates two independent entries: A->B (for R2) and A|flag->C|flag
(for R3), without catching that A->C conflicts with A->B. State
pruning then incorrectly succeeds.
Fix this by additionally verifying base ID mapping consistency
whenever BPF_ADD_CONST is set: after mapping the compound ids,
also invoke check_ids() on the base IDs (flag bits stripped).
This ensures that if A was already mapped to B from comparing
the source register, any ADD_CONST derivative must also derive
from B, not an unrelated C. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix same-register dst/src OOB read and pointer leak in sock_ops
When a BPF sock_ops program accesses ctx fields with dst_reg == src_reg,
the SOCK_OPS_GET_SK() and SOCK_OPS_GET_FIELD() macros fail to zero the
destination register in the !fullsock / !locked_tcp_sock path.
Both macros borrow a temporary register to check is_fullsock /
is_locked_tcp_sock when dst_reg == src_reg, because dst_reg holds the
ctx pointer. When the check is false (e.g., TCP_NEW_SYN_RECV state with
a request_sock), dst_reg should be zeroed but is not, leaving the stale
ctx pointer:
- SOCK_OPS_GET_SK: dst_reg retains the ctx pointer, passes NULL checks
as PTR_TO_SOCKET_OR_NULL, and can be used as a bogus socket pointer,
leading to stack-out-of-bounds access in helpers like
bpf_skc_to_tcp6_sock().
- SOCK_OPS_GET_FIELD: dst_reg retains the ctx pointer which the
verifier believes is a SCALAR_VALUE, leaking a kernel pointer.
Fix both macros by:
- Changing JMP_A(1) to JMP_A(2) in the fullsock path to skip the
added instruction.
- Adding BPF_MOV64_IMM(si->dst_reg, 0) after the temp register
restore in the !fullsock path, placed after the restore because
dst_reg == src_reg means we need src_reg intact to read ctx->temp. |
| In the Linux kernel, the following vulnerability has been resolved:
net/rds: Restrict use of RDS/IB to the initial network namespace
Prevent using RDS/IB in network namespaces other than the initial one.
The existing RDS/IB code will not work properly in non-initial network
namespaces. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix OOB in pcpu_init_value
An out-of-bounds read occurs when copying element from a
BPF_MAP_TYPE_CGROUP_STORAGE map to another pcpu map with the
same value_size that is not rounded up to 8 bytes.
The issue happens when:
1. A CGROUP_STORAGE map is created with value_size not aligned to
8 bytes (e.g., 4 bytes)
2. A pcpu map is created with the same value_size (e.g., 4 bytes)
3. Update element in 2 with data in 1
pcpu_init_value assumes that all sources are rounded up to 8 bytes,
and invokes copy_map_value_long to make a data copy, However, the
assumption doesn't stand since there are some cases where the source
may not be rounded up to 8 bytes, e.g., CGROUP_STORAGE, skb->data.
the verifier verifies exactly the size that the source claims, not
the size rounded up to 8 bytes by kernel, an OOB happens when the
source has only 4 bytes while the copy size(4) is rounded up to 8. |
| In the Linux kernel, the following vulnerability has been resolved:
ppp: require CAP_NET_ADMIN in target netns for unattached ioctls
/dev/ppp open is currently authorized against file->f_cred->user_ns,
while unattached administrative ioctls operate on current->nsproxy->net_ns.
As a result, a local unprivileged user can create a new user namespace
with CLONE_NEWUSER, gain CAP_NET_ADMIN only in that new user namespace,
and still issue PPPIOCNEWUNIT, PPPIOCATTACH, or PPPIOCATTCHAN against
an inherited network namespace.
Require CAP_NET_ADMIN in the user namespace that owns the target network
namespace before handling unattached PPP administrative ioctls.
This preserves normal pppd operation in the network namespace it is
actually privileged in, while rejecting the userns-only inherited-netns
case. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
When protocol sets HCI_PROTO_DEFER, hci_conn_request_evt() calls
hci_connect_cfm(conn) without hdev->lock. Generally hci_connect_cfm()
assumes it is held, and if conn is deleted concurrently -> UAF.
Only SCO and ISO set HCI_PROTO_DEFER and only for defer setup listen,
and HCI_EV_CONN_REQUEST is not generated for ISO. In the non-deferred
listening socket code paths, hci_connect_cfm(conn) is called with
hdev->lock held.
Fix by holding the lock. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
l2cap_ecred_reconf_rsp() calls l2cap_chan_del() without holding
l2cap_chan_lock(). Every other l2cap_chan_del() caller in the file
acquires the lock first. A remote BLE device can send a crafted
L2CAP ECRED reconfiguration response to corrupt the channel list
while another thread is iterating it.
Add l2cap_chan_hold() and l2cap_chan_lock() before l2cap_chan_del(),
and l2cap_chan_unlock() and l2cap_chan_put() after, matching the
pattern used in l2cap_ecred_conn_rsp() and l2cap_conn_del(). |
| In the Linux kernel, the following vulnerability has been resolved:
sctp: disable BH before calling udp_tunnel_xmit_skb()
udp_tunnel_xmit_skb() / udp_tunnel6_xmit_skb() are expected to run with
BH disabled. After commit 6f1a9140ecda ("add xmit recursion limit to
tunnel xmit functions"), on the path:
udp(6)_tunnel_xmit_skb() -> ip(6)tunnel_xmit()
dev_xmit_recursion_inc()/dec() must stay balanced on the same CPU.
Without local_bh_disable(), the context may move between CPUs, which can
break the inc/dec pairing. This may lead to incorrect recursion level
detection and cause packets to be dropped in ip(6)_tunnel_xmit() or
__dev_queue_xmit().
Fix it by disabling BH around both IPv4 and IPv6 SCTP UDP xmit paths.
In my testing, after enabling the SCTP over UDP:
# ip net exec ha sysctl -w net.sctp.udp_port=9899
# ip net exec ha sysctl -w net.sctp.encap_port=9899
# ip net exec hb sysctl -w net.sctp.udp_port=9899
# ip net exec hb sysctl -w net.sctp.encap_port=9899
# ip net exec ha iperf3 -s
- without this patch:
# ip net exec hb iperf3 -c 192.168.0.1 --sctp
[ 5] 0.00-10.00 sec 37.2 MBytes 31.2 Mbits/sec sender
[ 5] 0.00-10.00 sec 37.1 MBytes 31.1 Mbits/sec receiver
- with this patch:
# ip net exec hb iperf3 -c 192.168.0.1 --sctp
[ 5] 0.00-10.00 sec 3.14 GBytes 2.69 Gbits/sec sender
[ 5] 0.00-10.00 sec 3.14 GBytes 2.69 Gbits/sec receiver |
| In the Linux kernel, the following vulnerability has been resolved:
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
syzkaller reported a kernel panic in bond_rr_gen_slave_id() reached via
xdp_master_redirect(). Full decoded trace:
https://syzkaller.appspot.com/bug?extid=80e046b8da2820b6ba73
bond_rr_gen_slave_id() dereferences bond->rr_tx_counter, a per-CPU
counter that bonding only allocates in bond_open() when the mode is
round-robin. If the bond device was never brought up, rr_tx_counter
stays NULL.
The XDP redirect path can still reach that code on a bond that was
never opened: bpf_master_redirect_enabled_key is a global static key,
so as soon as any bond device has native XDP attached, the
XDP_TX -> xdp_master_redirect() interception is enabled for every
slave system-wide. The path xdp_master_redirect() ->
bond_xdp_get_xmit_slave() -> bond_xdp_xmit_roundrobin_slave_get() ->
bond_rr_gen_slave_id() then runs against a bond that has no
rr_tx_counter and crashes.
Fix this in the generic xdp_master_redirect() by refusing to call into
the master's ->ndo_xdp_get_xmit_slave() when the master device is not
up. IFF_UP is only set after ->ndo_open() has successfully returned,
so this reliably excludes masters whose XDP state has not been fully
initialized. Drop the frame with XDP_ABORTED so the exception is
visible via trace_xdp_exception() rather than silently falling through.
This is not specific to bonding: any current or future master that
defers XDP state allocation to ->ndo_open() is protected. |