In the Linux kernel, the following vulnerability has been resolved: net: skbuff: preserve shared-frag marker during coalescing skb_try_coalesce() can attach paged frags from @from to @to. If @from has SKBFL_SHARED_FRAG set, the resulting @to skb can contain the same externally-owned or page-cache-backed frags, but the shared-frag marker is currently lost. That breaks the invariant relied on by later in-place writers. In particular, ESP input checks skb_has_shared_frag() before deciding whether an uncloned nonlinear skb can skip skb_cow_data(). If TCP receive coalescing has moved shared frags into an unmarked skb, ESP can see skb_has_shared_frag() as false and decrypt in place over page-cache backed frags. Propagate SKBFL_SHARED_FRAG when skb_try_coalesce() transfers paged frags. The tailroom copy path does not need the marker because it copies bytes into @to's linear data rather than transferring frag descriptors.
In the Linux kernel, the following vulnerability has been resolved: rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present The DATA-packet handler in rxrpc_input_call_event() and the RESPONSE handler in rxrpc_verify_response() copy the skb to a linear one before calling into the security ops only when skb_cloned() is true. An skb that is not cloned but still carries externally-owned paged fragments (e.g. SKBFL_SHARED_FRAG set by splice() into a UDP socket via __ip_append_data, or a chained skb_has_frag_list()) falls through to the in-place decryption path, which binds the frag pages directly into the AEAD/skcipher SGL via skb_to_sgvec(). Extend the gate to also unshare when skb_has_frag_list() or skb_has_shared_frag() is true. This catches the splice-loopback vector and other externally-shared frag sources while preserving the zero-copy fast path for skbs whose frags are kernel-private (e.g. NIC page_pool RX, GRO). The OOM/trace handling already in place is reused.
In the Linux kernel, the following vulnerability has been resolved: xfrm: esp: avoid in-place decrypt on shared skb frags MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(), so later paths that may modify packet data can first make a private copy. The IPv4/IPv6 datagram append paths did not set this flag when splicing pages into UDP skbs. That leaves an ESP-in-UDP packet made from shared pipe pages looking like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW fast path for uncloned skbs without a frag_list and decrypts in place over data that is not owned privately by the skb. Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching TCP. Also make ESP input fall back to skb_cow_data() when the flag is present, so ESP does not decrypt externally backed frags in place. Private nonlinear skb frags still use the existing fast path. This intentionally does not change ESP output. In esp_output_head(), the path that appends the ESP trailer to existing skb tailroom without calling skb_cow_data() is not reachable for nonlinear skbs: skb_tailroom() returns zero when skb->data_len is nonzero, while ESP tailen is positive. Thus ESP output will either use the separate destination-frag path or fall back to skb_cow_data().
A use-after-free flaw was found in X.Org and Xwayland. When changing an alarm, the values of the change mask are evaluated one after the other, changing the trigger values as requested, and eventually, SyncInitTrigger() is called. If one of the changes triggers an error, the function will return early, not adding the new sync object, possibly causing a use-after-free when the alarm eventually triggers.
There is a possible tty hijacking in shadow 4.x before 4.1.5 and sudo 1.x before 1.7.4 via "su - user -c program". The user session can be escaped to the parent session by using the TIOCSTI ioctl to push characters into the input buffer to be read by the next process.
A potential security vulnerability has been identified in the HP Linux Imaging and Printing Software. This potential vulnerability may allow escalation of privileges and/or arbitrary code execution via operating system command injection.
In the Linux kernel, the following vulnerability has been resolved: gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp() The gtp_net_ops pernet operations structure for the subsystem must be registered before registering the generic netlink family. Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 PID: 5826 Comm: gtp Not tainted 6.8.0-rc3-std-def-alt1 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014 RIP: 0010:gtp_genl_dump_pdp+0x1be/0x800 [gtp] Code: c6 89 c6 e8 64 e9 86 df 58 45 85 f6 0f 85 4e 04 00 00 e8 c5 ee 86 df 48 8b 54 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 de 05 00 00 48 8b 44 24 18 4c 8b 30 4c 39 f0 74 RSP: 0018:ffff888014107220 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88800fcda588 R14: 0000000000000001 R15: 0000000000000000 FS: 00007f1be4eb05c0(0000) GS:ffff88806ce80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1be4e766cf CR3: 000000000c33e000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? show_regs+0x90/0xa0 ? die_addr+0x50/0xd0 ? exc_general_protection+0x148/0x220 ? asm_exc_general_protection+0x22/0x30 ? gtp_genl_dump_pdp+0x1be/0x800 [gtp] ? __alloc_skb+0x1dd/0x350 ? __pfx___alloc_skb+0x10/0x10 genl_dumpit+0x11d/0x230 netlink_dump+0x5b9/0xce0 ? lockdep_hardirqs_on_prepare+0x253/0x430 ? __pfx_netlink_dump+0x10/0x10 ? kasan_save_track+0x10/0x40 ? __kasan_kmalloc+0x9b/0xa0 ? genl_start+0x675/0x970 __netlink_dump_start+0x6fc/0x9f0 genl_family_rcv_msg_dumpit+0x1bb/0x2d0 ? __pfx_genl_family_rcv_msg_dumpit+0x10/0x10 ? genl_op_from_small+0x2a/0x440 ? cap_capable+0x1d0/0x240 ? __pfx_genl_start+0x10/0x10 ? __pfx_genl_dumpit+0x10/0x10 ? __pfx_genl_done+0x10/0x10 ? security_capable+0x9d/0xe0
In the Linux kernel, the following vulnerability has been resolved: xfs: do not propagate ENODATA disk errors into xattr code ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code; namely, that the requested attribute name could not be found. However, a medium error from disk may also return ENODATA. At best, this medium error may escape to userspace as "attribute not found" when in fact it's an IO (disk) error. At worst, we may oops in xfs_attr_leaf_get() when we do: error = xfs_attr_leaf_hasname(args, &bp); if (error == -ENOATTR) { xfs_trans_brelse(args->trans, bp); return error; } because an ENODATA/ENOATTR error from disk leaves us with a null bp, and the xfs_trans_brelse will then null-deref it. As discussed on the list, we really need to modify the lower level IO functions to trap all disk errors and ensure that we don't let unique errors like this leak up into higher xfs functions - many like this should be remapped to EIO. However, this patch directly addresses a reported bug in the xattr code, and should be safe to backport to stable kernels. A larger-scope patch to handle more unique errors at lower levels can follow later. (Note, prior to 07120f1abdff we did not oops, but we did return the wrong error code to userspace.)
In the Linux kernel, the following vulnerability has been resolved: llc: call sock_orphan() at release time syzbot reported an interesting trace [1] caused by a stale sk->sk_wq pointer in a closed llc socket. In commit ff7b11aa481f ("net: socket: set sock->sk to NULL after calling proto_ops::release()") Eric Biggers hinted that some protocols are missing a sock_orphan(), we need to perform a full audit. In net-next, I plan to clear sock->sk from sock_orphan() and amend Eric patch to add a warning. [1] BUG: KASAN: slab-use-after-free in list_empty include/linux/list.h:373 [inline] BUG: KASAN: slab-use-after-free in waitqueue_active include/linux/wait.h:127 [inline] BUG: KASAN: slab-use-after-free in sock_def_write_space_wfree net/core/sock.c:3384 [inline] BUG: KASAN: slab-use-after-free in sock_wfree+0x9a8/0x9d0 net/core/sock.c:2468 Read of size 8 at addr ffff88802f4fc880 by task ksoftirqd/1/27 CPU: 1 PID: 27 Comm: ksoftirqd/1 Not tainted 6.8.0-rc1-syzkaller-00049-g6098d87eaf31 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:377 [inline] print_report+0xc4/0x620 mm/kasan/report.c:488 kasan_report+0xda/0x110 mm/kasan/report.c:601 list_empty include/linux/list.h:373 [inline] waitqueue_active include/linux/wait.h:127 [inline] sock_def_write_space_wfree net/core/sock.c:3384 [inline] sock_wfree+0x9a8/0x9d0 net/core/sock.c:2468 skb_release_head_state+0xa3/0x2b0 net/core/skbuff.c:1080 skb_release_all net/core/skbuff.c:1092 [inline] napi_consume_skb+0x119/0x2b0 net/core/skbuff.c:1404 e1000_unmap_and_free_tx_resource+0x144/0x200 drivers/net/ethernet/intel/e1000/e1000_main.c:1970 e1000_clean_tx_irq drivers/net/ethernet/intel/e1000/e1000_main.c:3860 [inline] e1000_clean+0x4a1/0x26e0 drivers/net/ethernet/intel/e1000/e1000_main.c:3801 __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6576 napi_poll net/core/dev.c:6645 [inline] net_rx_action+0x956/0xe90 net/core/dev.c:6778 __do_softirq+0x21a/0x8de kernel/softirq.c:553 run_ksoftirqd kernel/softirq.c:921 [inline] run_ksoftirqd+0x31/0x60 kernel/softirq.c:913 smpboot_thread_fn+0x660/0xa10 kernel/smpboot.c:164 kthread+0x2c6/0x3a0 kernel/kthread.c:388 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 </TASK> Allocated by task 5167: kasan_save_stack+0x33/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:314 [inline] __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:340 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3813 [inline] slab_alloc_node mm/slub.c:3860 [inline] kmem_cache_alloc_lru+0x142/0x6f0 mm/slub.c:3879 alloc_inode_sb include/linux/fs.h:3019 [inline] sock_alloc_inode+0x25/0x1c0 net/socket.c:308 alloc_inode+0x5d/0x220 fs/inode.c:260 new_inode_pseudo+0x16/0x80 fs/inode.c:1005 sock_alloc+0x40/0x270 net/socket.c:634 __sock_create+0xbc/0x800 net/socket.c:1535 sock_create net/socket.c:1622 [inline] __sys_socket_create net/socket.c:1659 [inline] __sys_socket+0x14c/0x260 net/socket.c:1706 __do_sys_socket net/socket.c:1720 [inline] __se_sys_socket net/socket.c:1718 [inline] __x64_sys_socket+0x72/0xb0 net/socket.c:1718 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd3/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Freed by task 0: kasan_save_stack+0x33/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3f/0x60 mm/kasan/generic.c:640 poison_slab_object mm/kasan/common.c:241 [inline] __kasan_slab_free+0x121/0x1b0 mm/kasan/common.c:257 kasan_slab_free include/linux/kasan.h:184 [inline] slab_free_hook mm/slub.c:2121 [inlin ---truncated---
In the Linux kernel, the following vulnerability has been resolved: ALSA: timer: Forcibly close timer instances at closing When snd_timer object is freed via snd_timer_free() and still pending snd_timer_instance objects are assigned to the timer object, it tries to unlink all instances and just set NULL to each ti->timer, then releases the resources immediately. The problem is, however, when there are slave timer instances that are associated with a master instance linked to this timer: namely, those slave instances still point to the freed timer object although the master instance is unlinked, which may lead to user-after-free. The bug can be easily triggered particularly when a new userspace-driven timers (CONFIG_SND_UTIMER) is involved, since it can create and delete the timer object via a simple file open/close, while the other applications may keep accessing to that timer. This patch is an attempt to paper over the problem above: now instead of just unlinking, call snd_timer_close[_locked]() forcibly for each pending timer instance, so that all assigned slave timer instances are properly detached, too. Since snd_timer_close() might be called later by the driver that created that instance, the check of SNDRV_TIMER_IFLG_DEAD is added at the beginning, too.
In the Linux kernel, the following vulnerability has been resolved: mm/huge_memory: update file PMD counter before folio_put() __split_huge_pmd_locked() updates the file/shmem RSS counter after dropping the PMD mapping's folio reference. If folio_put() drops the last reference, mm_counter_file() can later read freed folio state via folio_test_swapbacked(). Move the counter update before folio_put().
A vulnerability was found in libX11 due to an integer overflow within the XCreateImage() function. This flaw allows a local user to trigger an integer overflow and execute arbitrary code with elevated privileges.
A symlink following vulnerability was found in the ABRT post-create event handler scripts in libreport. Event scripts write output files using shell redirections without the O_NOFOLLOW flag. If the target file is replaced with a symlink, the shell process running as root follows the symlink and writes content to the symlink target, allowing arbitrary file overwrites on the system.
In the Linux kernel, the following vulnerability has been resolved: Revert "drm/xe: Skip exec queue schedule toggle if queue is idle during suspend" This reverts commit 8533051ce92015e9cc6f75e0d52119b9d91610b6. The idle-skip optimization bypasses GuC suspend, so the GPU may not perform the context switch that flushes TLB entries for invalidated userptr VMAs. In LR/preempt-fence VM mode, this can lead to missed TLB invalidation and page faults during userptr invalidation tests. Restore unconditional schedule toggling on suspend so the context-switch TLB flush is always performed. This optimization will be reintroduced with a fix that does not skip suspend in LR/preempt-fence VM mode. (cherry picked from commit 6a1e7934d9a6cf46aecae00a99c2603d1295e170)
In the Linux kernel, the following vulnerability has been resolved: mm/list_lru: drain before clearing xarray entry on reparent memcg_reparent_list_lrus() clears the dying memcg's xarray entry with xas_store(&xas, NULL) before reparenting its per-node lists into the parent. This opens a window where a concurrent list_lru_del() arriving for the dying memcg sees xa_load() == NULL, walks to the parent in lock_list_lru_of_memcg(), takes the parent's per-node lock, and calls list_del_init() on an item still physically linked on the dying memcg's list. If another in-flight thread holds the dying memcg's per-node lock at the same moment (another list_lru_del, or a list_lru_walk_one running an isolate callback), both threads modify ->next/->prev pointers on the same physical list under different locks. Adjacent items can corrupt each other's links. Fix it by reversing the order: reparent each per-node list and mark the child's list lru dead and then clear the xarray entry. Any concurrent list_lru op that finds the still-set xarray entry either takes the dying memcg's per-node lock (synchronizing with the drain) or sees LONG_MIN and walks to the parent, where the items now live.
In the Linux kernel, the following vulnerability has been resolved: netfilter: nft_tunnel: fix use-after-free on object destroy nft_tunnel_obj_destroy() calls metadata_dst_free() which directly kfree()s the metadata_dst, ignoring the dst_entry refcount. Packets that took a reference via dst_hold() in nft_tunnel_obj_eval() and are still queued (e.g. in a netem qdisc) are left with a dangling pointer. When these packets are eventually dequeued, dst_release() operates on freed memory. Replace metadata_dst_free() with dst_release() so the metadata_dst is freed only after all references are dropped. The dst subsystem already handles metadata_dst cleanup in dst_destroy() when DST_METADATA is set.
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: hci_sync: reject oversized Broadcast Announcement prepend Existing advertising instances can already hold the maximum extended advertising payload. When hci_adv_bcast_annoucement() prepends the Broadcast Announcement service data to that payload, the combined data may no longer fit in the temporary buffer used to rebuild the advertising data. Reject that case before copying the existing payload and report the failure through the device log. This keeps the existing advertising data intact and avoids overrunning the temporary buffer.
In the Linux kernel, the following vulnerability has been resolved: drm/xe/eustall: Fix drm_dev_put called before stream disable in close In xe_eu_stall_stream_close(), drm_dev_put() is called before the stream is disabled and its resources are freed. If this drops the last reference, the device structures could be freed while the subsequent cleanup code still accesses them, leading to a use-after-free. Fix this by moving drm_dev_put() after all device accesses are complete. This matches the ordering in xe_oa_release(). (cherry picked from commit 35aff528f7297e949e5e19c9cd7fd748cf1cf21c)
In the Linux kernel, the following vulnerability has been resolved: s390/bpf: Zero-extend bpf prog return values and kfunc arguments s390x ABI requires callers to zero-extend unsigned arguments and sign-extend signed arguments, and callees to zero-extend unsigned return values and sign-extend signed return values. s390 BPF JIT currently implements only sign extension. Fix this omission and implement zero extension too.
A time-of-check time-of-use (TOCTOU) race condition was found in the abrt-dbus D-Bus service's SetElement method. Between dump directory creation and post-create event execution, any local user can call SetElement to write arbitrary text files into the root-owned dump directory, bypassing package validation and allowing crashes of unpackaged binaries to survive post-create processing.
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: ISO: Fix a use-after-free of the hci_conn pointer In iso_sock_rebind_bc(), the bis pointer is cached, then the socket lock is dropped: bis = iso_pi(sk)->conn->hcon; /* Release the socket before lookups since that requires hci_dev_lock * which shall not be acquired while holding sock_lock for proper * ordering. */ release_sock(sk); hci_dev_lock(bis->hdev); During the unlocked window, could a concurrent close() destroy the connection and free the bis structure, causing hci_dev_lock(bis->hdev) to access memory after it is freed, fix this by using the hdev reference which was safely acquired via iso_conn_get_hdev().
In the Linux kernel, the following vulnerability has been resolved: drm/gem: Try to fix change_handle ioctl, attempt 4 [airlied: just added some comments on how to reenable] On-list because the cat is out of the bag and we're clearly not good enough to figure this out in private. The story thus far: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") tried to fix a race condition between the gem_close and gem_change_handle ioctls, but got a few things wrong: - There's a confusion with the local variable handle, which is actually the new handle, and so the two-stage trick was actually applied to the wrong idr slot. 7164d78559b0 ("drm/gem: fix race between change_handle and handle_delete") tried to fix that by adding yet another code block, but forgot to add the error handling. Which meant we now have two paths, both kinda wrong. - dc366607c41c ("drm: Replace old pointer to new idr") tried to apply another fix, but inconsistently, again because of the handle confusion - this would be the right fix (kinda, somewhat, it's a mess) if we'd do the two-stage approach for the new handle. Except that wasn't the intent of the original fix. We also didn't have an igt merged for the original ioctl, which is a big no-go. This was attempted to address off-list in the original bugfix, and amd QA people claimed the bug was fixed now. Very clearly that's not the case. Here's my attempt to sort this out: - Rename the local variable to new_handle, the old aliasing with args->handle is just too dangerously confusing. - Merge the gem obj lookup with the two-stage idr_replace so that we avoid getting ourselves confused there. - This means we don't have a surplus temporary reference anymore, only an inherited from the idr. A concurrent gem_close on the new_handle could steal that. Fix that with the same two-stage approach create_tail uses. This is a bit overkill as documented in the comment, but I also don't trust my ability to understand this all correctly, so go with the established pattern we have from other ioctls instead for maximum paranoia. - Adjust error paths. I've tried to make the error and success paths common, because they are identical except for which handle is removed and on which we call idr_replace to (re)install the object again. But that made things messier to read, so I've left it at the more verbose version, which unfortunately hides the symmetry in the entire code flow a bit. - While at it, also replace the 7 space indent with 1 tab. And finally, because I flat out don't trust my abilities here at all anymore: - Disable the ioctl until we have the igt situation and everything else sorted out on-list and with full consensus. v2: Sashiko noticed that I didn't handle the error path for idr_replace correctly, it must be checked with IS_ERR_OR_NULL like in gem_handle_delete. So yeah, definitely should just the existing paths 1:1 because this is endless amounts of tricky. Also add the Fixes: line for the original ioctl, I forgot that too.
In the Linux kernel, the following vulnerability has been resolved: net/sched: act_api: use RCU with deferred freeing for action lifecycle When NEWTFILTER and DELFILTER are run concurrently it is possible to create a race with an associated action. Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER: 0: mutex_lock() <-- holds the idr lock 0: rcu_read_lock() 0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR) 0: mutex_unlock() <-- releases the idr lock 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held 1: idr_remove(idr, index) <-- Action removed from IDR 1: mutex_unlock() <-- mutex released allowing us to delete the action 1: tcf_action_cleanup(p); kfree(p) <-- Kfrees p immediately, no deferral 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- ouch, UAF p points to freed memory This patch fixes the race condition between NEWTFILTER and DELFILTER by adding struct rcu_head to tc_action used in the deferral and introducing a call_rcu() in the delete path to defer the final kfree(). Note: this is a revert of commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu") but also modernization/simplification to directly use kfree_rcu(). Let's illustrate the new restored code path: 0: rcu_read_lock() 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held 1: idr_remove(idr, index) 1: mutex_unlock() 1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period 0: p = idr_find(idr, index) 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0 1: rcu_read_unlock() <-- release so freeing can run after grace period After CPU1 calls idr_remove(), the object is no longer reachable through the IDR. CPU0's subsequent idr_find() will return NULL, and even if it still held a stale pointer, the immediate kfree() is now deferred until after the RCU grace period, so no UAF can occur.
In the Linux kernel, the following vulnerability has been resolved: ALSA: timer: Fix UAF at snd_timer_user_params() At releasing a timer object, e.g. when a userspace timer (CONFIG_SND_UTIMER) gets closed and snd_timer_free() is called, it tries to detach the timer instances and release the resources. However, it's still possible that other in-flight tasks are holding the timer instance where the to-be-deleted timer object is associated, and this may lead to racy accesses. Fortunately, most of ioctls dealing with the timer instance list already have the protection with register_mutex, and this also avoids such races. But, SNDRV_TIMER_IOCTL_PARAMS isn't protected, hence the concurrent ioctl may lead to use-after-free. This patch just adds the guard with register_mutex to protect snd_timer_user_params() for covering the code path as a quick workaround. It's no hot-path but rather a rarely issued ioctl, so the performance penalty doesn't matter.
In the Linux kernel, the following vulnerability has been resolved: ipv6: anycast: insert aca into global hash under idev->lock syzbot reported a splat [1]: a slab-use-after-free in ipv6_chk_acast_addr(), which walks the global inet6_acaddr_lst[] hash under RCU and dereferences a struct ifacaddr6 that has already been freed while still linked in the hash, so a later reader walks into a dangling node. In __ipv6_dev_ac_inc() the aca is allocated with refcount 1, then aca_get() bumps it to 2 to keep it alive across the unlocked region. It is published to idev->ac_list under idev->lock, but ipv6_add_acaddr_hash() runs after write_unlock_bh(). A concurrent teardown (ipv6_ac_destroy_dev() from addrconf_ifdown(), under RTNL) can slip into that window: CPU0 __ipv6_dev_ac_inc CPU1 ipv6_ac_destroy_dev (RTNL) ------------------------------ ------------------------------------ aca_alloc() refcnt 1 aca_get() refcnt 2 write_lock_bh(idev->lock) add aca to ac_list write_unlock_bh(idev->lock) write_lock_bh(idev->lock) pull aca off ac_list write_unlock_bh(idev->lock) ipv6_del_acaddr_hash(aca) hlist_del_init_rcu() is a no-op, aca is not in the hash yet aca_put() refcnt 2->1 ipv6_add_acaddr_hash(aca) aca now inserted into the hash aca_put() refcnt 1->0 call_rcu(aca_free_rcu) -> kfree(aca) The hash removal becomes a no-op because the insertion has not happened yet, so once CPU0 inserts and drops the last reference, the aca is freed while still linked in inet6_acaddr_lst[], and readers dereference freed memory after the slab slot is reused. This window opened once RTNL stopped serializing the join path against device teardown. Move ipv6_add_acaddr_hash() inside the idev->lock section so the ac_list and hash insertions are atomic with respect to teardown: a racing remover now either misses the aca entirely or finds it in both lists. acaddr_hash_lock is now nested under idev->lock, which is acquired in softirq context, so switch all acaddr_hash_lock sites to spin_lock_bh() to avoid the irq lock inversion reported in [2]. [1] https://syzkaller.appspot.com/bug?extid=a01df04303c131efbf3a [2] https://lore.kernel.org/netdev/6a194ef7.ba3b1513.1890b4.0000.GAE@google.com/
In the Linux kernel, the following vulnerability has been resolved: f2fs: fix scheduling while atomic in decompression path [ 16.945668][ C0] Call trace: [ 16.945678][ C0] dump_backtrace+0x110/0x204 [ 16.945706][ C0] dump_stack_lvl+0x84/0xbc [ 16.945735][ C0] __schedule_bug+0xb8/0x1ac [ 16.945756][ C0] __schedule+0x724/0xbdc [ 16.945778][ C0] schedule+0x154/0x258 [ 16.945793][ C0] bit_wait_io+0x48/0xa4 [ 16.945808][ C0] out_of_line_wait_on_bit+0x114/0x198 [ 16.945824][ C0] __sync_dirty_buffer+0x1f8/0x2e8 [ 16.945853][ C0] __f2fs_commit_super+0x140/0x1f4 [ 16.945881][ C0] f2fs_commit_super+0x110/0x28c [ 16.945898][ C0] f2fs_handle_error+0x1f4/0x2f4 [ 16.945917][ C0] f2fs_decompress_cluster+0xc4/0x450 [ 16.945942][ C0] f2fs_end_read_compressed_page+0xc0/0xfc [ 16.945959][ C0] f2fs_handle_step_decompress+0x118/0x1cc [ 16.945978][ C0] f2fs_read_end_io+0x168/0x2b0 [ 16.945993][ C0] bio_endio+0x25c/0x2c8 [ 16.946015][ C0] dm_io_dec_pending+0x3e8/0x57c [ 16.946052][ C0] clone_endio+0x134/0x254 [ 16.946069][ C0] bio_endio+0x25c/0x2c8 [ 16.946084][ C0] blk_update_request+0x1d4/0x478 [ 16.946103][ C0] scsi_end_request+0x38/0x4cc [ 16.946129][ C0] scsi_io_completion+0x94/0x184 [ 16.946147][ C0] scsi_finish_command+0xe8/0x154 [ 16.946164][ C0] scsi_complete+0x90/0x1d8 [ 16.946181][ C0] blk_done_softirq+0xa4/0x11c [ 16.946198][ C0] _stext+0x184/0x614 [ 16.946214][ C0] __irq_exit_rcu+0x78/0x144 [ 16.946234][ C0] handle_domain_irq+0xd4/0x154 [ 16.946260][ C0] gic_handle_irq.33881+0x5c/0x27c [ 16.946281][ C0] call_on_irq_stack+0x40/0x70 [ 16.946298][ C0] do_interrupt_handler+0x48/0xa4 [ 16.946313][ C0] el1_interrupt+0x38/0x68 [ 16.946346][ C0] el1h_64_irq_handler+0x20/0x30 [ 16.946362][ C0] el1h_64_irq+0x78/0x7c [ 16.946377][ C0] finish_task_switch+0xc8/0x3d8 [ 16.946394][ C0] __schedule+0x600/0xbdc [ 16.946408][ C0] preempt_schedule_common+0x34/0x5c [ 16.946423][ C0] preempt_schedule+0x44/0x48 [ 16.946438][ C0] process_one_work+0x30c/0x550 [ 16.946456][ C0] worker_thread+0x414/0x8bc [ 16.946472][ C0] kthread+0x16c/0x1e0 [ 16.946486][ C0] ret_from_fork+0x10/0x20
In the Linux kernel, the following vulnerability has been resolved: RDMA/umem: Fix truncation for block sizes >= 4G When the iommu is used the linearization of the mapping can give a single block that is very large split across multiple SG entries. When __rdma_block_iter_next() reassembles the split SG entries it is overflowing the 32 bit stack values and computed the wrong DMA addresses for blocks after the truncation. Use the right types to hold DMA addresses.
In the Linux kernel, the following vulnerability has been resolved: PCI: Fix use-after-free in pci_bus_release_domain_nr() Commit c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") introduced a use-after-free bug in the bus removal cleanup. The issue was found with kfence: [ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70 [ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115): [ 19.309677] pci_bus_release_domain_nr+0x10/0x70 [ 19.309691] dw_pcie_host_deinit+0x28/0x78 [ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.309752] platform_probe+0x90/0xd8 ... [ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k [ 19.311469] allocated by task 96 on cpu 10 at 19.279323s: [ 19.311562] __kmem_cache_alloc_node+0x260/0x278 [ 19.311571] kmalloc_trace+0x24/0x30 [ 19.311580] pci_alloc_bus+0x24/0xa0 [ 19.311590] pci_register_host_bridge+0x48/0x4b8 [ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8 [ 19.311613] pci_host_probe+0x18/0xc0 [ 19.311623] dw_pcie_host_init+0x2c0/0x568 [ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194] [ 19.311647] platform_probe+0x90/0xd8 ... [ 19.311782] freed by task 96 on cpu 10 at 19.285833s: [ 19.311799] release_pcibus_dev+0x30/0x40 [ 19.311808] device_release+0x30/0x90 [ 19.311814] kobject_put+0xa8/0x120 [ 19.311832] device_unregister+0x20/0x30 [ 19.311839] pci_remove_bus+0x78/0x88 [ 19.311850] pci_remove_root_bus+0x5c/0x98 [ 19.311860] dw_pcie_host_deinit+0x28/0x78 [ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.311900] platform_probe+0x90/0xd8 ... [ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4 [ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022 [ 19.325852] Workqueue: events_unbound deferred_probe_work_func The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't directly call pci_bus_release_domain_nr(). The issue turns out to be in pci_remove_root_bus() which first calls pci_remove_bus() which frees the struct pci_bus when its struct device is released. Then pci_bus_release_domain_nr() is called and accesses the freed struct pci_bus. Reordering these fixes the issue.
In the Linux kernel, the following vulnerability has been resolved: ALSA: PCM: Fix wait queue list corruption in snd_pcm_drain() on linked streams snd_pcm_drain() uses init_waitqueue_entry which does not clear entry.prev/next, and add_wait_queue with a conditional remove_wait_queue that is skipped when to_check is no longer in the group after concurrent UNLINK. The orphaned wait entry remains on the unlinked substream sleep queue. On the next drain iteration, add_wait_queue adds the entry to a new queue while still linked on the old one, corrupting both lists. A subsequent wake_up dereferences NULL at the func pointer (mapped from the spinlock at offset 0 of the misinterpreted wait_queue_head_t), causing a kernel panic. Replace init_waitqueue_entry/add_wait_queue/conditional remove_wait_queue with init_wait_entry/prepare_to_wait/ finish_wait. init_wait_entry clears prev/next via INIT_LIST_HEAD on each iteration and sets autoremove_wake_function which auto-removes the entry on wake-up. finish_wait safely handles both the already-removed and still-queued cases.
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix linked reg delta tracking when src_reg == dst_reg Consider the case of rX += rX where src_reg and dst_reg are pointers to the same bpf_reg_state in adjust_reg_min_max_vals(). The latter first modifies the dst_reg in-place, and later in the delta tracking, the subsequent is_reg_const(src_reg)/reg_const_value(src_reg) reads the post-{add,sub} value instead of the original source. This is problematic since it sets an incorrect delta, which sync_linked_regs() then propagates to linked registers, thus creating a verifier-vs-runtime mismatch. Fix it by just skipping this corner case.
A stack-based buffer overflow flaw was found in the X.Org X server and Xwayland. _XkbSetMapChecks() declares a fixed-size stack buffer mapWidths[256] indexed by key type index. The helper function CheckKeyTypes() writes to this buffer at a client-controlled offset, allowing a stack buffer overflow. This may be used to crash the server, or for privilege escalation if the X server runs as root.
In the Linux kernel, the following vulnerability has been resolved: media: rockchip: rkcif: fix off by one bugs Change these comparisons from > vs >= to avoid accessing one element beyond the end of the arrays. While at it, use ARRAY_SIZE instead of the _MAX enum values. [fix cosmetic issues]
In the Linux kernel, the following vulnerability has been resolved: net_sched: sch_sfq: move the limit validation It is not sufficient to directly validate the limit on the data that the user passes as it can be updated based on how the other parameters are changed. Move the check at the end of the configuration update process to also catch scenarios where the limit is indirectly updated, for example with the following configurations: tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1 tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 divisor 1 This fixes the following syzkaller reported crash: ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:203:6 index 65535 is out of range for type 'struct sfq_head[128]' CPU: 1 UID: 0 PID: 3037 Comm: syz.2.16 Not tainted 6.14.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x201/0x300 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0xf5/0x120 lib/ubsan.c:429 sfq_link net/sched/sch_sfq.c:203 [inline] sfq_dec+0x53c/0x610 net/sched/sch_sfq.c:231 sfq_dequeue+0x34e/0x8c0 net/sched/sch_sfq.c:493 sfq_reset+0x17/0x60 net/sched/sch_sfq.c:518 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 tbf_reset+0x41/0x110 net/sched/sch_tbf.c:339 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 dev_reset_queue+0x100/0x1b0 net/sched/sch_generic.c:1311 netdev_for_each_tx_queue include/linux/netdevice.h:2590 [inline] dev_deactivate_many+0x7e5/0xe70 net/sched/sch_generic.c:1375
In the Linux kernel, the following vulnerability has been resolved: bonding: 3ad: implement proper RCU rules for port->aggregator syzbot found a data-race in bond_3ad_get_active_agg_info / bond_3ad_state_machine_handler [1] which hints at lack of proper RCU implementation. Add __rcu qualifier to port->aggregator, and add proper RCU API. [1] BUG: KCSAN: data-race in bond_3ad_get_active_agg_info / bond_3ad_state_machine_handler write to 0xffff88813cf5c4b0 of 8 bytes by task 36 on cpu 0: ad_port_selection_logic drivers/net/bonding/bond_3ad.c:1659 [inline] bond_3ad_state_machine_handler+0x9d5/0x2d60 drivers/net/bonding/bond_3ad.c:2569 process_one_work kernel/workqueue.c:3302 [inline] process_scheduled_works+0x4f0/0x9c0 kernel/workqueue.c:3385 worker_thread+0x58a/0x780 kernel/workqueue.c:3466 kthread+0x22a/0x280 kernel/kthread.c:436 ret_from_fork+0x146/0x330 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 read to 0xffff88813cf5c4b0 of 8 bytes by task 22063 on cpu 1: __bond_3ad_get_active_agg_info drivers/net/bonding/bond_3ad.c:2858 [inline] bond_3ad_get_active_agg_info+0x8c/0x230 drivers/net/bonding/bond_3ad.c:2881 bond_fill_info+0xe0f/0x10f0 drivers/net/bonding/bond_netlink.c:853 rtnl_link_info_fill net/core/rtnetlink.c:906 [inline] rtnl_link_fill+0x1d7/0x4e0 net/core/rtnetlink.c:927 rtnl_fill_ifinfo+0xf8e/0x1380 net/core/rtnetlink.c:2168 rtmsg_ifinfo_build_skb+0x11c/0x1b0 net/core/rtnetlink.c:4453 rtmsg_ifinfo_event net/core/rtnetlink.c:4486 [inline] rtmsg_ifinfo+0x6d/0x110 net/core/rtnetlink.c:4495 __dev_notify_flags+0x76/0x390 net/core/dev.c:9790 netif_change_flags+0xac/0xd0 net/core/dev.c:9823 do_setlink+0x905/0x2950 net/core/rtnetlink.c:3180 rtnl_group_changelink net/core/rtnetlink.c:3813 [inline] __rtnl_newlink net/core/rtnetlink.c:3981 [inline] rtnl_newlink+0xf55/0x1400 net/core/rtnetlink.c:4109 rtnetlink_rcv_msg+0x64b/0x720 net/core/rtnetlink.c:6995 netlink_rcv_skb+0x123/0x220 net/netlink/af_netlink.c:2550 rtnetlink_rcv+0x1c/0x30 net/core/rtnetlink.c:7022 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x5a8/0x680 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x5c8/0x6f0 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:787 [inline] __sock_sendmsg net/socket.c:802 [inline] ____sys_sendmsg+0x563/0x5b0 net/socket.c:2698 ___sys_sendmsg+0x195/0x1e0 net/socket.c:2752 __sys_sendmsg net/socket.c:2784 [inline] __do_sys_sendmsg net/socket.c:2789 [inline] __se_sys_sendmsg net/socket.c:2787 [inline] __x64_sys_sendmsg+0xd4/0x160 net/socket.c:2787 x64_sys_call+0x194c/0x3020 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x12c/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f value changed: 0x0000000000000000 -> 0xffff88813cf5c400 Reported by Kernel Concurrency Sanitizer on: CPU: 1 UID: 0 PID: 22063 Comm: syz.0.31122 Tainted: G W syzkaller #0 PREEMPT(full) Tainted: [W]=WARN Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
In the Linux kernel, the following vulnerability has been resolved: bpf: Free reuseport cBPF prog after RCU grace period. Eulgyu Kim reported the splat below with a repro. [0] The repro sets up a UDP reuseport group with a cBPF prog and replaces it with a new one while another thread is sending a UDP packet to the group. The reuseport prog is freed by sk_reuseport_prog_free(). bpf_prog_put() is called for "e"BPF prog to destruct through multiple stages while cBPF prog is freed immediately by bpf_release_orig_filter() and bpf_prog_free(). If a reuseport prog is detached from the setsockopt() path (reuseport_attach_prog() or reuseport_detach_prog()), sk_reuseport_prog_free() is called without waiting for RCU readers to complete, resulting in various bugs. Let's defer freeing the reuseport cBPF prog after one RCU grace period. Note "e"BPF prog is safe as is unless the fast path starts to touch fields destroyed in bpf_prog_put_deferred() and __bpf_prog_put_noref(). [0]: BUG: KASAN: vmalloc-out-of-bounds in reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596 Read of size 4 at addr ffffc9000051e004 by task slowme/10208 CPU: 6 UID: 1000 PID: 10208 Comm: slowme Not tainted 7.0.0-geb7ac95ff75e #32 PREEMPT(full) Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596 udp4_lib_lookup2+0x3bc/0x950 net/ipv4/udp.c:495 __udp4_lib_lookup+0x768/0xe20 net/ipv4/udp.c:723 __udp4_lib_lookup_skb+0x297/0x390 net/ipv4/udp.c:752 __udp4_lib_rcv+0x1312/0x2620 net/ipv4/udp.c:2752 ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 __netif_receive_skb_one_core net/core/dev.c:6181 [inline] __netif_receive_skb net/core/dev.c:6294 [inline] process_backlog+0xaa4/0x1960 net/core/dev.c:6645 __napi_poll+0xae/0x340 net/core/dev.c:7709 napi_poll net/core/dev.c:7772 [inline] net_rx_action+0x5d7/0xf50 net/core/dev.c:7929 handle_softirqs+0x22b/0x870 kernel/softirq.c:622 do_softirq+0x76/0xd0 kernel/softirq.c:523 </IRQ> <TASK> __local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:924 [inline] __dev_queue_xmit+0x1dd7/0x3710 net/core/dev.c:4890 neigh_output include/net/neighbour.h:556 [inline] ip_finish_output2+0xca9/0x1070 net/ipv4/ip_output.c:237 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip_output+0x29f/0x450 net/ipv4/ip_output.c:438 ip_send_skb+0x45/0xc0 net/ipv4/ip_output.c:1508 udp_send_skb+0xb04/0x1510 net/ipv4/udp.c:1195 udp_sendmsg+0x1a71/0x2350 net/ipv4/udp.c:1485 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] __sys_sendto+0x554/0x680 net/socket.c:2206 __do_sys_sendto net/socket.c:2213 [inline] __se_sys_sendto net/socket.c:2209 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2209 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x160/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x415a2d Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f6bc31e41e8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f6bc31e4cdc RCX: 0000000000415a2d RDX: 0000000000000001 RSI: 00007f6bc31e421f RDI: 0000000000000003 RBP: 00007f6bc31e4240 R08: 00007f6bc31e4220 R09: 0000000000000010 R10: 0000000000000000 R11: ---truncated---
In the Linux kernel, the following vulnerability has been resolved: net/sched: taprio: fix use-after-free in advance_sched() on schedule switch In advance_sched(), when should_change_schedules() returns true, switch_schedules() is called to promote the admin schedule to oper. switch_schedules() queues the old oper schedule for RCU freeing via call_rcu(), but 'next' still points into an entry of the old oper schedule. The subsequent 'next->end_time = end_time' and rcu_assign_pointer(q->current_entry, next) are use-after-free. Fix this by selecting 'next' from the new oper schedule immediately after switch_schedules(), and using its pre-calculated end_time. setup_first_end_time() sets the first entry's end_time to base_time + interval when the schedule is installed, so the value is already correct. The deleted 'end_time = sched_base_time(admin)' assignment was also harmful independently: it would overwrite the new first entry's pre-calculated end_time with just base_time.
In the Linux kernel, the following vulnerability has been resolved: drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl() Two error handling issues exist in xe_exec_queue_create_ioctl(): 1. When xe_hw_engine_group_add_exec_queue() fails, the error path jumps to put_exec_queue which skips xe_exec_queue_kill(). If the VM is in preempt fence mode, xe_vm_add_compute_exec_queue() has already added the queue to the VM's compute exec queue list. Skipping the kill leaves the queue on that list, leading to a dangling pointer after the queue is freed. 2. When xa_alloc() fails after xe_hw_engine_group_add_exec_queue() has succeeded, the error path does not call xe_hw_engine_group_del_exec_queue() to remove the queue from the hw engine group list. The queue is then freed while still linked into the hw engine group, causing a use-after-free. Fix both by: - Changing the xe_hw_engine_group_add_exec_queue() failure path to jump to kill_exec_queue so that xe_exec_queue_kill() properly removes the queue from the VM's compute list. - Adding a del_hw_engine_group label before kill_exec_queue for the xa_alloc() failure path, which removes the queue from the hw engine group before proceeding with the rest of the cleanup. (cherry picked from commit 37c831f401746a45d510b312b0ed7a77b1e06ec8)
A flaw was found in virtio-win, specifically within the VirtIO Block (BLK) device. When the device undergoes a reset, it fails to properly manage memory, resulting in a use-after-free vulnerability. This issue could allow a local attacker to corrupt system memory, potentially leading to system instability or unexpected behavior.
In the Linux kernel, the following vulnerability has been resolved: sched/psi: fix race between file release and pressure write A potential race condition exists between pressure write and cgroup file release regarding the priv member of struct kernfs_open_file, which triggers the uaf reported in [1]. Consider the following scenario involving execution on two separate CPUs: CPU0 CPU1 ==== ==== vfs_rmdir() kernfs_iop_rmdir() cgroup_rmdir() cgroup_kn_lock_live() cgroup_destroy_locked() cgroup_addrm_files() cgroup_rm_file() kernfs_remove_by_name() kernfs_remove_by_name_ns() vfs_write() __kernfs_remove() new_sync_write() kernfs_drain() kernfs_fop_write_iter() kernfs_drain_open_files() cgroup_file_write() kernfs_release_file() pressure_write() cgroup_file_release() ctx = of->priv; kfree(ctx); of->priv = NULL; cgroup_kn_unlock() cgroup_kn_lock_live() cgroup_get(cgrp) cgroup_kn_unlock() if (ctx->psi.trigger) // here, trigger uaf for ctx, that is of->priv The cgroup_rmdir() is protected by the cgroup_mutex, it also safeguards the memory deallocation of of->priv performed within cgroup_file_release(). However, the operations involving of->priv executed within pressure_write() are not entirely covered by the protection of cgroup_mutex. Consequently, if the code in pressure_write(), specifically the section handling the ctx variable executes after cgroup_file_release() has completed, a uaf vulnerability involving of->priv is triggered. Therefore, the issue can be resolved by extending the scope of the cgroup_mutex lock within pressure_write() to encompass all code paths involving of->priv, thereby properly synchronizing the race condition occurring between cgroup_file_release() and pressure_write(). And, if an live kn lock can be successfully acquired while executing the pressure write operation, it indicates that the cgroup deletion process has not yet reached its final stage; consequently, the priv pointer within open_file cannot be NULL. Therefore, the operation to retrieve the ctx value must be moved to a point *after* the live kn lock has been successfully acquired. In another situation, specifically after entering cgroup_kn_lock_live() but before acquiring cgroup_mutex, there exists a different class of race condition: CPU0: write memory.pressure CPU1: write cgroup.pressure=0 =========================== ============================= kernfs_fop_write_iter() kernfs_get_active_of(of) pressure_write() cgroup_kn_lock_live(memory.pressure) cgroup_tryget(cgrp) kernfs_break_active_protection(kn) ... blocks on cgroup_mutex cgroup_pressure_write() cgroup_kn_lock_live(cgroup.pressure) cgroup_file_show(memory.pressure, false) kernfs_show(false) kernfs_drain_open_files() cgroup_file_release(of) kfree(ctx) of->priv = NULL cgroup_kn_unlock() ... acquires cgroup_mutex ctx = of->priv; // may now be NULL if (ctx->psi.trigger) // NULL dereference Consequently, there is a possibility that of->priv is NULL, the pressure write needs to check for this. Now that the scope of the cgroup_mutex has been expanded, the original explicit cgroup_get/put operations are no longer necessary, this is because acquiring/releasing the live kn lock inherently executes a cgroup get/put operation. [1] BUG: KASAN: slab-use-after-free in pressure_write+0xa4/0x210 kernel/cgroup/cgroup.c:4011 Call Trace: pressure_write+0xa4/0x210 kernel/cgroup/cgroup.c:4011 cgroup_file_write+0x36f/0x790 kernel/cgroup/cgroup.c:43 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: xfrm: espintcp: do not reuse an in-progress partial send espintcp keeps a single in-flight transmit in ctx->partial. Before building a new sk_msg, espintcp_sendmsg() first tries to flush that state through espintcp_push_msgs(). For blocking callers, espintcp_push_msgs() may return success even when the previous partial send is still pending. espintcp_sendmsg() would then reinitialize emsg->skmsg and reuse ctx->partial while the old transfer still owns that state. Do not rebuild the send message when ctx->partial is still in progress. If espintcp_push_msgs() returns with emsg->len still set, fail the new send instead of overwriting the live partial state. This is a memory-safety fix: reusing the live partial-send state can leave a stale offset attached to a new sk_msg and lead to an out-of- bounds read in the send path. tcp_sendmsg_locked() already handles waiting for send buffer memory, so the fix here is just to preserve espintcp's one-message-at-a-time transmit state.
In the Linux kernel, the following vulnerability has been resolved: futex: Drop CLONE_THREAD requirement for private default hash alloc Currently need_futex_hash_allocate_default() depends on strict pthread semantics, abusing CLONE_THREAD. This breaks the non-concurrency assumptions when doing the mm->futex_ref pcpu allocations, leading to bugs[0] when sharing the mm in other ways; ie: BUG: KASAN: slab-use-after-free in futex_hash_put ... where the +1 bias can end up on a percpu counter that mm->futex_ref no longer points at. Loosen the check to cover any CLONE_VM clone, except vfork(). Excluding vfork keeps the existing paths untouched (no overhead), and we can't race in the first place: either the parent is suspended and the child runs alone, or mm->futex_ref is already allocated from an earlier CLONE_VM.
In the Linux kernel, the following vulnerability has been resolved: bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars When regsafe() compares two scalar registers that both carry BPF_ADD_CONST, check_scalar_ids() maps their full compound id (aka base | BPF_ADD_CONST flag) as one idmap entry. However, it never verifies that the underlying base ids, that is, with the flag stripped are consistent with existing idmap mappings. This allows construction of two verifier states where the old state has R3 = R2 + 10 (both sharing base id A) while the current state has R3 = R4 + 10 (base id C, unrelated to R2). The idmap creates two independent entries: A->B (for R2) and A|flag->C|flag (for R3), without catching that A->C conflicts with A->B. State pruning then incorrectly succeeds. Fix this by additionally verifying base ID mapping consistency whenever BPF_ADD_CONST is set: after mapping the compound ids, also invoke check_ids() on the base IDs (flag bits stripped). This ensures that if A was already mapped to B from comparing the source register, any ADD_CONST derivative must also derive from B, not an unrelated C.
In the Linux kernel, the following vulnerability has been resolved: RDMA: During rereg_mr ensure that REREG_ACCESS is compatible If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be re-evaluated to ensure it is properly pinned as RW. Since the umem is hidden inside each driver's mr struct add a ib_umem_check_rereg() function that each driver has to call before processing IB_MR_REREG_ACCESS. mlx4 has to retain its duplicate ib_access_writable check because it implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items in place sequentially while the MR is live, so it will continue to not support this combination.
In the Linux kernel, the following vulnerability has been resolved: ice: fix double-free of tx_buf skb If ice_tso() or ice_tx_csum() fail, the error path in ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points to it and is marked as valid (ICE_TX_BUF_SKB). 'next_to_use' remains unchanged, so the potential problem will likely fix itself when the next packet is transmitted and the tx_buf gets overwritten. But if there is no next packet and the interface is brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf() will find the tx_buf and free the skb for the second time. The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error path, so that ice_unmap_and_free_tx_buf(). Move the initialization of 'first' up, to ensure it's already valid in case we hit the linearization error path. The bug was spotted by AI while I had it looking for something else. It also proposed an initial version of the patch. I reproduced the bug and tested the fix by adding code to inject failures, on a build with KASAN. I looked for similar bugs in related Intel drivers and did not find any.
In the Linux kernel, the following vulnerability has been resolved: batman-adv: fix tp_meter counter underflow during shutdown batadv_tp_sender_shutdown() unconditionally decrements the "sending" atomic counter. If multiple paths (e.g. timeout, user cancel, and normal finish) call this function, the counter can underflow to -1. Since the sender logic treats any non-zero value as "still sending", a negative value causes the sender kthread to loop indefinitely. This leads to a use-after-free when the interface is removed while the zombie thread is still active. Fix this by using atomic_xchg() to ensure the counter only transitions from 1 to 0 once. [sven: added missing change in batadv_tp_send]
In the Linux kernel, the following vulnerability has been resolved: bpf, arm64: Fix off-by-one in check_imm signed range check check_imm(bits, imm) is used in the arm64 BPF JIT to verify that a branch displacement (in arm64 instruction units) fits into the signed N-bit immediate field of a B, B.cond or CBZ/CBNZ encoding before it is handed to the encoder. The macro currently tests for (imm > 0 && imm >> bits) || (imm < 0 && ~imm >> bits) which admits values in [-2^N, 2^N) — effectively a signed (N+1)-bit range. A signed N-bit field only holds [-2^(N-1), 2^(N-1)), so the check admits one extra bit of range on each side. In particular, for check_imm19(), values in [2^18, 2^19) slip past the check but do not fit into the 19-bit signed imm19 field of B.cond. aarch64_insn_encode_immediate() then masks the raw value into the 19-bit field, setting bit 18 (the sign bit) and flipping a forward branch into a backward one. Same class of issue exists for check_imm26() and the B/BL encoding. Shift by (bits - 1) instead of bits so the actual signed N-bit range is enforced.
In the Linux kernel, the following vulnerability has been resolved: ip6_vti: set netns_immutable on the fallback device. john1988 and Noam Rathaus reported that vti6_init_net() does not set the netns_immutable flag on the per-netns fallback tunnel device (ip6_vti0). Other similar tunnel drivers (like ip6_tunnel, sit, ip6_gre, and ip_tunnel) correctly set this flag during their fallback device initialization to prevent them from being moved to another network namespace.
In the Linux kernel, the following vulnerability has been resolved: virt: sev-guest: Do not use host-controlled page order in cleanup path When issuing an extended guest request (SVM_VMGEXIT_EXT_GUEST_REQUEST), get_ext_report() allocates a buffer to retrieve a certificate blob from the host, keeping track of its size in report_req->certs_len. However, the host may return SNP_GUEST_VMM_ERR_INVALID_LEN, indicating an invalid buffer size, as well as the expected length of such buffer. get_ext_report() subsequently updates report_req->certs_len with the host-controlled value, and cleans up the buffer by computing a page order from such value. This is incorrect, as the host-provided length may not match the page order of the original allocation, potentially resulting in corruption in the page allocator. Fix this by using alloc_pages_exact() instead, and reusing @npages to compute the size passed to free_pages_exact(). For consistency, also use @npages to compute the size when allocating the pages, even though this last change has no functional effect.
In the Linux kernel, the following vulnerability has been resolved: drm/msm: Fix VM_BIND UNMAP locking Wrong argument meant that the objs involved in UNMAP ops were not always getting locked. Since _NO_SHARE objs share a common resv with the VM (which is always locked) this would only show up with non-_NO_SHARE BOs. Patchwork: https://patchwork.freedesktop.org/patch/713898/
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix ld_{abs,ind} failure path analysis in subprogs Usage of ld_{abs,ind} instructions got extended into subprogs some time ago via commit 09b28d76eac4 ("bpf: Add abnormal return checks."). These are only allowed in subprograms when the latter are BTF annotated and have scalar return types. The code generator in bpf_gen_ld_abs() has an abnormal exit path (r0=0 + exit) from legacy cBPF times. While the enforcement is on scalar return types, the verifier must also simulate the path of abnormal exit if the packet data load via ld_{abs,ind} failed. This is currently not the case. Fix it by having the verifier simulate both success and failure paths, and extend it in similar ways as we do for tail calls. The success path (r0=unknown, continue to next insn) is pushed onto stack for later validation and the r0=0 and return to the caller is done on the fall-through side.