In the Linux kernel, the following vulnerability has been resolved: rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed Registering a kprobe on __rcu_irq_enter_check_tick() can cause kernel stack overflow as shown below. This issue can be reproduced by enabling CONFIG_NO_HZ_FULL and booting the kernel with argument "nohz_full=", and then giving the following commands at the shell prompt: # cd /sys/kernel/tracing/ # echo 'p:mp1 __rcu_irq_enter_check_tick' >> kprobe_events # echo 1 > events/kprobes/enable This commit therefore adds __rcu_irq_enter_check_tick() to the kprobes blacklist using NOKPROBE_SYMBOL(). Insufficient stack space to handle exception! ESR: 0x00000000f2000004 -- BRK (AArch64) FAR: 0x0000ffffccf3e510 Task stack: [0xffff80000ad30000..0xffff80000ad38000] IRQ stack: [0xffff800008050000..0xffff800008058000] Overflow stack: [0xffff089c36f9f310..0xffff089c36fa0310] CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19 Hardware name: linux,dummy-virt (DT) pstate: 400003c5 (nZcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __rcu_irq_enter_check_tick+0x0/0x1b8 lr : ct_nmi_enter+0x11c/0x138 sp : ffff80000ad30080 x29: ffff80000ad30080 x28: ffff089c82e20000 x27: 0000000000000000 x26: 0000000000000000 x25: ffff089c02a8d100 x24: 0000000000000000 x23: 00000000400003c5 x22: 0000ffffccf3e510 x21: ffff089c36fae148 x20: ffff80000ad30120 x19: ffffa8da8fcce148 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffa8da8e44ea6c x14: ffffa8da8e44e968 x13: ffffa8da8e03136c x12: 1fffe113804d6809 x11: ffff6113804d6809 x10: 0000000000000a60 x9 : dfff800000000000 x8 : ffff089c026b404f x7 : 00009eec7fb297f7 x6 : 0000000000000001 x5 : ffff80000ad30120 x4 : dfff800000000000 x3 : ffffa8da8e3016f4 x2 : 0000000000000003 x1 : 0000000000000000 x0 : 0000000000000000 Kernel panic - not syncing: kernel stack overflow CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0xf8/0x108 show_stack+0x20/0x30 dump_stack_lvl+0x68/0x84 dump_stack+0x1c/0x38 panic+0x214/0x404 add_taint+0x0/0xf8 panic_bad_stack+0x144/0x160 handle_bad_stack+0x38/0x58 __bad_stack+0x78/0x7c __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 [...] el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 el1_interrupt+0x28/0x60 el1h_64_irq_handler+0x18/0x28 el1h_64_irq+0x64/0x68 __ftrace_set_clr_event_nolock+0x98/0x198 __ftrace_set_clr_event+0x58/0x80 system_enable_write+0x144/0x178 vfs_write+0x174/0x738 ksys_write+0xd0/0x188 __arm64_sys_write+0x4c/0x60 invoke_syscall+0x64/0x180 el0_svc_common.constprop.0+0x84/0x160 do_el0_svc+0x48/0xe8 el0_svc+0x34/0xd0 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x190/0x194 SMP: stopping secondary CPUs Kernel Offset: 0x28da86000000 from 0xffff800008000000 PHYS_OFFSET: 0xfffff76600000000 CPU features: 0x00000,01a00100,0000421b Memory Limit: none
In the Linux kernel, the following vulnerability has been resolved: afs: Fix lock recursion afs_wake_up_async_call() can incur lock recursion. The problem is that it is called from AF_RXRPC whilst holding the ->notify_lock, but it tries to take a ref on the afs_call struct in order to pass it to a work queue - but if the afs_call is already queued, we then have an extraneous ref that must be put... calling afs_put_call() may call back down into AF_RXRPC through rxrpc_kernel_shutdown_call(), however, which might try taking the ->notify_lock again. This case isn't very common, however, so defer it to a workqueue. The oops looks something like: BUG: spinlock recursion on CPU#0, krxrpcio/7001/1646 lock: 0xffff888141399b30, .magic: dead4ead, .owner: krxrpcio/7001/1646, .owner_cpu: 0 CPU: 0 UID: 0 PID: 1646 Comm: krxrpcio/7001 Not tainted 6.12.0-rc2-build3+ #4351 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 do_raw_spin_lock+0x3c/0x90 rxrpc_kernel_shutdown_call+0x83/0xb0 afs_put_call+0xd7/0x180 rxrpc_notify_socket+0xa0/0x190 rxrpc_input_split_jumbo+0x198/0x1d0 rxrpc_input_data+0x14b/0x1e0 ? rxrpc_input_call_packet+0xc2/0x1f0 rxrpc_input_call_event+0xad/0x6b0 rxrpc_input_packet_on_conn+0x1e1/0x210 rxrpc_input_packet+0x3f2/0x4d0 rxrpc_io_thread+0x243/0x410 ? __pfx_rxrpc_io_thread+0x10/0x10 kthread+0xcf/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x24/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
In the Linux kernel, the following vulnerability has been resolved: nbd: fix incomplete validation of ioctl arg We tested and found an alarm caused by nbd_ioctl arg without verification. The UBSAN warning calltrace like below: UBSAN: Undefined behaviour in fs/buffer.c:1709:35 signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long long int' CPU: 3 PID: 2523 Comm: syz-executor.0 Not tainted 4.19.90 #1 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78 show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x170/0x1dc lib/dump_stack.c:118 ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161 handle_overflow+0x188/0x1dc lib/ubsan.c:192 __ubsan_handle_sub_overflow+0x34/0x44 lib/ubsan.c:206 __block_write_full_page+0x94c/0xa20 fs/buffer.c:1709 block_write_full_page+0x1f0/0x280 fs/buffer.c:2934 blkdev_writepage+0x34/0x40 fs/block_dev.c:607 __writepage+0x68/0xe8 mm/page-writeback.c:2305 write_cache_pages+0x44c/0xc70 mm/page-writeback.c:2240 generic_writepages+0xdc/0x148 mm/page-writeback.c:2329 blkdev_writepages+0x2c/0x38 fs/block_dev.c:2114 do_writepages+0xd4/0x250 mm/page-writeback.c:2344 The reason for triggering this warning is __block_write_full_page() -> i_size_read(inode) - 1 overflow. inode->i_size is assigned in __nbd_ioctl() -> nbd_set_size() -> bytesize. We think it is necessary to limit the size of arg to prevent errors. Moreover, __nbd_ioctl() -> nbd_add_socket(), arg will be cast to int. Assuming the value of arg is 0x80000000000000001) (on a 64-bit machine), it will become 1 after the coercion, which will return unexpected results. Fix it by adding checks to prevent passing in too large numbers.
An issue was discovered in the Linux kernel before 5.8. lib/nlattr.c allows attackers to cause a denial of service (unbounded recursion) via a nested Netlink policy with a back reference.
In the Linux kernel, the following vulnerability has been resolved: powercap: arm_scmi: Remove recursion while parsing zones Powercap zones can be defined as arranged in a hierarchy of trees and when registering a zone with powercap_register_zone(), the kernel powercap subsystem expects this to happen starting from the root zones down to the leaves; on the other side, de-registration by powercap_deregister_zone() must begin from the leaf zones. Available SCMI powercap zones are retrieved dynamically from the platform at probe time and, while any defined hierarchy between the zones is described properly in the zones descriptor, the platform returns the availables zones with no particular well-defined order: as a consequence, the trees possibly composing the hierarchy of zones have to be somehow walked properly to register the retrieved zones from the root. Currently the ARM SCMI Powercap driver walks the zones using a recursive algorithm; this approach, even though correct and tested can lead to kernel stack overflow when processing a returned hierarchy of zones composed by particularly high trees. Avoid possible kernel stack overflow by substituting the recursive approach with an iterative one supported by a dynamically allocated stack-like data structure.
check_input_term in sound/usb/mixer.c in the Linux kernel through 5.2.9 mishandles recursion, leading to kernel stack exhaustion.
In the Linux kernel, the following vulnerability has been resolved: KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest() In commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") kvm_start_guest() became idle_kvm_start_guest(). The old code allocated a stack frame on the emergency stack, but didn't use the frame to store anything, and also didn't store anything in its caller's frame. idle_kvm_start_guest() on the other hand is written more like a normal C function, it creates a frame on entry, and also stores CR/LR into its callers frame (per the ABI). The problem is that there is no caller frame on the emergency stack. The emergency stack for a given CPU is allocated with: paca_ptrs[i]->emergency_sp = alloc_stack(limit, i) + THREAD_SIZE; So emergency_sp actually points to the first address above the emergency stack allocation for a given CPU, we must not store above it without first decrementing it to create a frame. This is different to the regular kernel stack, paca->kstack, which is initialised to point at an initial frame that is ready to use. idle_kvm_start_guest() stores the backchain, CR and LR all of which write outside the allocation for the emergency stack. It then creates a stack frame and saves the non-volatile registers. Unfortunately the frame it creates is not large enough to fit the non-volatiles, and so the saving of the non-volatile registers also writes outside the emergency stack allocation. The end result is that we corrupt whatever is at 0-24 bytes, and 112-248 bytes above the emergency stack allocation. In practice this has gone unnoticed because the memory immediately above the emergency stack happens to be used for other stack allocations, either another CPUs mc_emergency_sp or an IRQ stack. See the order of calls to irqstack_early_init() and emergency_stack_init(). The low addresses of another stack are the top of that stack, and so are only used if that stack is under extreme pressue, which essentially never happens in practice - and if it did there's a high likelyhood we'd crash due to that stack overflowing. Still, we shouldn't be corrupting someone else's stack, and it is purely luck that we aren't corrupting something else. To fix it we save CR/LR into the caller's frame using the existing r1 on entry, we then create a SWITCH_FRAME_SIZE frame (which has space for pt_regs) on the emergency stack with the backchain pointing to the existing stack, and then finally we switch to the new frame on the emergency stack.
In the Linux kernel, the following vulnerability has been resolved: tracing/osnoise: Fix crash in timerlat_dump_stack() We have observed kernel panics when using timerlat with stack saving, with the following dmesg output: memcpy: detected buffer overflow: 88 byte write of buffer size 0 WARNING: CPU: 2 PID: 8153 at lib/string_helpers.c:1032 __fortify_report+0x55/0xa0 CPU: 2 UID: 0 PID: 8153 Comm: timerlatu/2 Kdump: loaded Not tainted 6.15.3-200.fc42.x86_64 #1 PREEMPT(lazy) Call Trace: <TASK> ? trace_buffer_lock_reserve+0x2a/0x60 __fortify_panic+0xd/0xf __timerlat_dump_stack.cold+0xd/0xd timerlat_dump_stack.part.0+0x47/0x80 timerlat_fd_read+0x36d/0x390 vfs_read+0xe2/0x390 ? syscall_exit_to_user_mode+0x1d5/0x210 ksys_read+0x73/0xe0 do_syscall_64+0x7b/0x160 ? exc_page_fault+0x7e/0x1a0 entry_SYSCALL_64_after_hwframe+0x76/0x7e __timerlat_dump_stack() constructs the ftrace stack entry like this: struct stack_entry *entry; ... memcpy(&entry->caller, fstack->calls, size); entry->size = fstack->nr_entries; Since commit e7186af7fb26 ("tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure"), struct stack_entry marks its caller field with __counted_by(size). At the time of the memcpy, entry->size contains garbage from the ringbuffer, which under some circumstances is zero, triggering a kernel panic by buffer overflow. Populate the size field before the memcpy so that the out-of-bounds check knows the correct size. This is analogous to __ftrace_trace_stack().
In the Linux kernel, the following vulnerability has been resolved: bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener A listening socket linked to a sockmap has its sk_prot overridden. It points to one of the struct proto variants in tcp_bpf_prots. The variant depends on the socket's family and which sockmap programs are attached. A child socket cloned from a TCP listener initially inherits their sk_prot. But before cloning is finished, we restore the child's proto to the listener's original non-tcp_bpf_prots one. This happens in tcp_create_openreq_child -> tcp_bpf_clone. Today, in tcp_bpf_clone we detect if the child's proto should be restored by checking only for the TCP_BPF_BASE proto variant. This is not correct. The sk_prot of listening socket linked to a sockmap can point to to any variant in tcp_bpf_prots. If the listeners sk_prot happens to be not the TCP_BPF_BASE variant, then the child socket unintentionally is left if the inherited sk_prot by tcp_bpf_clone. This leads to issues like infinite recursion on close [1], because the child state is otherwise not set up for use with tcp_bpf_prot operations. Adjust the check in tcp_bpf_clone to detect all of tcp_bpf_prots variants. Note that it wouldn't be sufficient to check the socket state when overriding the sk_prot in tcp_bpf_update_proto in order to always use the TCP_BPF_BASE variant for listening sockets. Since commit b8b8315e39ff ("bpf, sockmap: Remove unhash handler for BPF sockmap usage") it is possible for a socket to transition to TCP_LISTEN state while already linked to a sockmap, e.g. connect() -> insert into map -> connect(AF_UNSPEC) -> listen(). [1]: https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/
In the Linux kernel, the following vulnerability has been resolved: riscv: VMAP_STACK overflow detection thread-safe commit 31da94c25aea ("riscv: add VMAP_STACK overflow detection") added support for CONFIG_VMAP_STACK. If overflow is detected, CPU switches to `shadow_stack` temporarily before switching finally to per-cpu `overflow_stack`. If two CPUs/harts are racing and end up in over flowing kernel stack, one or both will end up corrupting each other state because `shadow_stack` is not per-cpu. This patch optimizes per-cpu overflow stack switch by directly picking per-cpu `overflow_stack` and gets rid of `shadow_stack`. Following are the changes in this patch - Defines an asm macro to obtain per-cpu symbols in destination register. - In entry.S, when overflow is detected, per-cpu overflow stack is located using per-cpu asm macro. Computing per-cpu symbol requires a temporary register. x31 is saved away into CSR_SCRATCH (CSR_SCRATCH is anyways zero since we're in kernel). Please see Links for additional relevant disccussion and alternative solution. Tested by `echo EXHAUST_STACK > /sys/kernel/debug/provoke-crash/DIRECT` Kernel crash log below Insufficient stack space to handle exception!/debug/provoke-crash/DIRECT Task stack: [0xff20000010a98000..0xff20000010a9c000] Overflow stack: [0xff600001f7d98370..0xff600001f7d99370] CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) epc : __memset+0x60/0xfc ra : recursive_loop+0x48/0xc6 [lkdtm] epc : ffffffff808de0e4 ra : ffffffff0163a752 sp : ff20000010a97e80 gp : ffffffff815c0330 tp : ff600000820ea280 t0 : ff20000010a97e88 t1 : 000000000000002e t2 : 3233206874706564 s0 : ff20000010a982b0 s1 : 0000000000000012 a0 : ff20000010a97e88 a1 : 0000000000000000 a2 : 0000000000000400 a3 : ff20000010a98288 a4 : 0000000000000000 a5 : 0000000000000000 a6 : fffffffffffe43f0 a7 : 00007fffffffffff s2 : ff20000010a97e88 s3 : ffffffff01644680 s4 : ff20000010a9be90 s5 : ff600000842ba6c0 s6 : 00aaaaaac29e42b0 s7 : 00fffffff0aa3684 s8 : 00aaaaaac2978040 s9 : 0000000000000065 s10: 00ffffff8a7cad10 s11: 00ffffff8a76a4e0 t3 : ffffffff815dbaf4 t4 : ffffffff815dbaf4 t5 : ffffffff815dbab8 t6 : ff20000010a9bb48 status: 0000000200000120 badaddr: ff20000010a97e88 cause: 000000000000000f Kernel panic - not syncing: Kernel stack overflow CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80006754>] dump_backtrace+0x30/0x38 [<ffffffff808de798>] show_stack+0x40/0x4c [<ffffffff808ea2a8>] dump_stack_lvl+0x44/0x5c [<ffffffff808ea2d8>] dump_stack+0x18/0x20 [<ffffffff808dec06>] panic+0x126/0x2fe [<ffffffff800065ea>] walk_stackframe+0x0/0xf0 [<ffffffff0163a752>] recursive_loop+0x48/0xc6 [lkdtm] SMP: stopping secondary CPUs ---[ end Kernel panic - not syncing: Kernel stack overflow ]---
In the Linux kernel, the following vulnerability has been resolved: eventpoll: Fix semi-unbounded recursion Ensure that epoll instances can never form a graph deeper than EP_MAX_NESTS+1 links. Currently, ep_loop_check_proc() ensures that the graph is loop-free and does some recursion depth checks, but those recursion depth checks don't limit the depth of the resulting tree for two reasons: - They don't look upwards in the tree. - If there are multiple downwards paths of different lengths, only one of the paths is actually considered for the depth check since commit 28d82dc1c4ed ("epoll: limit paths"). Essentially, the current recursion depth check in ep_loop_check_proc() just serves to prevent it from recursing too deeply while checking for loops. A more thorough check is done in reverse_path_check() after the new graph edge has already been created; this checks, among other things, that no paths going upwards from any non-epoll file with a length of more than 5 edges exist. However, this check does not apply to non-epoll files. As a result, it is possible to recurse to a depth of at least roughly 500, tested on v6.15. (I am unsure if deeper recursion is possible; and this may have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion problem").) To fix it: 1. In ep_loop_check_proc(), note the subtree depth of each visited node, and use subtree depths for the total depth calculation even when a subtree has already been visited. 2. Add ep_get_upwards_depth_proc() for similarly determining the maximum depth of an upwards walk. 3. In ep_loop_check(), use these values to limit the total path length between epoll nodes to EP_MAX_NESTS edges.
In the Linux kernel, the following vulnerability has been resolved: crypto: hisilicon/qm - increase the memory of local variables Increase the buffer to prevent stack overflow by fuzz test. The maximum length of the qos configuration buffer is 256 bytes. Currently, the value of the 'val buffer' is only 32 bytes. The sscanf does not check the dest memory length. So the 'val buffer' may stack overflow.
In the Linux kernel, the following vulnerability has been resolved: perf: Improve missing SIGTRAP checking To catch missing SIGTRAP we employ a WARN in __perf_event_overflow(), which fires if pending_sigtrap was already set: returning to user space without consuming pending_sigtrap, and then having the event fire again would re-enter the kernel and trigger the WARN. This, however, seemed to miss the case where some events not associated with progress in the user space task can fire and the interrupt handler runs before the IRQ work meant to consume pending_sigtrap (and generate the SIGTRAP). syzbot gifted us this stack trace: | WARNING: CPU: 0 PID: 3607 at kernel/events/core.c:9313 __perf_event_overflow | Modules linked in: | CPU: 0 PID: 3607 Comm: syz-executor100 Not tainted 6.1.0-rc2-syzkaller-00073-g88619e77b33d #0 | Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 | RIP: 0010:__perf_event_overflow+0x498/0x540 kernel/events/core.c:9313 | <...> | Call Trace: | <TASK> | perf_swevent_hrtimer+0x34f/0x3c0 kernel/events/core.c:10729 | __run_hrtimer kernel/time/hrtimer.c:1685 [inline] | __hrtimer_run_queues+0x1c6/0xfb0 kernel/time/hrtimer.c:1749 | hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 | local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1096 [inline] | __sysvec_apic_timer_interrupt+0x17c/0x640 arch/x86/kernel/apic/apic.c:1113 | sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1107 | asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:649 | <...> | </TASK> In this case, syzbot produced a program with event type PERF_TYPE_SOFTWARE and config PERF_COUNT_SW_CPU_CLOCK. The hrtimer manages to fire again before the IRQ work got a chance to run, all while never having returned to user space. Improve the WARN to check for real progress in user space: approximate this by storing a 32-bit hash of the current IP into pending_sigtrap, and if an event fires while pending_sigtrap still matches the previous IP, we assume no progress (false negatives are possible given we could return to user space and trigger again on the same IP).
In the Linux kernel, the following vulnerability has been resolved: powerpc/perf: Optimize clearing the pending PMI and remove WARN_ON for PMI check in power_pmu_disable commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") added a new function "pmi_irq_pending" in hw_irq.h. This function is to check if there is a PMI marked as pending in Paca (PACA_IRQ_PMI).This is used in power_pmu_disable in a WARN_ON. The intention here is to provide a warning if there is PMI pending, but no counter is found overflown. During some of the perf runs, below warning is hit: WARNING: CPU: 36 PID: 0 at arch/powerpc/perf/core-book3s.c:1332 power_pmu_disable+0x25c/0x2c0 Modules linked in: ----- NIP [c000000000141c3c] power_pmu_disable+0x25c/0x2c0 LR [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 Call Trace: [c000000baffcfb90] [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 (unreliable) [c000000baffcfc10] [c0000000003e2f8c] perf_pmu_disable+0x4c/0x60 [c000000baffcfc30] [c0000000003e3344] group_sched_out.part.124+0x44/0x100 [c000000baffcfc80] [c0000000003e353c] __perf_event_disable+0x13c/0x240 [c000000baffcfcd0] [c0000000003dd334] event_function+0xc4/0x140 [c000000baffcfd20] [c0000000003d855c] remote_function+0x7c/0xa0 [c000000baffcfd50] [c00000000026c394] flush_smp_call_function_queue+0xd4/0x300 [c000000baffcfde0] [c000000000065b24] smp_ipi_demux_relaxed+0xa4/0x100 [c000000baffcfe20] [c0000000000cb2b0] xive_muxed_ipi_action+0x20/0x40 [c000000baffcfe40] [c000000000207c3c] __handle_irq_event_percpu+0x8c/0x250 [c000000baffcfee0] [c000000000207e2c] handle_irq_event_percpu+0x2c/0xa0 [c000000baffcff10] [c000000000210a04] handle_percpu_irq+0x84/0xc0 [c000000baffcff40] [c000000000205f14] generic_handle_irq+0x54/0x80 [c000000baffcff60] [c000000000015740] __do_irq+0x90/0x1d0 [c000000baffcff90] [c000000000016990] __do_IRQ+0xc0/0x140 [c0000009732f3940] [c000000bafceaca8] 0xc000000bafceaca8 [c0000009732f39d0] [c000000000016b78] do_IRQ+0x168/0x1c0 [c0000009732f3a00] [c0000000000090c8] hardware_interrupt_common_virt+0x218/0x220 This means that there is no PMC overflown among the active events in the PMU, but there is a PMU pending in Paca. The function "any_pmc_overflown" checks the PMCs on active events in cpuhw->n_events. Code snippet: <<>> if (any_pmc_overflown(cpuhw)) clear_pmi_irq_pending(); else WARN_ON(pmi_irq_pending()); <<>> Here the PMC overflown is not from active event. Example: When we do perf record, default cycles and instructions will be running on PMC6 and PMC5 respectively. It could happen that overflowed event is currently not active and pending PMI is for the inactive event. Debug logs from trace_printk: <<>> any_pmc_overflown: idx is 5: pmc value is 0xd9a power_pmu_disable: PMC1: 0x0, PMC2: 0x0, PMC3: 0x0, PMC4: 0x0, PMC5: 0xd9a, PMC6: 0x80002011 <<>> Here active PMC (from idx) is PMC5 , but overflown PMC is PMC6(0x80002011). When we handle PMI interrupt for such cases, if the PMC overflown is from inactive event, it will be ignored. Reference commit: commit bc09c219b2e6 ("powerpc/perf: Fix finding overflowed PMC in interrupt") Patch addresses two changes: 1) Fix 1 : Removal of warning ( WARN_ON(pmi_irq_pending()); ) We were printing warning if no PMC is found overflown among active PMU events, but PMI pending in PACA. But this could happen in cases where PMC overflown is not in active PMC. An inactive event could have caused the overflow. Hence the warning is not needed. To know pending PMI is from an inactive event, we need to loop through all PMC's which will cause more SPR reads via mfspr and increase in context switch. Also in existing function: perf_event_interrupt, already we ignore PMI's overflown when it is from an inactive PMC. 2) Fix 2: optimization in clearing pending PMI. Currently we check for any active PMC overflown before clearing PMI pending in Paca. This is causing additional SP ---truncated---
In the Linux kernel, the following vulnerability has been resolved: vsock: fix recursive ->recvmsg calls After a vsock socket has been added to a BPF sockmap, its prot->recvmsg has been replaced with vsock_bpf_recvmsg(). Thus the following recursiion could happen: vsock_bpf_recvmsg() -> __vsock_recvmsg() -> vsock_connectible_recvmsg() -> prot->recvmsg() -> vsock_bpf_recvmsg() again We need to fix it by calling the original ->recvmsg() without any BPF sockmap logic in __vsock_recvmsg().
In the Linux kernel, the following vulnerability has been resolved: block: avoid possible overflow for chunk_sectors check in blk_stack_limits() In blk_stack_limits(), we check that the t->chunk_sectors value is a multiple of the t->physical_block_size value. However, by finding the chunk_sectors value in bytes, we may overflow the unsigned int which holds chunk_sectors, so change the check to be based on sectors.
In the Linux kernel, the following vulnerability has been resolved: fbdev: omapfb: Add 'plane' value check Function dispc_ovl_setup is not intended to work with the value OMAP_DSS_WB of the enum parameter plane. The value of this parameter is initialized in dss_init_overlays and in the current state of the code it cannot take this value so it's not a real problem. For the purposes of defensive coding it wouldn't be superfluous to check the parameter value, because some functions down the call stack process this value correctly and some not. For example, in dispc_ovl_setup_global_alpha it may lead to buffer overflow. Add check for this value. Found by Linux Verification Center (linuxtesting.org) with SVACE static analysis tool.
In the Linux kernel, the following vulnerability has been resolved: LoongArch: KVM: Fix stack protector issue in send_ipi_data() Function kvm_io_bus_read() is called in function send_ipi_data(), buffer size of parameter *val should be at least 8 bytes. Since some emulation functions like loongarch_ipi_readl() and kvm_eiointc_read() will write the buffer *val with 8 bytes signed extension regardless parameter len. Otherwise there will be buffer overflow issue when CONFIG_STACKPROTECTOR is enabled. The bug report is shown as follows: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: send_ipi_data+0x194/0x1a0 [kvm] CPU: 11 UID: 107 PID: 2692 Comm: CPU 0/KVM Not tainted 6.17.0-rc1+ #102 PREEMPT(full) Stack : 9000000005901568 0000000000000000 9000000003af371c 900000013c68c000 900000013c68f850 900000013c68f858 0000000000000000 900000013c68f998 900000013c68f990 900000013c68f990 900000013c68f6c0 fffffffffffdb058 fffffffffffdb0e0 900000013c68f858 911e1d4d39cf0ec2 9000000105657a00 0000000000000001 fffffffffffffffe 0000000000000578 282049464555206e 6f73676e6f6f4c20 0000000000000001 00000000086b4000 0000000000000000 0000000000000000 0000000000000000 9000000005709968 90000000058f9000 900000013c68fa68 900000013c68fab4 90000000029279f0 900000010153f940 900000010001f360 0000000000000000 9000000003af3734 000000004390000c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000003af3734>] show_stack+0x5c/0x180 [<9000000003aed168>] dump_stack_lvl+0x6c/0x9c [<9000000003ad0ab0>] vpanic+0x108/0x2c4 [<9000000003ad0ca8>] panic+0x3c/0x40 [<9000000004eb0a1c>] __stack_chk_fail+0x14/0x18 [<ffff8000023473f8>] send_ipi_data+0x190/0x1a0 [kvm] [<ffff8000023313e4>] __kvm_io_bus_write+0xa4/0xe8 [kvm] [<ffff80000233147c>] kvm_io_bus_write+0x54/0x90 [kvm] [<ffff80000233f9f8>] kvm_emu_iocsr+0x180/0x310 [kvm] [<ffff80000233fe08>] kvm_handle_gspr+0x280/0x478 [kvm] [<ffff8000023443e8>] kvm_handle_exit+0xc0/0x130 [kvm]
In the Linux kernel, the following vulnerability has been resolved: RDMA/hfi1: Fix potential integer multiplication overflow errors When multiplying of different types, an overflow is possible even when storing the result in a larger type. This is because the conversion is done after the multiplication. So arithmetic overflow and thus in incorrect value is possible. Correct an instance of this in the inter packet delay calculation. Fix by ensuring one of the operands is u64 which will promote the other to u64 as well ensuring no overflow.
In the Linux kernel, the following vulnerability has been resolved: phy: ti: Fix missing sentinel for clk_div_table _get_table_maxdiv() tries to access "clk_div_table" array out of bound defined in phy-j721e-wiz.c. Add a sentinel entry to prevent the following global-out-of-bounds error reported by enabling KASAN. [ 9.552392] BUG: KASAN: global-out-of-bounds in _get_maxdiv+0xc0/0x148 [ 9.558948] Read of size 4 at addr ffff8000095b25a4 by task kworker/u4:1/38 [ 9.565926] [ 9.567441] CPU: 1 PID: 38 Comm: kworker/u4:1 Not tainted 5.16.0-116492-gdaadb3bd0e8d-dirty #360 [ 9.576242] Hardware name: Texas Instruments J721e EVM (DT) [ 9.581832] Workqueue: events_unbound deferred_probe_work_func [ 9.587708] Call trace: [ 9.590174] dump_backtrace+0x20c/0x218 [ 9.594038] show_stack+0x18/0x68 [ 9.597375] dump_stack_lvl+0x9c/0xd8 [ 9.601062] print_address_description.constprop.0+0x78/0x334 [ 9.606830] kasan_report+0x1f0/0x260 [ 9.610517] __asan_load4+0x9c/0xd8 [ 9.614030] _get_maxdiv+0xc0/0x148 [ 9.617540] divider_determine_rate+0x88/0x488 [ 9.622005] divider_round_rate_parent+0xc8/0x124 [ 9.626729] wiz_clk_div_round_rate+0x54/0x68 [ 9.631113] clk_core_determine_round_nolock+0x124/0x158 [ 9.636448] clk_core_round_rate_nolock+0x68/0x138 [ 9.641260] clk_core_set_rate_nolock+0x268/0x3a8 [ 9.645987] clk_set_rate+0x50/0xa8 [ 9.649499] cdns_sierra_phy_init+0x88/0x248 [ 9.653794] phy_init+0x98/0x108 [ 9.657046] cdns_pcie_enable_phy+0xa0/0x170 [ 9.661340] cdns_pcie_init_phy+0x250/0x2b0 [ 9.665546] j721e_pcie_probe+0x4b8/0x798 [ 9.669579] platform_probe+0x8c/0x108 [ 9.673350] really_probe+0x114/0x630 [ 9.677037] __driver_probe_device+0x18c/0x220 [ 9.681505] driver_probe_device+0xac/0x150 [ 9.685712] __device_attach_driver+0xec/0x170 [ 9.690178] bus_for_each_drv+0xf0/0x158 [ 9.694124] __device_attach+0x184/0x210 [ 9.698070] device_initial_probe+0x14/0x20 [ 9.702277] bus_probe_device+0xec/0x100 [ 9.706223] deferred_probe_work_func+0x124/0x180 [ 9.710951] process_one_work+0x4b0/0xbc0 [ 9.714983] worker_thread+0x74/0x5d0 [ 9.718668] kthread+0x214/0x230 [ 9.721919] ret_from_fork+0x10/0x20 [ 9.725520] [ 9.727032] The buggy address belongs to the variable: [ 9.732183] clk_div_table+0x24/0x440
In the Linux kernel, the following vulnerability has been resolved: net: xfrm: unexport __init-annotated xfrm4_protocol_init() EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Hence, modules cannot use symbols annotated __init. The access to a freed symbol may end up with kernel panic. modpost used to detect it, but it has been broken for a decade. Recently, I fixed modpost so it started to warn it again, then this showed up in linux-next builds. There are two ways to fix it: - Remove __init - Remove EXPORT_SYMBOL I chose the latter for this case because the only in-tree call-site, net/ipv4/xfrm4_policy.c is never compiled as modular. (CONFIG_XFRM is boolean)
In the Linux kernel, the following vulnerability has been resolved: btrfs: skip reserved bytes warning on unmount after log cleanup failure After the recent changes made by commit c2e39305299f01 ("btrfs: clear extent buffer uptodate when we fail to write it") and its followup fix, commit 651740a5024117 ("btrfs: check WRITE_ERR when trying to read an extent buffer"), we can now end up not cleaning up space reservations of log tree extent buffers after a transaction abort happens, as well as not cleaning up still dirty extent buffers. This happens because if writeback for a log tree extent buffer failed, then we have cleared the bit EXTENT_BUFFER_UPTODATE from the extent buffer and we have also set the bit EXTENT_BUFFER_WRITE_ERR on it. Later on, when trying to free the log tree with free_log_tree(), which iterates over the tree, we can end up getting an -EIO error when trying to read a node or a leaf, since read_extent_buffer_pages() returns -EIO if an extent buffer does not have EXTENT_BUFFER_UPTODATE set and has the EXTENT_BUFFER_WRITE_ERR bit set. Getting that -EIO means that we return immediately as we can not iterate over the entire tree. In that case we never update the reserved space for an extent buffer in the respective block group and space_info object. When this happens we get the following traces when unmounting the fs: [174957.284509] BTRFS: error (device dm-0) in cleanup_transaction:1913: errno=-5 IO failure [174957.286497] BTRFS: error (device dm-0) in free_log_tree:3420: errno=-5 IO failure [174957.399379] ------------[ cut here ]------------ [174957.402497] WARNING: CPU: 2 PID: 3206883 at fs/btrfs/block-group.c:127 btrfs_put_block_group+0x77/0xb0 [btrfs] [174957.407523] Modules linked in: btrfs overlay dm_zero (...) [174957.424917] CPU: 2 PID: 3206883 Comm: umount Tainted: G W 5.16.0-rc5-btrfs-next-109 #1 [174957.426689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [174957.428716] RIP: 0010:btrfs_put_block_group+0x77/0xb0 [btrfs] [174957.429717] Code: 21 48 8b bd (...) [174957.432867] RSP: 0018:ffffb70d41cffdd0 EFLAGS: 00010206 [174957.433632] RAX: 0000000000000001 RBX: ffff8b09c3848000 RCX: ffff8b0758edd1c8 [174957.434689] RDX: 0000000000000001 RSI: ffffffffc0b467e7 RDI: ffff8b0758edd000 [174957.436068] RBP: ffff8b0758edd000 R08: 0000000000000000 R09: 0000000000000000 [174957.437114] R10: 0000000000000246 R11: 0000000000000000 R12: ffff8b09c3848148 [174957.438140] R13: ffff8b09c3848198 R14: ffff8b0758edd188 R15: dead000000000100 [174957.439317] FS: 00007f328fb82800(0000) GS:ffff8b0a2d200000(0000) knlGS:0000000000000000 [174957.440402] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [174957.441164] CR2: 00007fff13563e98 CR3: 0000000404f4e005 CR4: 0000000000370ee0 [174957.442117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [174957.443076] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [174957.443948] Call Trace: [174957.444264] <TASK> [174957.444538] btrfs_free_block_groups+0x255/0x3c0 [btrfs] [174957.445238] close_ctree+0x301/0x357 [btrfs] [174957.445803] ? call_rcu+0x16c/0x290 [174957.446250] generic_shutdown_super+0x74/0x120 [174957.446832] kill_anon_super+0x14/0x30 [174957.447305] btrfs_kill_super+0x12/0x20 [btrfs] [174957.447890] deactivate_locked_super+0x31/0xa0 [174957.448440] cleanup_mnt+0x147/0x1c0 [174957.448888] task_work_run+0x5c/0xa0 [174957.449336] exit_to_user_mode_prepare+0x1e5/0x1f0 [174957.449934] syscall_exit_to_user_mode+0x16/0x40 [174957.450512] do_syscall_64+0x48/0xc0 [174957.450980] entry_SYSCALL_64_after_hwframe+0x44/0xae [174957.451605] RIP: 0033:0x7f328fdc4a97 [174957.452059] Code: 03 0c 00 f7 (...) [174957.454320] RSP: 002b:00007fff13564ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [174957.455262] RAX: 0000000000000000 RBX: 00007f328feea264 RCX: 00007f328fdc4a97 [174957.456131] RDX: 0000000000000000 RSI: 00000000000000 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: f2fs: remove WARN_ON in f2fs_is_valid_blkaddr Syzbot triggers two WARNs in f2fs_is_valid_blkaddr and __is_bitmap_valid. For example, in f2fs_is_valid_blkaddr, if type is DATA_GENERIC_ENHANCE or DATA_GENERIC_ENHANCE_READ, it invokes WARN_ON if blkaddr is not in the right range. The call trace is as follows: f2fs_get_node_info+0x45f/0x1070 read_node_page+0x577/0x1190 __get_node_page.part.0+0x9e/0x10e0 __get_node_page f2fs_get_node_page+0x109/0x180 do_read_inode f2fs_iget+0x2a5/0x58b0 f2fs_fill_super+0x3b39/0x7ca0 Fix these two WARNs by replacing WARN_ON with dump_stack.
In the Linux kernel, the following vulnerability has been resolved: clk: mediatek: Fix memory leaks on probe Handle the error branches to free memory where required. Addresses-Coverity-ID: 1491825 ("Resource leak")
In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Fix double list_add when enabling VMD in scalable mode When enabling VMD and IOMMU scalable mode, the following kernel panic call trace/kernel log is shown in Eagle Stream platform (Sapphire Rapids CPU) during booting: pci 0000:59:00.5: Adding to iommu group 42 ... vmd 0000:59:00.5: PCI host bridge to bus 10000:80 pci 10000:80:01.0: [8086:352a] type 01 class 0x060400 pci 10000:80:01.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit] pci 10000:80:01.0: enabling Extended Tags pci 10000:80:01.0: PME# supported from D0 D3hot D3cold pci 10000:80:01.0: DMAR: Setup RID2PASID failed pci 10000:80:01.0: Failed to add to iommu group 42: -16 pci 10000:80:03.0: [8086:352b] type 01 class 0x060400 pci 10000:80:03.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit] pci 10000:80:03.0: enabling Extended Tags pci 10000:80:03.0: PME# supported from D0 D3hot D3cold ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:29! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7 Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 01/13/2022 Workqueue: events work_for_cpu_fn RIP: 0010:__list_add_valid.cold+0x26/0x3f Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f 0b 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1 fe ff <0f> 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 98 0c d9 9e e8 8b b1 fe RSP: 0000:ff5ad434865b3a40 EFLAGS: 00010246 RAX: 0000000000000058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8 RDX: 0000000000000000 RSI: 00000000fffeffff RDI: ffffffff9fd34f20 RBP: ff4d611d8e245c00 R08: 0000000000000000 R09: ff5ad434865b3888 R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0 R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70 FS: 0000000000000000(0000) GS:ff4d611d5ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ff4d611fa1401000 CR3: 0000000aa0210001 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> intel_pasid_alloc_table+0x9c/0x1d0 dmar_insert_one_dev_info+0x423/0x540 ? device_to_iommu+0x12d/0x2f0 intel_iommu_attach_device+0x116/0x290 __iommu_attach_device+0x1a/0x90 iommu_group_add_device+0x190/0x2c0 __iommu_probe_device+0x13e/0x250 iommu_probe_device+0x24/0x150 iommu_bus_notifier+0x69/0x90 blocking_notifier_call_chain+0x5a/0x80 device_add+0x3db/0x7b0 ? arch_memremap_can_ram_remap+0x19/0x50 ? memremap+0x75/0x140 pci_device_add+0x193/0x1d0 pci_scan_single_device+0xb9/0xf0 pci_scan_slot+0x4c/0x110 pci_scan_child_bus_extend+0x3a/0x290 vmd_enable_domain.constprop.0+0x63e/0x820 vmd_probe+0x163/0x190 local_pci_probe+0x42/0x80 work_for_cpu_fn+0x13/0x20 process_one_work+0x1e2/0x3b0 worker_thread+0x1c4/0x3a0 ? rescuer_thread+0x370/0x370 kthread+0xc7/0xf0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- ... Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- The following 'lspci' output shows devices '10000:80:*' are subdevices of the VMD device 0000:59:00.5: $ lspci ... 0000:59:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 20) ... 10000:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03) 10000:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03) 10000:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03) 10000:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03) 10000:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] 10000:82:00 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: soc: brcmstb: pm-arm: Fix refcount leak and __iomem leak bugs In brcmstb_pm_probe(), there are two kinds of leak bugs: (1) we need to add of_node_put() when for_each__matching_node() breaks (2) we need to add iounmap() for each iomap in fail path
In the Linux kernel, the following vulnerability has been resolved: remoteproc: qcom_q6v5_mss: Fix some leaks in q6v5_alloc_memory_region The device_node pointer is returned by of_parse_phandle() or of_get_child_by_name() with refcount incremented. We should use of_node_put() on it when done. This function only call of_node_put(node) when of_address_to_resource succeeds, missing error cases.
In the Linux kernel, the following vulnerability has been resolved: mt76: mt7915: fix possible NULL pointer dereference in mt7915_mac_fill_rx_vector Fix possible NULL pointer dereference in mt7915_mac_fill_rx_vector routine if the chip does not support dbdc and the hw reports band_idx set to 1.
In the Linux kernel, the following vulnerability has been resolved: media: venus: hfi: add a check to handle OOB in sfr region sfr->buf_size is in shared memory and can be modified by malicious user. OOB write is possible when the size is made higher than actual sfr data buffer. Cap the size to allocated size for such cases.
In the Linux kernel, the following vulnerability has been resolved: ASoC: samsung: Fix refcount leak in aries_audio_probe of_parse_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. If extcon_find_edev_by_node() fails, it doesn't call of_node_put() Calling of_node_put() after extcon_find_edev_by_node() to fix this.
In the Linux kernel, the following vulnerability has been resolved: net: remove two BUG() from skb_checksum_help() I have a syzbot report that managed to get a crash in skb_checksum_help() If syzbot can trigger these BUG(), it makes sense to replace them with more friendly WARN_ON_ONCE() since skb_checksum_help() can instead return an error code. Note that syzbot will still crash there, until real bug is fixed.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix relocation crash due to premature return from btrfs_commit_transaction() We are seeing crashes similar to the following trace: [38.969182] WARNING: CPU: 20 PID: 2105 at fs/btrfs/relocation.c:4070 btrfs_relocate_block_group+0x2dc/0x340 [btrfs] [38.973556] CPU: 20 PID: 2105 Comm: btrfs Not tainted 5.17.0-rc4 #54 [38.974580] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [38.976539] RIP: 0010:btrfs_relocate_block_group+0x2dc/0x340 [btrfs] [38.980336] RSP: 0000:ffffb0dd42e03c20 EFLAGS: 00010206 [38.981218] RAX: ffff96cfc4ede800 RBX: ffff96cfc3ce0000 RCX: 000000000002ca14 [38.982560] RDX: 0000000000000000 RSI: 4cfd109a0bcb5d7f RDI: ffff96cfc3ce0360 [38.983619] RBP: ffff96cfc309c000 R08: 0000000000000000 R09: 0000000000000000 [38.984678] R10: ffff96cec0000001 R11: ffffe84c80000000 R12: ffff96cfc4ede800 [38.985735] R13: 0000000000000000 R14: 0000000000000000 R15: ffff96cfc3ce0360 [38.987146] FS: 00007f11c15218c0(0000) GS:ffff96d6dfb00000(0000) knlGS:0000000000000000 [38.988662] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [38.989398] CR2: 00007ffc922c8e60 CR3: 00000001147a6001 CR4: 0000000000370ee0 [38.990279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [38.991219] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [38.992528] Call Trace: [38.992854] <TASK> [38.993148] btrfs_relocate_chunk+0x27/0xe0 [btrfs] [38.993941] btrfs_balance+0x78e/0xea0 [btrfs] [38.994801] ? vsnprintf+0x33c/0x520 [38.995368] ? __kmalloc_track_caller+0x351/0x440 [38.996198] btrfs_ioctl_balance+0x2b9/0x3a0 [btrfs] [38.997084] btrfs_ioctl+0x11b0/0x2da0 [btrfs] [38.997867] ? mod_objcg_state+0xee/0x340 [38.998552] ? seq_release+0x24/0x30 [38.999184] ? proc_nr_files+0x30/0x30 [38.999654] ? call_rcu+0xc8/0x2f0 [39.000228] ? __x64_sys_ioctl+0x84/0xc0 [39.000872] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [39.001973] __x64_sys_ioctl+0x84/0xc0 [39.002566] do_syscall_64+0x3a/0x80 [39.003011] entry_SYSCALL_64_after_hwframe+0x44/0xae [39.003735] RIP: 0033:0x7f11c166959b [39.007324] RSP: 002b:00007fff2543e998 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [39.008521] RAX: ffffffffffffffda RBX: 00007f11c1521698 RCX: 00007f11c166959b [39.009833] RDX: 00007fff2543ea40 RSI: 00000000c4009420 RDI: 0000000000000003 [39.011270] RBP: 0000000000000003 R08: 0000000000000013 R09: 00007f11c16f94e0 [39.012581] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff25440df3 [39.014046] R13: 0000000000000000 R14: 00007fff2543ea40 R15: 0000000000000001 [39.015040] </TASK> [39.015418] ---[ end trace 0000000000000000 ]--- [43.131559] ------------[ cut here ]------------ [43.132234] kernel BUG at fs/btrfs/extent-tree.c:2717! [43.133031] invalid opcode: 0000 [#1] PREEMPT SMP PTI [43.133702] CPU: 1 PID: 1839 Comm: btrfs Tainted: G W 5.17.0-rc4 #54 [43.134863] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [43.136426] RIP: 0010:unpin_extent_range+0x37a/0x4f0 [btrfs] [43.139913] RSP: 0000:ffffb0dd4216bc70 EFLAGS: 00010246 [43.140629] RAX: 0000000000000000 RBX: ffff96cfc34490f8 RCX: 0000000000000001 [43.141604] RDX: 0000000080000001 RSI: 0000000051d00000 RDI: 00000000ffffffff [43.142645] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff96cfd07dca50 [43.143669] R10: ffff96cfc46e8a00 R11: fffffffffffec000 R12: 0000000041d00000 [43.144657] R13: ffff96cfc3ce0000 R14: ffffb0dd4216bd08 R15: 0000000000000000 [43.145686] FS: 00007f7657dd68c0(0000) GS:ffff96d6df640000(0000) knlGS:0000000000000000 [43.146808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [43.147584] CR2: 00007f7fe81bf5b0 CR3: 00000001093ee004 CR4: 0000000000370ee0 [43.148589] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [43.149581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put() The bnx2fc_destroy() functions are removing the interface before calling destroy_work. This results multiple WARNings from sysfs_remove_group() as the controller rport device attributes are removed too early. Replace the fcoe_port's destroy_work queue. It's not needed. The problem is easily reproducible with the following steps. Example: $ dmesg -w & $ systemctl enable --now fcoe $ fipvlan -s -c ens2f1 $ fcoeadm -d ens2f1.802 [ 583.464488] host2: libfc: Link down on port (7500a1) [ 583.472651] bnx2fc: 7500a1 - rport not created Yet!! [ 583.490468] ------------[ cut here ]------------ [ 583.538725] sysfs group 'power' not found for kobject 'rport-2:0-0' [ 583.568814] WARNING: CPU: 3 PID: 192 at fs/sysfs/group.c:279 sysfs_remove_group+0x6f/0x80 [ 583.607130] Modules linked in: dm_service_time 8021q garp mrp stp llc bnx2fc cnic uio rpcsec_gss_krb5 auth_rpcgss nfsv4 ... [ 583.942994] CPU: 3 PID: 192 Comm: kworker/3:2 Kdump: loaded Not tainted 5.14.0-39.el9.x86_64 #1 [ 583.984105] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 584.016535] Workqueue: fc_wq_2 fc_rport_final_delete [scsi_transport_fc] [ 584.050691] RIP: 0010:sysfs_remove_group+0x6f/0x80 [ 584.074725] Code: ff 5b 48 89 ef 5d 41 5c e9 ee c0 ff ff 48 89 ef e8 f6 b8 ff ff eb d1 49 8b 14 24 48 8b 33 48 c7 c7 ... [ 584.162586] RSP: 0018:ffffb567c15afdc0 EFLAGS: 00010282 [ 584.188225] RAX: 0000000000000000 RBX: ffffffff8eec4220 RCX: 0000000000000000 [ 584.221053] RDX: ffff8c1586ce84c0 RSI: ffff8c1586cd7cc0 RDI: ffff8c1586cd7cc0 [ 584.255089] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb567c15afc00 [ 584.287954] R10: ffffb567c15afbf8 R11: ffffffff8fbe7f28 R12: ffff8c1486326400 [ 584.322356] R13: ffff8c1486326480 R14: ffff8c1483a4a000 R15: 0000000000000004 [ 584.355379] FS: 0000000000000000(0000) GS:ffff8c1586cc0000(0000) knlGS:0000000000000000 [ 584.394419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 584.421123] CR2: 00007fe95a6f7840 CR3: 0000000107674002 CR4: 00000000000606e0 [ 584.454888] Call Trace: [ 584.466108] device_del+0xb2/0x3e0 [ 584.481701] device_unregister+0x13/0x60 [ 584.501306] bsg_unregister_queue+0x5b/0x80 [ 584.522029] bsg_remove_queue+0x1c/0x40 [ 584.541884] fc_rport_final_delete+0xf3/0x1d0 [scsi_transport_fc] [ 584.573823] process_one_work+0x1e3/0x3b0 [ 584.592396] worker_thread+0x50/0x3b0 [ 584.609256] ? rescuer_thread+0x370/0x370 [ 584.628877] kthread+0x149/0x170 [ 584.643673] ? set_kthread_struct+0x40/0x40 [ 584.662909] ret_from_fork+0x22/0x30 [ 584.680002] ---[ end trace 53575ecefa942ece ]---
In the Linux kernel, the following vulnerability has been resolved: x86/mce: Work around an erratum on fast string copy instructions A rare kernel panic scenario can happen when the following conditions are met due to an erratum on fast string copy instructions: 1) An uncorrected error. 2) That error must be in first cache line of a page. 3) Kernel must execute page_copy from the page immediately before that page. The fast string copy instructions ("REP; MOVS*") could consume an uncorrectable memory error in the cache line _right after_ the desired region to copy and raise an MCE. Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string copy and will avoid such spurious machine checks. However, that is less preferable due to the permanent performance impact. Considering memory poison is rare, it's desirable to keep fast string copy enabled until an MCE is seen. Intel has confirmed the following: 1. The CPU erratum of fast string copy only applies to Skylake, Cascade Lake and Cooper Lake generations. Directly return from the MCE handler: 2. Will result in complete execution of the "REP; MOVS*" with no data loss or corruption. 3. Will not result in another MCE firing on the next poisoned cache line due to "REP; MOVS*". 4. Will resume execution from a correct point in code. 5. Will result in the same instruction that triggered the MCE firing a second MCE immediately for any other software recoverable data fetch errors. 6. Is not safe without disabling the fast string copy, as the next fast string copy of the same buffer on the same CPU would result in a PANIC MCE. This should mitigate the erratum completely with the only caveat that the fast string copy is disabled on the affected hyper thread thus performance degradation. This is still better than the OS crashing on MCEs raised on an irrelevant process due to "REP; MOVS*' accesses in a kernel context, e.g., copy_page. Injected errors on 1st cache line of 8 anonymous pages of process 'proc1' and observed MCE consumption from 'proc2' with no panic (directly returned). Without the fix, the host panicked within a few minutes on a random 'proc2' process due to kernel access from copy_page. [ bp: Fix comment style + touch ups, zap an unlikely(), improve the quirk function's readability. ]
In the Linux kernel, the following vulnerability has been resolved: jffs2: fix memory leak in jffs2_do_fill_super If jffs2_iget() or d_make_root() in jffs2_do_fill_super() returns an error, we can observe the following kmemleak report: -------------------------------------------- unreferenced object 0xffff888105a65340 (size 64): comm "mount", pid 710, jiffies 4302851558 (age 58.239s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff859c45e5>] kmem_cache_alloc_trace+0x475/0x8a0 [<ffffffff86160146>] jffs2_sum_init+0x96/0x1a0 [<ffffffff86140e25>] jffs2_do_mount_fs+0x745/0x2120 [<ffffffff86149fec>] jffs2_do_fill_super+0x35c/0x810 [<ffffffff8614aae9>] jffs2_fill_super+0x2b9/0x3b0 [...] unreferenced object 0xffff8881bd7f0000 (size 65536): comm "mount", pid 710, jiffies 4302851558 (age 58.239s) hex dump (first 32 bytes): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ backtrace: [<ffffffff858579ba>] kmalloc_order+0xda/0x110 [<ffffffff85857a11>] kmalloc_order_trace+0x21/0x130 [<ffffffff859c2ed1>] __kmalloc+0x711/0x8a0 [<ffffffff86160189>] jffs2_sum_init+0xd9/0x1a0 [<ffffffff86140e25>] jffs2_do_mount_fs+0x745/0x2120 [<ffffffff86149fec>] jffs2_do_fill_super+0x35c/0x810 [<ffffffff8614aae9>] jffs2_fill_super+0x2b9/0x3b0 [...] -------------------------------------------- This is because the resources allocated in jffs2_sum_init() are not released. Call jffs2_sum_exit() to release these resources to solve the problem.
In the Linux kernel, the following vulnerability has been resolved: ath11k: mhi: use mhi_sync_power_up() If amss.bin was missing ath11k would crash during 'rmmod ath11k_pci'. The reason for that was that we were using mhi_async_power_up() which does not check any errors. But mhi_sync_power_up() on the other hand does check for errors so let's use that to fix the crash. I was not able to find a reason why an async version was used. ath11k_mhi_start() (which enables state ATH11K_MHI_POWER_ON) is called from ath11k_hif_power_up(), which can sleep. So sync version should be safe to use here. [ 145.569731] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI [ 145.569789] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 145.569843] CPU: 2 PID: 1628 Comm: rmmod Kdump: loaded Tainted: G W 5.16.0-wt-ath+ #567 [ 145.569898] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 [ 145.569956] RIP: 0010:ath11k_hal_srng_access_begin+0xb5/0x2b0 [ath11k] [ 145.570028] Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 ec 01 00 00 48 8b ab a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 ea 48 c1 ea 03 <0f> b6 14 02 48 89 e8 83 e0 07 83 c0 03 45 85 ed 75 48 38 d0 7c 08 [ 145.570089] RSP: 0018:ffffc900025d7ac0 EFLAGS: 00010246 [ 145.570144] RAX: dffffc0000000000 RBX: ffff88814fca2dd8 RCX: 1ffffffff50cb455 [ 145.570196] RDX: 0000000000000000 RSI: ffff88814fca2dd8 RDI: ffff88814fca2e80 [ 145.570252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffa8659497 [ 145.570329] R10: fffffbfff50cb292 R11: 0000000000000001 R12: ffff88814fca0000 [ 145.570410] R13: 0000000000000000 R14: ffff88814fca2798 R15: ffff88814fca2dd8 [ 145.570465] FS: 00007fa399988540(0000) GS:ffff888233e00000(0000) knlGS:0000000000000000 [ 145.570519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 145.570571] CR2: 00007fa399b51421 CR3: 0000000137898002 CR4: 00000000003706e0 [ 145.570623] Call Trace: [ 145.570675] <TASK> [ 145.570727] ? ath11k_ce_tx_process_cb+0x34b/0x860 [ath11k] [ 145.570797] ath11k_ce_tx_process_cb+0x356/0x860 [ath11k] [ 145.570864] ? tasklet_init+0x150/0x150 [ 145.570919] ? ath11k_ce_alloc_pipes+0x280/0x280 [ath11k] [ 145.570986] ? tasklet_clear_sched+0x42/0xe0 [ 145.571042] ? tasklet_kill+0xe9/0x1b0 [ 145.571095] ? tasklet_clear_sched+0xe0/0xe0 [ 145.571148] ? irq_has_action+0x120/0x120 [ 145.571202] ath11k_ce_cleanup_pipes+0x45a/0x580 [ath11k] [ 145.571270] ? ath11k_pci_stop+0x10e/0x170 [ath11k_pci] [ 145.571345] ath11k_core_stop+0x8a/0xc0 [ath11k] [ 145.571434] ath11k_core_deinit+0x9e/0x150 [ath11k] [ 145.571499] ath11k_pci_remove+0xd2/0x260 [ath11k_pci] [ 145.571553] pci_device_remove+0x9a/0x1c0 [ 145.571605] __device_release_driver+0x332/0x660 [ 145.571659] driver_detach+0x1e7/0x2c0 [ 145.571712] bus_remove_driver+0xe2/0x2d0 [ 145.571772] pci_unregister_driver+0x21/0x250 [ 145.571826] __do_sys_delete_module+0x30a/0x4b0 [ 145.571879] ? free_module+0xac0/0xac0 [ 145.571933] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370 [ 145.571986] ? syscall_enter_from_user_mode+0x1d/0x50 [ 145.572039] ? lockdep_hardirqs_on+0x79/0x100 [ 145.572097] do_syscall_64+0x3b/0x90 [ 145.572153] entry_SYSCALL_64_after_hwframe+0x44/0xae Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03003-QCAHSPSWPL_V1_V2_SILICONZ_LITE-2
In the Linux kernel, the following vulnerability has been resolved: powerpc/rtas: Keep MSR[RI] set when calling RTAS RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big endian mode (MSR[SF,LE] unset). The change in MSR is done in enter_rtas() in a relatively complex way, since the MSR value could be hardcoded. Furthermore, a panic has been reported when hitting the watchdog interrupt while running in RTAS, this leads to the following stack trace: watchdog: CPU 24 Hard LOCKUP watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago) ... Supported: No, Unreleased kernel CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000 REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default) MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020 CFAR: 000000000000011c IRQMASK: 1 GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010 GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034 GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008 GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40 GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000 NIP [000000001fb41050] 0x1fb41050 LR [000000001fb4104c] 0x1fb4104c Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Oops: Unrecoverable System Reset, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries ... Supported: No, Unreleased kernel CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000 REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default) MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020 CFAR: 000000000000011c IRQMASK: 1 GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010 GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034 GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008 GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40 GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000 NIP [000000001fb41050] 0x1fb41050 LR [000000001fb4104c] 0x1fb4104c Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 3ddec07f638c34a2 ]--- This happens because MSR[RI] is unset when entering RTAS but there is no valid reason to not set it here. RTAS is expected to be called with MSR[RI] as specified in PAPR+ section "7.2.1 Machine State": R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its own critical regions from recursion by setting the MSR[RI] bit to 0 when in the critical regions. Fixing this by reviewing the way MSR is compute before calling RTAS. Now a hardcoded value meaning real ---truncated---
In the Linux kernel, the following vulnerability has been resolved: bpf, sockmap: Fix more uncharged while msg has more_data In tcp_bpf_send_verdict(), if msg has more data after tcp_bpf_sendmsg_redir(): tcp_bpf_send_verdict() tosend = msg->sg.size //msg->sg.size = 22220 case __SK_REDIRECT: sk_msg_return() //uncharged msg->sg.size(22220) sk->sk_forward_alloc tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size=11000 goto more_data; tosend = msg->sg.size //msg->sg.size = 11000 case __SK_REDIRECT: sk_msg_return() //uncharged msg->sg.size(11000) to sk->sk_forward_alloc The msg->sg.size(11000) has been uncharged twice, to fix we can charge the remaining msg->sg.size before goto more data. This issue can cause the following info: WARNING: CPU: 0 PID: 9860 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0 Call Trace: <TASK> inet_csk_destroy_sock+0x55/0x110 __tcp_close+0x279/0x470 tcp_close+0x1f/0x60 inet_release+0x3f/0x80 __sock_release+0x3d/0xb0 sock_close+0x11/0x20 __fput+0x92/0x250 task_work_run+0x6a/0xa0 do_exit+0x33b/0xb60 do_group_exit+0x2f/0xa0 get_signal+0xb6/0x950 arch_do_signal_or_restart+0xac/0x2a0 ? vfs_write+0x237/0x290 exit_to_user_mode_prepare+0xa9/0x200 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x46/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260 Call Trace: <TASK> __sk_destruct+0x24/0x1f0 sk_psock_destroy+0x19b/0x1c0 process_one_work+0x1b3/0x3c0 worker_thread+0x30/0x350 ? process_one_work+0x3c0/0x3c0 kthread+0xe6/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 </TASK>
In the Linux kernel, the following vulnerability has been resolved: cifs: prevent bad output lengths in smb2_ioctl_query_info() When calling smb2_ioctl_query_info() with smb_query_info::flags=PASSTHRU_FSCTL and smb_query_info::output_buffer_length=0, the following would return 0x10 buffer = memdup_user(arg + sizeof(struct smb_query_info), qi.output_buffer_length); if (IS_ERR(buffer)) { kfree(vars); return PTR_ERR(buffer); } rather than a valid pointer thus making IS_ERR() check fail. This would then cause a NULL ptr deference in @buffer when accessing it later in smb2_ioctl_query_ioctl(). While at it, prevent having a @buffer smaller than 8 bytes to correctly handle SMB2_SET_INFO FileEndOfFileInformation requests when smb_query_info::flags=PASSTHRU_SET_INFO. Here is a small C reproducer which triggers a NULL ptr in @buffer when passing an invalid smb_query_info::flags #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <unistd.h> #include <fcntl.h> #include <sys/ioctl.h> #define die(s) perror(s), exit(1) #define QUERY_INFO 0xc018cf07 int main(int argc, char *argv[]) { int fd; if (argc < 2) exit(1); fd = open(argv[1], O_RDONLY); if (fd == -1) die("open"); if (ioctl(fd, QUERY_INFO, (uint32_t[]) { 0, 0, 0, 4, 0, 0}) == -1) die("ioctl"); close(fd); return 0; } mount.cifs //srv/share /mnt -o ... gcc repro.c && ./a.out /mnt/f0 [ 114.138620] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 114.139310] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 114.139775] CPU: 2 PID: 995 Comm: a.out Not tainted 5.17.0-rc8 #1 [ 114.140148] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 114.140818] RIP: 0010:smb2_ioctl_query_info+0x206/0x410 [cifs] [ 114.141221] Code: 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 c8 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 7b 28 4c 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9c 01 00 00 49 8b 3f e8 58 02 fb ff 48 8b 14 24 [ 114.142348] RSP: 0018:ffffc90000b47b00 EFLAGS: 00010256 [ 114.142692] RAX: dffffc0000000000 RBX: ffff888115503200 RCX: ffffffffa020580d [ 114.143119] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffffa043a380 [ 114.143544] RBP: ffff888115503278 R08: 0000000000000001 R09: 0000000000000003 [ 114.143983] R10: fffffbfff4087470 R11: 0000000000000001 R12: ffff888115503288 [ 114.144424] R13: 00000000ffffffea R14: ffff888115503228 R15: 0000000000000000 [ 114.144852] FS: 00007f7aeabdf740(0000) GS:ffff888151600000(0000) knlGS:0000000000000000 [ 114.145338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 114.145692] CR2: 00007f7aeacfdf5e CR3: 000000012000e000 CR4: 0000000000350ee0 [ 114.146131] Call Trace: [ 114.146291] <TASK> [ 114.146432] ? smb2_query_reparse_tag+0x890/0x890 [cifs] [ 114.146800] ? cifs_mapchar+0x460/0x460 [cifs] [ 114.147121] ? rcu_read_lock_sched_held+0x3f/0x70 [ 114.147412] ? cifs_strndup_to_utf16+0x15b/0x250 [cifs] [ 114.147775] ? dentry_path_raw+0xa6/0xf0 [ 114.148024] ? cifs_convert_path_to_utf16+0x198/0x220 [cifs] [ 114.148413] ? smb2_check_message+0x1080/0x1080 [cifs] [ 114.148766] ? rcu_read_lock_sched_held+0x3f/0x70 [ 114.149065] cifs_ioctl+0x1577/0x3320 [cifs] [ 114.149371] ? lock_downgrade+0x6f0/0x6f0 [ 114.149631] ? cifs_readdir+0x2e60/0x2e60 [cifs] [ 114.149956] ? rcu_read_lock_sched_held+0x3f/0x70 [ 114.150250] ? __rseq_handle_notify_resume+0x80b/0xbe0 [ 114.150562] ? __up_read+0x192/0x710 [ 114.150791] ? __ia32_sys_rseq+0xf0/0xf0 [ 114.151025] ? __x64_sys_openat+0x11f/0x1d0 [ 114.151296] __x64_sys_ioctl+0x127/0x190 [ 114.151549] do_syscall_64+0x3b/0x90 [ 114.151768] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 114.152079] RIP: 0033:0x7f7aead043df [ 114.152306] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: scsi: qla2xxx: Fix crash during module load unload test During purex packet handling the driver was incorrectly freeing a pre-allocated structure. Fix this by skipping that entry. System crashed with the following stack during a module unload test. Call Trace: sbitmap_init_node+0x7f/0x1e0 sbitmap_queue_init_node+0x24/0x150 blk_mq_init_bitmaps+0x3d/0xa0 blk_mq_init_tags+0x68/0x90 blk_mq_alloc_map_and_rqs+0x44/0x120 blk_mq_alloc_set_map_and_rqs+0x63/0x150 blk_mq_alloc_tag_set+0x11b/0x230 scsi_add_host_with_dma.cold+0x3f/0x245 qla2x00_probe_one+0xd5a/0x1b80 [qla2xxx] Call Trace with slub_debug and debug kernel: kasan_report_invalid_free+0x50/0x80 __kasan_slab_free+0x137/0x150 slab_free_freelist_hook+0xc6/0x190 kfree+0xe8/0x2e0 qla2x00_free_device+0x3bb/0x5d0 [qla2xxx] qla2x00_remove_one+0x668/0xcf0 [qla2xxx]
In the Linux kernel, the following vulnerability has been resolved: rtl818x: Prevent using not initialized queues Using not existing queues can panic the kernel with rtl8180/rtl8185 cards. Ignore the skb priority for those cards, they only have one tx queue. Pierre Asselin (pa@panix.com) reported the kernel crash in the Gentoo forum: https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start-25.html He also confirmed that this patch fixes the issue. In summary this happened: After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a "divide error: 0000" when connecting to an AP. Control port tx now tries to use IEEE80211_AC_VO for the priority, which wpa_supplicants starts to use in 2.10. Since only the rtl8187se part of the driver supports QoS, the priority of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185 cards. rtl8180 is then unconditionally reading out the priority and finally crashes on drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544 without this patch: idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries "ring->entries" is zero for rtl8180/rtl8185 cards, tx_ring[2] never got initialized.
In the Linux kernel, the following vulnerability has been resolved: PM / devfreq: rk3399_dmc: Disable edev on remove() Otherwise we hit an unablanced enable-count when unbinding the DFI device: [ 1279.659119] ------------[ cut here ]------------ [ 1279.659179] WARNING: CPU: 2 PID: 5638 at drivers/devfreq/devfreq-event.c:360 devfreq_event_remove_edev+0x84/0x8c ... [ 1279.659352] Hardware name: Google Kevin (DT) [ 1279.659363] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 1279.659371] pc : devfreq_event_remove_edev+0x84/0x8c [ 1279.659380] lr : devm_devfreq_event_release+0x1c/0x28 ... [ 1279.659571] Call trace: [ 1279.659582] devfreq_event_remove_edev+0x84/0x8c [ 1279.659590] devm_devfreq_event_release+0x1c/0x28 [ 1279.659602] release_nodes+0x1cc/0x244 [ 1279.659611] devres_release_all+0x44/0x60 [ 1279.659621] device_release_driver_internal+0x11c/0x1ac [ 1279.659629] device_driver_detach+0x20/0x2c [ 1279.659641] unbind_store+0x7c/0xb0 [ 1279.659650] drv_attr_store+0x2c/0x40 [ 1279.659663] sysfs_kf_write+0x44/0x58 [ 1279.659672] kernfs_fop_write_iter+0xf4/0x190 [ 1279.659684] vfs_write+0x2b0/0x2e4 [ 1279.659693] ksys_write+0x80/0xec [ 1279.659701] __arm64_sys_write+0x24/0x30 [ 1279.659714] el0_svc_common+0xf0/0x1d8 [ 1279.659724] do_el0_svc_compat+0x28/0x3c [ 1279.659738] el0_svc_compat+0x10/0x1c [ 1279.659746] el0_sync_compat_handler+0xa8/0xcc [ 1279.659758] el0_sync_compat+0x188/0x1c0 [ 1279.659768] ---[ end trace cec200e5094155b4 ]---
In the Linux kernel, the following vulnerability has been resolved: RISC-V: KVM: Teardown riscv specific bits after kvm_exit During a module removal, kvm_exit invokes arch specific disable call which disables AIA. However, we invoke aia_exit before kvm_exit resulting in the following warning. KVM kernel module can't be inserted afterwards due to inconsistent state of IRQ. [25469.031389] percpu IRQ 31 still enabled on CPU0! [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150 [25469.031804] Modules linked in: kvm(-) [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2 [25469.031905] Hardware name: riscv-virtio,qemu (DT) [25469.031928] epc : __free_percpu_irq+0xa2/0x150 [25469.031976] ra : __free_percpu_irq+0xa2/0x150 [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50 [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8 [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90 [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00 [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8 [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10 [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000 [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0 [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7 [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7 [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003 [25469.032738] [<ffffffff8007db1e>] __free_percpu_irq+0xa2/0x150 [25469.032797] [<ffffffff8007dbfc>] free_percpu_irq+0x30/0x5e [25469.032856] [<ffffffff013a57dc>] kvm_riscv_aia_exit+0x40/0x42 [kvm] [25469.033947] [<ffffffff013b4e82>] cleanup_module+0x10/0x32 [kvm] [25469.035300] [<ffffffff8009b150>] __riscv_sys_delete_module+0x18e/0x1fc [25469.035374] [<ffffffff8000c1ca>] syscall_handler+0x3a/0x46 [25469.035456] [<ffffffff809ec9a4>] do_trap_ecall_u+0x72/0x134 [25469.035536] [<ffffffff809f5e18>] handle_exception+0x148/0x156 Invoke aia_exit and other arch specific cleanup functions after kvm_exit so that disable gets a chance to be called first before exit.
In the Linux kernel, the following vulnerability has been resolved: f2fs: fix to do sanity check on block address in f2fs_do_zero_range() As Yanming reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215894 I have encountered a bug in F2FS file system in kernel v5.17. I have uploaded the system call sequence as case.c, and a fuzzed image can be found in google net disk The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can reproduce the bug by running the following commands: kernel BUG at fs/f2fs/segment.c:2291! Call Trace: f2fs_invalidate_blocks+0x193/0x2d0 f2fs_fallocate+0x2593/0x4a70 vfs_fallocate+0x2a5/0xac0 ksys_fallocate+0x35/0x70 __x64_sys_fallocate+0x8e/0xf0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae The root cause is, after image was fuzzed, block mapping info in inode will be inconsistent with SIT table, so in f2fs_fallocate(), it will cause panic when updating SIT with invalid blkaddr. Let's fix the issue by adding sanity check on block address before updating SIT table with it.
A use-after-free vulnerability was found in the Linux kernel in drivers/net/hamradio. This flaw allows a local attacker with a user privilege to cause a denial of service (DOS) when the mkiss or sixpack device is detached and reclaim resources early.
In the Linux kernel, the following vulnerability has been resolved: ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that we are in the middle of replay the fast commit journal. This was actually a mistake, since the sbi->s_mount_info is initialized from es->s_state. Arguably s_mount_state is misleadingly named, but the name is historical --- s_mount_state and s_state dates back to ext2. What should have been used is the ext4_{set,clear,test}_mount_flag() inline functions, which sets EXT4_MF_* bits in sbi->s_mount_flags. The problem with using EXT4_FC_REPLAY is that a maliciously corrupted superblock could result in EXT4_FC_REPLAY getting set in s_mount_state. This bypasses some sanity checks, and this can trigger a BUG() in ext4_es_cache_extent(). As a easy-to-backport-fix, filter out the EXT4_FC_REPLAY bit for now. We should eventually transition away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.
A use-after-free flaw was found in the Linux kernel’s Amateur Radio AX.25 protocol functionality in the way a user connects with the protocol. This flaw allows a local user to crash the system.
In the Linux kernel, the following vulnerability has been resolved: mac802154: fix missing INIT_LIST_HEAD in ieee802154_if_add() Kernel fault injection test reports null-ptr-deref as follows: BUG: kernel NULL pointer dereference, address: 0000000000000008 RIP: 0010:cfg802154_netdev_notifier_call+0x120/0x310 include/linux/list.h:114 Call Trace: <TASK> raw_notifier_call_chain+0x6d/0xa0 kernel/notifier.c:87 call_netdevice_notifiers_info+0x6e/0xc0 net/core/dev.c:1944 unregister_netdevice_many_notify+0x60d/0xcb0 net/core/dev.c:1982 unregister_netdevice_queue+0x154/0x1a0 net/core/dev.c:10879 register_netdevice+0x9a8/0xb90 net/core/dev.c:10083 ieee802154_if_add+0x6ed/0x7e0 net/mac802154/iface.c:659 ieee802154_register_hw+0x29c/0x330 net/mac802154/main.c:229 mcr20a_probe+0xaaa/0xcb1 drivers/net/ieee802154/mcr20a.c:1316 ieee802154_if_add() allocates wpan_dev as netdev's private data, but not init the list in struct wpan_dev. cfg802154_netdev_notifier_call() manage the list when device register/unregister, and may lead to null-ptr-deref. Use INIT_LIST_HEAD() on it to initialize it correctly.
In the Linux kernel, the following vulnerability has been resolved: scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe Running tests with a debug kernel shows that bnx2fc_recv_frame() is modifying the per_cpu lport stats counters in a non-mpsafe way. Just boot a debug kernel and run the bnx2fc driver with the hardware enabled. [ 1391.699147] BUG: using smp_processor_id() in preemptible [00000000] code: bnx2fc_ [ 1391.699160] caller is bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699174] CPU: 2 PID: 4355 Comm: bnx2fc_l2_threa Kdump: loaded Tainted: G B [ 1391.699180] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 1391.699183] Call Trace: [ 1391.699188] dump_stack_lvl+0x57/0x7d [ 1391.699198] check_preemption_disabled+0xc8/0xd0 [ 1391.699205] bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699215] ? do_raw_spin_trylock+0xb5/0x180 [ 1391.699221] ? bnx2fc_npiv_create_vports.isra.0+0x4e0/0x4e0 [bnx2fc] [ 1391.699229] ? bnx2fc_l2_rcv_thread+0xb7/0x3a0 [bnx2fc] [ 1391.699240] bnx2fc_l2_rcv_thread+0x1af/0x3a0 [bnx2fc] [ 1391.699250] ? bnx2fc_ulp_init+0xc0/0xc0 [bnx2fc] [ 1391.699258] kthread+0x364/0x420 [ 1391.699263] ? _raw_spin_unlock_irq+0x24/0x50 [ 1391.699268] ? set_kthread_struct+0x100/0x100 [ 1391.699273] ret_from_fork+0x22/0x30 Restore the old get_cpu/put_cpu code with some modifications to reduce the size of the critical section.
In the Linux kernel, the following vulnerability has been resolved: vsock: remove vsock from connected table when connect is interrupted by a signal vsock_connect() expects that the socket could already be in the TCP_ESTABLISHED state when the connecting task wakes up with a signal pending. If this happens the socket will be in the connected table, and it is not removed when the socket state is reset. In this situation it's common for the process to retry connect(), and if the connection is successful the socket will be added to the connected table a second time, corrupting the list. Prevent this by calling vsock_remove_connected() if a signal is received while waiting for a connection. This is harmless if the socket is not in the connected table, and if it is in the table then removing it will prevent list corruption from a double add. Note for backporting: this patch requires d5afa82c977e ("vsock: correct removal of socket from the list"), which is in all current stable trees except 4.9.y.