In the Linux kernel, the following vulnerability has been resolved: drm/mediatek: dp: drm_err => dev_err in HPD path to avoid NULL ptr The function mtk_dp_wait_hpd_asserted() may be called before the `mtk_dp->drm_dev` pointer is assigned in mtk_dp_bridge_attach(). Specifically it can be called via this callpath: - mtk_edp_wait_hpd_asserted - [panel probe] - dp_aux_ep_probe Using "drm" level prints anywhere in this callpath causes a NULL pointer dereference. Change the error message directly in mtk_dp_wait_hpd_asserted() to dev_err() to avoid this. Also change the error messages in mtk_dp_parse_capabilities(), which is called by mtk_dp_wait_hpd_asserted(). While touching these prints, also add the error code to them to make future debugging easier.
In the Linux kernel, the following vulnerability has been resolved: lib/group_cpus: fix NULL pointer dereference from group_cpus_evenly() While testing null_blk with configfs, echo 0 > poll_queues will trigger following panic: BUG: kernel NULL pointer dereference, address: 0000000000000010 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 27 UID: 0 PID: 920 Comm: bash Not tainted 6.15.0-02023-gadbdb95c8696-dirty #1238 PREEMPT(undef) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 RIP: 0010:__bitmap_or+0x48/0x70 Call Trace: <TASK> __group_cpus_evenly+0x822/0x8c0 group_cpus_evenly+0x2d9/0x490 blk_mq_map_queues+0x1e/0x110 null_map_queues+0xc9/0x170 [null_blk] blk_mq_update_queue_map+0xdb/0x160 blk_mq_update_nr_hw_queues+0x22b/0x560 nullb_update_nr_hw_queues+0x71/0xf0 [null_blk] nullb_device_poll_queues_store+0xa4/0x130 [null_blk] configfs_write_iter+0x109/0x1d0 vfs_write+0x26e/0x6f0 ksys_write+0x79/0x180 __x64_sys_write+0x1d/0x30 x64_sys_call+0x45c4/0x45f0 do_syscall_64+0xa5/0x240 entry_SYSCALL_64_after_hwframe+0x76/0x7e Root cause is that numgrps is set to 0, and ZERO_SIZE_PTR is returned from kcalloc(), and later ZERO_SIZE_PTR will be deferenced. Fix the problem by checking numgrps first in group_cpus_evenly(), and return NULL directly if numgrps is zero. [yukuai3@huawei.com: also fix the non-SMP version]
In the Linux kernel, the following vulnerability has been resolved: geneve: do not assume mac header is set in geneve_xmit_skb() We should not assume mac header is set in output path. Use skb_eth_hdr() instead of eth_hdr() to fix the issue. sysbot reported the following : WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 skb_mac_header include/linux/skbuff.h:3052 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 eth_hdr include/linux/if_ether.h:24 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit_skb drivers/net/geneve.c:898 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039 Modules linked in: CPU: 0 UID: 0 PID: 11635 Comm: syz.4.1423 Not tainted 6.12.0-syzkaller-10296-gaaf20f870da0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_mac_header include/linux/skbuff.h:3052 [inline] RIP: 0010:eth_hdr include/linux/if_ether.h:24 [inline] RIP: 0010:geneve_xmit_skb drivers/net/geneve.c:898 [inline] RIP: 0010:geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039 Code: 21 c6 02 e9 35 d4 ff ff e8 a5 48 4c fb 90 0f 0b 90 e9 fd f5 ff ff e8 97 48 4c fb 90 0f 0b 90 e9 d8 f5 ff ff e8 89 48 4c fb 90 <0f> 0b 90 e9 41 e4 ff ff e8 7b 48 4c fb 90 0f 0b 90 e9 cd e7 ff ff RSP: 0018:ffffc90003b2f870 EFLAGS: 00010283 RAX: 000000000000037a RBX: 000000000000ffff RCX: ffffc9000dc3d000 RDX: 0000000000080000 RSI: ffffffff86428417 RDI: 0000000000000003 RBP: ffffc90003b2f9f0 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000002 R12: ffff88806603c000 R13: 0000000000000000 R14: ffff8880685b2780 R15: 0000000000000e23 FS: 00007fdc2deed6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b30a1dff8 CR3: 0000000056b8c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] __dev_direct_xmit+0x58a/0x720 net/core/dev.c:4490 dev_direct_xmit include/linux/netdevice.h:3181 [inline] packet_xmit+0x1e4/0x360 net/packet/af_packet.c:285 packet_snd net/packet/af_packet.c:3146 [inline] packet_sendmsg+0x2700/0x5660 net/packet/af_packet.c:3178 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x488/0x4f0 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0xe0/0x1c0 net/socket.c:2200 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
In the Linux kernel, the following vulnerability has been resolved: btrfs: check folio mapping after unlock in relocate_one_folio() When we call btrfs_read_folio() to bring a folio uptodate, we unlock the folio. The result of that is that a different thread can modify the mapping (like remove it with invalidate) before we call folio_lock(). This results in an invalid page and we need to try again. In particular, if we are relocating concurrently with aborting a transaction, this can result in a crash like the following: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 76 PID: 1411631 Comm: kworker/u322:5 Workqueue: events_unbound btrfs_reclaim_bgs_work RIP: 0010:set_page_extent_mapped+0x20/0xb0 RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246 RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880 RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0 R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000 R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff FS: 0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __die+0x78/0xc0 ? page_fault_oops+0x2a8/0x3a0 ? __switch_to+0x133/0x530 ? wq_worker_running+0xa/0x40 ? exc_page_fault+0x63/0x130 ? asm_exc_page_fault+0x22/0x30 ? set_page_extent_mapped+0x20/0xb0 relocate_file_extent_cluster+0x1a7/0x940 relocate_data_extent+0xaf/0x120 relocate_block_group+0x20f/0x480 btrfs_relocate_block_group+0x152/0x320 btrfs_relocate_chunk+0x3d/0x120 btrfs_reclaim_bgs_work+0x2ae/0x4e0 process_scheduled_works+0x184/0x370 worker_thread+0xc6/0x3e0 ? blk_add_timer+0xb0/0xb0 kthread+0xae/0xe0 ? flush_tlb_kernel_range+0x90/0x90 ret_from_fork+0x2f/0x40 ? flush_tlb_kernel_range+0x90/0x90 ret_from_fork_asm+0x11/0x20 </TASK> This occurs because cleanup_one_transaction() calls destroy_delalloc_inodes() which calls invalidate_inode_pages2() which takes the folio_lock before setting mapping to NULL. We fail to check this, and subsequently call set_extent_mapping(), which assumes that mapping != NULL (in fact it asserts that in debug mode) Note that the "fixes" patch here is not the one that introduced the race (the very first iteration of this code from 2009) but a more recent change that made this particular crash happen in practice.
The igmp_heard_query function in net/ipv4/igmp.c in the Linux kernel before 3.2.1 allows remote attackers to cause a denial of service (divide-by-zero error and panic) via IGMP packets.
In the Linux kernel, the following vulnerability has been resolved: tunnels: do not assume mac header is set in skb_tunnel_check_pmtu() Recently added debug in commit f9aefd6b2aa3 ("net: warn if mac header was not set") caught a bug in skb_tunnel_check_pmtu(), as shown in this syzbot report [1]. In ndo_start_xmit() paths, there is really no need to use skb->mac_header, because skb->data is supposed to point at it. [1] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_mac_header_len include/linux/skbuff.h:2784 [inline] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413 Modules linked in: CPU: 1 PID: 8604 Comm: syz-executor.3 Not tainted 5.19.0-rc2-syzkaller-00443-g8720bd951b8e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_mac_header_len include/linux/skbuff.h:2784 [inline] RIP: 0010:skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413 Code: 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 84 b9 fe ff ff 4c 89 ff e8 7c 0f d7 f9 e9 ac fe ff ff e8 c2 13 8a f9 <0f> 0b e9 28 fc ff ff e8 b6 13 8a f9 48 8b 54 24 70 48 b8 00 00 00 RSP: 0018:ffffc90002e4f520 EFLAGS: 00010212 RAX: 0000000000000324 RBX: ffff88804d5fd500 RCX: ffffc90005b52000 RDX: 0000000000040000 RSI: ffffffff87f05e3e RDI: 0000000000000003 RBP: ffffc90002e4f650 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000000 R12: 000000000000ffff R13: 0000000000000000 R14: 000000000000ffcd R15: 000000000000001f FS: 00007f3babba9700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000080 CR3: 0000000075319000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> geneve_xmit_skb drivers/net/geneve.c:927 [inline] geneve_xmit+0xcf8/0x35d0 drivers/net/geneve.c:1107 __netdev_start_xmit include/linux/netdevice.h:4805 [inline] netdev_start_xmit include/linux/netdevice.h:4819 [inline] __dev_direct_xmit+0x500/0x730 net/core/dev.c:4309 dev_direct_xmit include/linux/netdevice.h:3007 [inline] packet_direct_xmit+0x1b8/0x2c0 net/packet/af_packet.c:282 packet_snd net/packet/af_packet.c:3073 [inline] packet_sendmsg+0x21f4/0x55d0 net/packet/af_packet.c:3104 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2489 ___sys_sendmsg+0xf3/0x170 net/socket.c:2543 __sys_sendmsg net/socket.c:2572 [inline] __do_sys_sendmsg net/socket.c:2581 [inline] __se_sys_sendmsg net/socket.c:2579 [inline] __x64_sys_sendmsg+0x132/0x220 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f3baaa89109 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f3babba9168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f3baab9bf60 RCX: 00007f3baaa89109 RDX: 0000000000000000 RSI: 0000000020000a00 RDI: 0000000000000003 RBP: 00007f3baaae305d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe74f2543f R14: 00007f3babba9300 R15: 0000000000022000 </TASK>
In the Linux kernel, the following vulnerability has been resolved: x86/sev: Evict cache lines during SNP memory validation An SNP cache coherency vulnerability requires a cache line eviction mitigation when validating memory after a page state change to private. The specific mitigation is to touch the first and last byte of each 4K page that is being validated. There is no need to perform the mitigation when performing a page state change to shared and rescinding validation. CPUID bit Fn8000001F_EBX[31] defines the COHERENCY_SFW_NO CPUID bit that, when set, indicates that the software mitigation for this vulnerability is not needed. Implement the mitigation and invoke it when validating memory (making it private) and the COHERENCY_SFW_NO bit is not set, indicating the SNP guest is vulnerable.
In the Linux kernel, the following vulnerability has been resolved: mm/vmalloc: combine all TLB flush operations of KASAN shadow virtual address into one operation When compiling kernel source 'make -j $(nproc)' with the up-and-running KASAN-enabled kernel on a 256-core machine, the following soft lockup is shown: watchdog: BUG: soft lockup - CPU#28 stuck for 22s! [kworker/28:1:1760] CPU: 28 PID: 1760 Comm: kworker/28:1 Kdump: loaded Not tainted 6.10.0-rc5 #95 Workqueue: events drain_vmap_area_work RIP: 0010:smp_call_function_many_cond+0x1d8/0xbb0 Code: 38 c8 7c 08 84 c9 0f 85 49 08 00 00 8b 45 08 a8 01 74 2e 48 89 f1 49 89 f7 48 c1 e9 03 41 83 e7 07 4c 01 e9 41 83 c7 03 f3 90 <0f> b6 01 41 38 c7 7c 08 84 c0 0f 85 d4 06 00 00 8b 45 08 a8 01 75 RSP: 0018:ffffc9000cb3fb60 EFLAGS: 00000202 RAX: 0000000000000011 RBX: ffff8883bc4469c0 RCX: ffffed10776e9949 RDX: 0000000000000002 RSI: ffff8883bb74ca48 RDI: ffffffff8434dc50 RBP: ffff8883bb74ca40 R08: ffff888103585dc0 R09: ffff8884533a1800 R10: 0000000000000004 R11: ffffffffffffffff R12: ffffed1077888d39 R13: dffffc0000000000 R14: ffffed1077888d38 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8883bc400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005577b5c8d158 CR3: 0000000004850000 CR4: 0000000000350ef0 Call Trace: <IRQ> ? watchdog_timer_fn+0x2cd/0x390 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x300/0x6d0 ? sched_clock_cpu+0x69/0x4e0 ? __pfx___hrtimer_run_queues+0x10/0x10 ? srso_return_thunk+0x5/0x5f ? ktime_get_update_offsets_now+0x7f/0x2a0 ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f ? hrtimer_interrupt+0x2ca/0x760 ? __sysvec_apic_timer_interrupt+0x8c/0x2b0 ? sysvec_apic_timer_interrupt+0x6a/0x90 </IRQ> <TASK> ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? smp_call_function_many_cond+0x1d8/0xbb0 ? __pfx_do_kernel_range_flush+0x10/0x10 on_each_cpu_cond_mask+0x20/0x40 flush_tlb_kernel_range+0x19b/0x250 ? srso_return_thunk+0x5/0x5f ? kasan_release_vmalloc+0xa7/0xc0 purge_vmap_node+0x357/0x820 ? __pfx_purge_vmap_node+0x10/0x10 __purge_vmap_area_lazy+0x5b8/0xa10 drain_vmap_area_work+0x21/0x30 process_one_work+0x661/0x10b0 worker_thread+0x844/0x10e0 ? srso_return_thunk+0x5/0x5f ? __kthread_parkme+0x82/0x140 ? __pfx_worker_thread+0x10/0x10 kthread+0x2a5/0x370 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Debugging Analysis: 1. The following ftrace log shows that the lockup CPU spends too much time iterating vmap_nodes and flushing TLB when purging vm_area structures. (Some info is trimmed). kworker: funcgraph_entry: | drain_vmap_area_work() { kworker: funcgraph_entry: | mutex_lock() { kworker: funcgraph_entry: 1.092 us | __cond_resched(); kworker: funcgraph_exit: 3.306 us | } ... ... kworker: funcgraph_entry: | flush_tlb_kernel_range() { ... ... kworker: funcgraph_exit: # 7533.649 us | } ... ... kworker: funcgraph_entry: 2.344 us | mutex_unlock(); kworker: funcgraph_exit: $ 23871554 us | } The drain_vmap_area_work() spends over 23 seconds. There are 2805 flush_tlb_kernel_range() calls in the ftrace log. * One is called in __purge_vmap_area_lazy(). * Others are called by purge_vmap_node->kasan_release_vmalloc. purge_vmap_node() iteratively releases kasan vmalloc allocations and flushes TLB for each vmap_area. - [Rough calculation] Each flush_tlb_kernel_range() runs about 7.5ms. -- 2804 * 7.5ms = 21.03 seconds. -- That's why a soft lock is triggered. 2. Extending the soft lockup time can work around the issue (For example, # echo ---truncated---
In the Linux kernel, the following vulnerability has been resolved: gve: add missing NULL check for gve_alloc_pending_packet() in TX DQO gve_alloc_pending_packet() can return NULL, but gve_tx_add_skb_dqo() did not check for this case before dereferencing the returned pointer. Add a missing NULL check to prevent a potential NULL pointer dereference when allocation fails. This improves robustness in low-memory scenarios.
In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode PCIe endpoints with ATS enabled and passed through to userspace (e.g., QEMU, DPDK) can hard-lock the host when their link drops, either by surprise removal or by a link fault. Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") adds pci_dev_is_disconnected() to devtlb_invalidation_with_pasid() so ATS invalidation is skipped only when the device is being safely removed, but it applies only when Intel IOMMU scalable mode is enabled. With scalable mode disabled or unsupported, a system hard-lock occurs when a PCIe endpoint's link drops because the Intel IOMMU waits indefinitely for an ATS invalidation that cannot complete. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release") adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(), which calls qi_flush_dev_iotlb() and can also hard-lock the system when a PCIe endpoint's link drops. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 intel_context_flush_no_pasid device_pasid_table_teardown pci_pasid_table_teardown pci_for_each_dma_alias intel_pasid_teardown_sm_context intel_iommu_release_device iommu_deinit_device __iommu_group_remove_device iommu_release_device iommu_bus_notifier blocking_notifier_call_chain bus_notify device_del pci_remove_bus_device pci_stop_and_remove_bus_device pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist Sometimes the endpoint loses connection without a link-down event (e.g., due to a link fault); killing the process (virsh destroy) then hard-locks the host. Call Trace: qi_submit_sync qi_flush_dev_iotlb __context_flush_dev_iotlb.part.0 domain_context_clear_one_cb pci_for_each_dma_alias device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput pci_dev_is_disconnected() only covers safe-removal paths; pci_device_is_present() tests accessibility by reading vendor/device IDs and internally calls pci_dev_is_disconnected(). On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs. Since __context_flush_dev_iotlb() is only called on {attach,release}_dev paths (not hot), add pci_device_is_present() there to skip inaccessible devices and avoid the hard-lock.
In the Linux kernel, the following vulnerability has been resolved: net: usb: pegasus: enable basic endpoint checking pegasus_probe() fills URBs with hardcoded endpoint pipes without verifying the endpoint descriptors: - usb_rcvbulkpipe(dev, 1) for RX data - usb_sndbulkpipe(dev, 2) for TX data - usb_rcvintpipe(dev, 3) for status interrupts A malformed USB device can present these endpoints with transfer types that differ from what the driver assumes. Add a pegasus_usb_ep enum for endpoint numbers, replacing magic constants throughout. Add usb_check_bulk_endpoints() and usb_check_int_endpoints() calls before any resource allocation to verify endpoint types before use, rejecting devices with mismatched descriptors at probe time, and avoid triggering assertion. Similar fix to - commit 90b7f2961798 ("net: usb: rtl8150: enable basic endpoint checking") - commit 9e7021d2aeae ("net: usb: catc: enable basic endpoint checking")
In the Linux kernel, the following vulnerability has been resolved: ceph: do not propagate page array emplacement errors as batch errors When fscrypt is enabled, move_dirty_folio_in_page_array() may fail because it needs to allocate bounce buffers to store the encrypted versions of each folio. Each folio beyond the first allocates its bounce buffer with GFP_NOWAIT. Failures are common (and expected) under this allocation mode; they should flush (not abort) the batch. However, ceph_process_folio_batch() uses the same `rc` variable for its own return code and for capturing the return codes of its routine calls; failing to reset `rc` back to 0 results in the error being propagated out to the main writeback loop, which cannot actually tolerate any errors here: once `ceph_wbc.pages` is allocated, it must be passed to ceph_submit_write() to be freed. If it survives until the next iteration (e.g. due to the goto being followed), ceph_allocate_page_array()'s BUG_ON() will oops the worker. Note that this failure mode is currently masked due to another bug (addressed next in this series) that prevents multiple encrypted folios from being selected for the same write. For now, just reset `rc` when redirtying the folio to prevent errors in move_dirty_folio_in_page_array() from propagating. Note that move_dirty_folio_in_page_array() is careful never to return errors on the first folio, so there is no need to check for that. After this change, ceph_process_folio_batch() no longer returns errors; its only remaining failure indicator is `locked_pages == 0`, which the caller already handles correctly.
In the Linux kernel, the following vulnerability has been resolved: usb: image: mdc800: kill download URB on timeout mdc800_device_read() submits download_urb and waits for completion. If the timeout fires and the device has not responded, the function returns without killing the URB, leaving it active. A subsequent read() resubmits the same URB while it is still in-flight, triggering the WARN in usb_submit_urb(): "URB submitted while active" Check the return value of wait_event_timeout() and kill the URB if it indicates timeout, ensuring the URB is complete before its status is inspected or the URB is resubmitted. Similar to - commit 372c93131998 ("USB: yurex: fix control-URB timeout handling") - commit b98d5000c505 ("media: rc: iguanair: handle timeouts")
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: add upper bound check on user inputs in signal ioctl Huge input values in amdgpu_userq_signal_ioctl can lead to a OOM and could be exploited. So check these input value against AMDGPU_USERQ_MAX_HANDLES which is big enough value for genuine use cases and could potentially avoid OOM. (cherry picked from commit be267e15f99bc97cbe202cd556717797cdcf79a5)
In the Linux kernel, the following vulnerability has been resolved: ublk: fix NULL pointer dereference in ublk_ctrl_set_size() ublk_ctrl_set_size() unconditionally dereferences ub->ub_disk via set_capacity_and_notify() without checking if it is NULL. ub->ub_disk is NULL before UBLK_CMD_START_DEV completes (it is only assigned in ublk_ctrl_start_dev()) and after UBLK_CMD_STOP_DEV runs (ublk_detach_disk() sets it to NULL). Since the UBLK_CMD_UPDATE_SIZE handler performs no state validation, a user can trigger a NULL pointer dereference by sending UPDATE_SIZE to a device that has been added but not yet started, or one that has been stopped. Fix this by checking ub->ub_disk under ub->mutex before dereferencing it, and returning -ENODEV if the disk is not available.
In the Linux kernel, the following vulnerability has been resolved: x86: shadow stacks: proper error handling for mmap lock 김영민 reports that shstk_pop_sigframe() doesn't check for errors from mmap_read_lock_killable(), which is a silly oversight, and also shows that we haven't marked those functions with "__must_check", which would have immediately caught it. So let's fix both issues.
In the Linux kernel, the following vulnerability has been resolved: media: ccs: Avoid possible division by zero Calculating maximum M for scaler configuration involves dividing by MIN_X_OUTPUT_SIZE limit register's value. Albeit the value is presumably non-zero, the driver was missing the check it in fact was. Fix this.
In the Linux kernel, the following vulnerability has been resolved: drm/amd/pm: Fix null pointer dereference issue If SMU is disabled, during RAS initialization, there will be null pointer dereference issue here.
In the Linux kernel, the following vulnerability has been resolved: bridge: guard local VLAN-0 FDB helpers against NULL vlan group When CONFIG_BRIDGE_VLAN_FILTERING is not set, br_vlan_group() and nbp_vlan_group() return NULL (br_private.h stub definitions). The BR_BOOLOPT_FDB_LOCAL_VLAN_0 toggle code is compiled unconditionally and reaches br_fdb_delete_locals_per_vlan_port() and br_fdb_insert_locals_per_vlan_port(), where the NULL vlan group pointer is dereferenced via list_for_each_entry(v, &vg->vlan_list, vlist). The observed crash is in the delete path, triggered when creating a bridge with IFLA_BR_MULTI_BOOLOPT containing BR_BOOLOPT_FDB_LOCAL_VLAN_0 via RTM_NEWLINK. The insert helper has the same bug pattern. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] KASAN NOPTI KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7] RIP: 0010:br_fdb_delete_locals_per_vlan+0x2b9/0x310 Call Trace: br_fdb_toggle_local_vlan_0+0x452/0x4c0 br_toggle_fdb_local_vlan_0+0x31/0x80 net/bridge/br.c:276 br_boolopt_toggle net/bridge/br.c:313 br_boolopt_multi_toggle net/bridge/br.c:364 br_changelink net/bridge/br_netlink.c:1542 br_dev_newlink net/bridge/br_netlink.c:1575 Add NULL checks for the vlan group pointer in both helpers, returning early when there are no VLANs to iterate. This matches the existing pattern used by other bridge FDB functions such as br_fdb_add() and br_fdb_delete().
In the Linux kernel, the following vulnerability has been resolved: erofs: fix incorrect early exits for invalid metabox-enabled images Crafted EROFS images with metadata compression enabled can trigger incorrect early returns, leading to folio reference leaks. However, this does not cause system crashes or other severe issues.
In the Linux kernel, the following vulnerability has been resolved: xprtrdma: Decrement re_receiving on the early exit paths In the event that rpcrdma_post_recvs() fails to create a work request (due to memory allocation failure, say) or otherwise exits early, we should decrement ep->re_receiving before returning. Otherwise we will hang in rpcrdma_xprt_drain() as re_receiving will never reach zero and the completion will never be triggered. On a system with high memory pressure, this can appear as the following hung task: INFO: task kworker/u385:17:8393 blocked for more than 122 seconds. Tainted: G S E 6.19.0 #3 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u385:17 state:D stack:0 pid:8393 tgid:8393 ppid:2 task_flags:0x4248060 flags:0x00080000 Workqueue: xprtiod xprt_autoclose [sunrpc] Call Trace: <TASK> __schedule+0x48b/0x18b0 ? ib_post_send_mad+0x247/0xae0 [ib_core] schedule+0x27/0xf0 schedule_timeout+0x104/0x110 __wait_for_common+0x98/0x180 ? __pfx_schedule_timeout+0x10/0x10 wait_for_completion+0x24/0x40 rpcrdma_xprt_disconnect+0x444/0x460 [rpcrdma] xprt_rdma_close+0x12/0x40 [rpcrdma] xprt_autoclose+0x5f/0x120 [sunrpc] process_one_work+0x191/0x3e0 worker_thread+0x2e3/0x420 ? __pfx_worker_thread+0x10/0x10 kthread+0x10d/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x273/0x2b0 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30
In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: Fix "scheduling while atomic" in IPsec MAC address query Fix a "scheduling while atomic" bug in mlx5e_ipsec_init_macs() by replacing mlx5_query_mac_address() with ether_addr_copy() to get the local MAC address directly from netdev->dev_addr. The issue occurs because mlx5_query_mac_address() queries the hardware which involves mlx5_cmd_exec() that can sleep, but it is called from the mlx5e_ipsec_handle_event workqueue which runs in atomic context. The MAC address is already available in netdev->dev_addr, so no need to query hardware. This avoids the sleeping call and resolves the bug. Call trace: BUG: scheduling while atomic: kworker/u112:2/69344/0x00000200 __schedule+0x7ab/0xa20 schedule+0x1c/0xb0 schedule_timeout+0x6e/0xf0 __wait_for_common+0x91/0x1b0 cmd_exec+0xa85/0xff0 [mlx5_core] mlx5_cmd_exec+0x1f/0x50 [mlx5_core] mlx5_query_nic_vport_mac_address+0x7b/0xd0 [mlx5_core] mlx5_query_mac_address+0x19/0x30 [mlx5_core] mlx5e_ipsec_init_macs+0xc1/0x720 [mlx5_core] mlx5e_ipsec_build_accel_xfrm_attrs+0x422/0x670 [mlx5_core] mlx5e_ipsec_handle_event+0x2b9/0x460 [mlx5_core] process_one_work+0x178/0x2e0 worker_thread+0x2ea/0x430
In the Linux kernel, the following vulnerability has been resolved: btrfs: don't BUG() on unexpected delayed ref type in run_one_delayed_ref() There is no need to BUG(), we can just return an error and log an error message.
In the Linux kernel, the following vulnerability has been resolved: net/mlx5: Fix deadlock between devlink lock and esw->wq esw->work_queue executes esw_functions_changed_event_handler -> esw_vfs_changed_event_handler and acquires the devlink lock. .eswitch_mode_set (acquires devlink lock in devlink_nl_pre_doit) -> mlx5_devlink_eswitch_mode_set -> mlx5_eswitch_disable_locked -> mlx5_eswitch_event_handler_unregister -> flush_workqueue deadlocks when esw_vfs_changed_event_handler executes. Fix that by no longer flushing the work to avoid the deadlock, and using a generation counter to keep track of work relevance. This avoids an old handler manipulating an esw that has undergone one or more mode changes: - the counter is incremented in mlx5_eswitch_event_handler_unregister. - the counter is read and passed to the ephemeral mlx5_host_work struct. - the work handler takes the devlink lock and bails out if the current generation is different than the one it was scheduled to operate on. - mlx5_eswitch_cleanup does the final draining before destroying the wq. No longer flushing the workqueue has the side effect of maybe no longer cancelling pending vport_change_handler work items, but that's ok since those are disabled elsewhere: - mlx5_eswitch_disable_locked disables the vport eq notifier. - mlx5_esw_vport_disable disarms the HW EQ notification and marks vport->enabled under state_lock to false to prevent pending vport handler from doing anything. - mlx5_eswitch_cleanup destroys the workqueue and makes sure all events are disabled/finished.
In the Linux kernel, the following vulnerability has been resolved: ACPI: processor: Update cpuidle driver check in __acpi_processor_start() Commit 7a8c994cbb2d ("ACPI: processor: idle: Optimize ACPI idle driver registration") moved the ACPI idle driver registration to acpi_processor_driver_init() and acpi_processor_power_init() does not register an idle driver any more. Accordingly, the cpuidle driver check in __acpi_processor_start() needs to be updated to avoid calling acpi_processor_power_init() without a cpuidle driver, in which case the registration of the cpuidle device in that function would lead to a NULL pointer dereference in __cpuidle_register_device().
In the Linux kernel, the following vulnerability has been resolved: usb: xhci: Fix memory leak in xhci_disable_slot() xhci_alloc_command() allocates a command structure and, when the second argument is true, also allocates a completion structure. Currently, the error handling path in xhci_disable_slot() only frees the command structure using kfree(), causing the completion structure to leak. Use xhci_free_command() instead of kfree(). xhci_free_command() correctly frees both the command structure and the associated completion structure. Since the command structure is allocated with zero-initialization, command->in_ctx is NULL and will not be erroneously freed by xhci_free_command(). This bug was found using an experimental static analysis tool we are developing. The tool is based on the LLVM framework and is specifically designed to detect memory management issues. It is currently under active development and not yet publicly available, but we plan to open-source it after our research is published. The bug was originally detected on v6.13-rc1 using our static analysis tool, and we have verified that the issue persists in the latest mainline kernel. We performed build testing on x86_64 with allyesconfig using GCC=11.4.0. Since triggering these error paths in xhci_disable_slot() requires specific hardware conditions or abnormal state, we were unable to construct a test case to reliably trigger these specific error paths at runtime.
In the Linux kernel, the following vulnerability has been resolved: btrfs: reject root items with drop_progress and zero drop_level [BUG] When recovering relocation at mount time, merge_reloc_root() and btrfs_drop_snapshot() both use BUG_ON(level == 0) to guard against an impossible state: a non-zero drop_progress combined with a zero drop_level in a root_item, which can be triggered: ------------[ cut here ]------------ kernel BUG at fs/btrfs/relocation.c:1545! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 1 UID: 0 PID: 283 ... Tainted: 6.18.0+ #16 PREEMPT(voluntary) Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU Ubuntu 24.04 PC v2, BIOS 1.16.3-debian-1.16.3-2 RIP: 0010:merge_reloc_root+0x1266/0x1650 fs/btrfs/relocation.c:1545 Code: ffff0000 00004589 d7e9acfa ffffe8a1 79bafebe 02000000 Call Trace: merge_reloc_roots+0x295/0x890 fs/btrfs/relocation.c:1861 btrfs_recover_relocation+0xd6e/0x11d0 fs/btrfs/relocation.c:4195 btrfs_start_pre_rw_mount+0xa4d/0x1810 fs/btrfs/disk-io.c:3130 open_ctree+0x5824/0x5fe0 fs/btrfs/disk-io.c:3640 btrfs_fill_super fs/btrfs/super.c:987 [inline] btrfs_get_tree_super fs/btrfs/super.c:1951 [inline] btrfs_get_tree_subvol fs/btrfs/super.c:2094 [inline] btrfs_get_tree+0x111c/0x2190 fs/btrfs/super.c:2128 vfs_get_tree+0x9a/0x370 fs/super.c:1758 fc_mount fs/namespace.c:1199 [inline] do_new_mount_fc fs/namespace.c:3642 [inline] do_new_mount fs/namespace.c:3718 [inline] path_mount+0x5b8/0x1ea0 fs/namespace.c:4028 do_mount fs/namespace.c:4041 [inline] __do_sys_mount fs/namespace.c:4229 [inline] __se_sys_mount fs/namespace.c:4206 [inline] __x64_sys_mount+0x282/0x320 fs/namespace.c:4206 ... RIP: 0033:0x7f969c9a8fde Code: 0f1f4000 48c7c2b0 fffffff7 d8648902 b8ffffff ffc3660f ---[ end trace 0000000000000000 ]--- The bug is reproducible on 7.0.0-rc2-next-20260310 with our dynamic metadata fuzzing tool that corrupts btrfs metadata at runtime. [CAUSE] A non-zero drop_progress.objectid means an interrupted btrfs_drop_snapshot() left a resume point on disk, and in that case drop_level must be greater than 0 because the checkpoint is only saved at internal node levels. Although this invariant is enforced when the kernel writes the root item, it is not validated when the root item is read back from disk. That allows on-disk corruption to provide an invalid state with drop_progress.objectid != 0 and drop_level == 0. When relocation recovery later processes such a root item, merge_reloc_root() reads drop_level and hits BUG_ON(level == 0). The same invalid metadata can also trigger the corresponding BUG_ON() in btrfs_drop_snapshot(). [FIX] Fix this by validating the root_item invariant in tree-checker when reading root items from disk: if drop_progress.objectid is non-zero, drop_level must also be non-zero. Reject such malformed metadata with -EUCLEAN before it reaches merge_reloc_root() or btrfs_drop_snapshot() and triggers the BUG_ON. After the fix, the same corruption is correctly rejected by tree-checker and the BUG_ON is no longer triggered.
In the Linux kernel, the following vulnerability has been resolved: mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node When CONFIG_PAGE_OWNER is enabled, freeing KASAN shadow pages during vmalloc cleanup triggers expensive stack unwinding that acquires RCU read locks. Processing a large purge_list without rescheduling can cause the task to hold CPU for extended periods (10+ seconds), leading to RCU stalls and potential OOM conditions. The issue manifests in purge_vmap_node() -> kasan_release_vmalloc_node() where iterating through hundreds or thousands of vmap_area entries and freeing their associated shadow pages causes: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P6229/1:b..l ... task:kworker/0:17 state:R running task stack:28840 pid:6229 ... kasan_release_vmalloc_node+0x1ba/0xad0 mm/vmalloc.c:2299 purge_vmap_node+0x1ba/0xad0 mm/vmalloc.c:2299 Each call to kasan_release_vmalloc() can free many pages, and with page_owner tracking, each free triggers save_stack() which performs stack unwinding under RCU read lock. Without yielding, this creates an unbounded RCU critical section. Add periodic cond_resched() calls within the loop to allow: - RCU grace periods to complete - Other tasks to run - Scheduler to preempt when needed The fix uses need_resched() for immediate response under load, with a batch count of 32 as a guaranteed upper bound to prevent worst-case stalls even under light load.
In the Linux kernel, the following vulnerability has been resolved: accel/amdxdna: Fix runtime suspend deadlock when there is pending job The runtime suspend callback drains the running job workqueue before suspending the device. If a job is still executing and calls pm_runtime_resume_and_get(), it can deadlock with the runtime suspend path. Fix this by moving pm_runtime_resume_and_get() from the job execution routine to the job submission routine, ensuring the device is resumed before the job is queued and avoiding the deadlock during runtime suspend.
In the Linux kernel, the following vulnerability has been resolved: PCI: qcom-ep: Move controller cleanups to qcom_pcie_perst_deassert() Currently, the endpoint cleanup function dw_pcie_ep_cleanup() and EPF deinit notify function pci_epc_deinit_notify() are called during the execution of qcom_pcie_perst_assert() i.e., when the host has asserted PERST#. But quickly after this step, refclk will also be disabled by the host. All of the Qcom endpoint SoCs supported as of now depend on the refclk from the host for keeping the controller operational. Due to this limitation, any access to the hardware registers in the absence of refclk will result in a whole endpoint crash. Unfortunately, most of the controller cleanups require accessing the hardware registers (like eDMA cleanup performed in dw_pcie_ep_cleanup(), powering down MHI EPF etc...). So these cleanup functions are currently causing the crash in the endpoint SoC once host asserts PERST#. One way to address this issue is by generating the refclk in the endpoint itself and not depending on the host. But that is not always possible as some of the endpoint designs do require the endpoint to consume refclk from the host (as I was told by the Qcom engineers). Thus, fix this crash by moving the controller cleanups to the start of the qcom_pcie_perst_deassert() function. qcom_pcie_perst_deassert() is called whenever the host has deasserted PERST# and it is guaranteed that the refclk would be active at this point. So at the start of this function (after enabling resources), the controller cleanup can be performed. Once finished, rest of the code execution for PERST# deassert can continue as usual.
In the Linux kernel, the following vulnerability has been resolved: media: chips-media: wave5: Fix kthread worker destruction in polling mode Fix the cleanup order in polling mode (irq < 0) to prevent kernel warnings during module removal. Cancel the hrtimer before destroying the kthread worker to ensure work queues are empty. In polling mode, the driver uses hrtimer to periodically trigger wave5_vpu_timer_callback() which queues work via kthread_queue_work(). The kthread_destroy_worker() function validates that both work queues are empty with WARN_ON(!list_empty(&worker->work_list)) and WARN_ON(!list_empty(&worker->delayed_work_list)). The original code called kthread_destroy_worker() before hrtimer_cancel(), creating a race condition where the timer could fire during worker destruction and queue new work, triggering the WARN_ON. This causes the following warning on every module unload in polling mode: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1034 at kernel/kthread.c:1430 kthread_destroy_worker+0x84/0x98 Modules linked in: wave5(-) rpmsg_ctrl rpmsg_char ... Call trace: kthread_destroy_worker+0x84/0x98 wave5_vpu_remove+0xc8/0xe0 [wave5] platform_remove+0x30/0x58 ... ---[ end trace 0000000000000000 ]---
In the Linux kernel, the following vulnerability has been resolved: net/rds: No shortcut out of RDS_CONN_ERROR RDS connections carry a state "rds_conn_path::cp_state" and transitions from one state to another and are conditional upon an expected state: "rds_conn_path_transition." There is one exception to this conditionality, which is "RDS_CONN_ERROR" that can be enforced by "rds_conn_path_drop" regardless of what state the condition is currently in. But as soon as a connection enters state "RDS_CONN_ERROR", the connection handling code expects it to go through the shutdown-path. The RDS/TCP multipath changes added a shortcut out of "RDS_CONN_ERROR" straight back to "RDS_CONN_CONNECTING" via "rds_tcp_accept_one_path" (e.g. after "rds_tcp_state_change"). A subsequent "rds_tcp_reset_callbacks" can then transition the state to "RDS_CONN_RESETTING" with a shutdown-worker queued. That'll trip up "rds_conn_init_shutdown", which was never adjusted to handle "RDS_CONN_RESETTING" and subsequently drops the connection with the dreaded "DR_INV_CONN_STATE", which leaves "RDS_SHUTDOWN_WORK_QUEUED" on forever. So we do two things here: a) Don't shortcut "RDS_CONN_ERROR", but take the longer path through the shutdown code. b) Add "RDS_CONN_RESETTING" to the expected states in "rds_conn_init_shutdown" so that we won't error out and get stuck, if we ever hit weird state transitions like this again."
In the Linux kernel, the following vulnerability has been resolved: bnxt_en: set backing store type from query type bnxt_hwrm_func_backing_store_qcaps_v2() stores resp->type from the firmware response in ctxm->type and later uses that value to index fixed backing-store metadata arrays such as ctx_arr[] and bnxt_bstore_to_trace[]. ctxm->type is fixed by the current backing-store query type and matches the array index of ctx->ctx_arr. Set ctxm->type from the current loop variable instead of depending on resp->type. Also update the loop to advance type from next_valid_type in the for statement, which keeps the control flow simpler for non-valid and unchanged entries.
In the Linux kernel, the following vulnerability has been resolved: firmware: stratix10-rsu: Fix NULL pointer dereference when RSU is disabled When the Remote System Update (RSU) isn't enabled in the First Stage Boot Loader (FSBL), the driver encounters a NULL pointer dereference when excute svc_normal_to_secure_thread() thread, resulting in a kernel panic: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Mem abort info: ... Data abort info: ... [0000000000000008] user address but active_mm is swapper Internal error: Oops: 0000000096000004 [#1] SMP Modules linked in: CPU: 0 UID: 0 PID: 79 Comm: svc_smc_hvc_thr Not tainted 6.19.0-rc8-yocto-standard+ #59 PREEMPT Hardware name: SoCFPGA Stratix 10 SoCDK (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : svc_normal_to_secure_thread+0x38c/0x990 lr : svc_normal_to_secure_thread+0x144/0x990 ... Call trace: svc_normal_to_secure_thread+0x38c/0x990 (P) kthread+0x150/0x210 ret_from_fork+0x10/0x20 Code: 97cfc113 f9400260 aa1403e1 f9400400 (f9400402) ---[ end trace 0000000000000000 ]--- The issue occurs because rsu_send_async_msg() fails when RSU is not enabled in firmware, causing the channel to be freed via stratix10_svc_free_channel(). However, the probe function continues execution and registers svc_normal_to_secure_thread(), which subsequently attempts to access the already-freed channel, triggering the NULL pointer dereference. Fix this by properly cleaning up the async client and returning early on failure, preventing the thread from being used with an invalid channel.
In the Linux kernel, the following vulnerability has been resolved: ASoC: qcom: qdsp6: Fix q6apm remove ordering during ADSP stop and start During ADSP stop and start, the kernel crashes due to the order in which ASoC components are removed. On ADSP stop, the q6apm-audio .remove callback unloads topology and removes PCM runtimes during ASoC teardown. This deletes the RTDs that contain the q6apm DAI components before their removal pass runs, leaving those components still linked to the card and causing crashes on the next rebind. Fix this by ensuring that all dependent (child) components are removed first, and the q6apm component is removed last. [ 48.105720] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0 [ 48.114763] Mem abort info: [ 48.117650] ESR = 0x0000000096000004 [ 48.121526] EC = 0x25: DABT (current EL), IL = 32 bits [ 48.127010] SET = 0, FnV = 0 [ 48.130172] EA = 0, S1PTW = 0 [ 48.133415] FSC = 0x04: level 0 translation fault [ 48.138446] Data abort info: [ 48.141422] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 48.147079] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 48.152354] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 48.157859] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001173cf000 [ 48.164517] [00000000000000d0] pgd=0000000000000000, p4d=0000000000000000 [ 48.171530] Internal error: Oops: 0000000096000004 [#1] SMP [ 48.177348] Modules linked in: q6prm_clocks q6apm_lpass_dais q6apm_dai snd_q6dsp_common q6prm snd_q6apm 8021q garp mrp stp llc snd_soc_hdmi_codec apr pdr_interface phy_qcom_edp fastrpc qcom_pd_mapper rpmsg_ctrl qrtr_smd rpmsg_char qcom_pdr_msg qcom_iris v4l2_mem2mem videobuf2_dma_contig ath11k_pci msm ubwc_config at24 ath11k videobuf2_memops mac80211 ocmem videobuf2_v4l2 libarc4 drm_gpuvm mhi qrtr videodev drm_exec snd_soc_sc8280xp gpu_sched videobuf2_common nvmem_qcom_spmi_sdam snd_soc_qcom_sdw drm_dp_aux_bus qcom_q6v5_pas qcom_spmi_temp_alarm snd_soc_qcom_common rtc_pm8xxx qcom_pon drm_display_helper cec qcom_pil_info qcom_stats soundwire_bus drm_client_lib mc dispcc0_sa8775p videocc_sa8775p qcom_q6v5 camcc_sa8775p snd_soc_dmic phy_qcom_sgmii_eth snd_soc_max98357a i2c_qcom_geni snd_soc_core dwmac_qcom_ethqos llcc_qcom icc_bwmon qcom_sysmon snd_compress qcom_refgen_regulator coresight_stm stmmac_platform snd_pcm_dmaengine qcom_common coresight_tmc stmmac coresight_replicator qcom_glink_smem coresight_cti stm_core [ 48.177444] coresight_funnel snd_pcm ufs_qcom phy_qcom_qmp_usb gpi phy_qcom_snps_femto_v2 coresight phy_qcom_qmp_ufs qcom_wdt gpucc_sa8775p pcs_xpcs mdt_loader qcom_ice icc_osm_l3 qmi_helpers snd_timer snd soundcore display_connector qcom_rng nvmem_reboot_mode drm_kms_helper phy_qcom_qmp_pcie sha256 cfg80211 rfkill socinfo fuse drm backlight ipv6 [ 48.301059] CPU: 2 UID: 0 PID: 293 Comm: kworker/u32:2 Not tainted 6.19.0-rc6-dirty #10 PREEMPT [ 48.310081] Hardware name: Qualcomm Technologies, Inc. Lemans EVK (DT) [ 48.316782] Workqueue: pdr_notifier_wq pdr_notifier_work [pdr_interface] [ 48.323672] pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 48.330825] pc : mutex_lock+0xc/0x54 [ 48.334514] lr : soc_dapm_shutdown_dapm+0x44/0x174 [snd_soc_core] [ 48.340794] sp : ffff800084ddb7b0 [ 48.344207] x29: ffff800084ddb7b0 x28: ffff00009cd9cf30 x27: ffff00009cd9cc00 [ 48.351544] x26: ffff000099610190 x25: ffffa31d2f19c810 x24: ffffa31d2f185098 [ 48.358869] x23: ffff800084ddb7f8 x22: 0000000000000000 x21: 00000000000000d0 [ 48.366198] x20: ffff00009ba6c338 x19: ffff00009ba6c338 x18: 00000000ffffffff [ 48.373528] x17: 000000040044ffff x16: ffffa31d4ae6dca8 x15: 072007740775076f [ 48.380853] x14: 0765076d07690774 x13: 00313a323a656369 x12: 767265733a637673 [ 48.388182] x11: 00000000000003f9 x10: ffffa31d4c7dea98 x9 : 0000000000000001 [ 48.395519] x8 : ffff00009a2aadc0 x7 : 0000000000000003 x6 : 0000000000000000 [ 48.402854] x5 : 0000000000000 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: tipc: fix divide-by-zero in tipc_sk_filter_connect() A user can set conn_timeout to any value via setsockopt(TIPC_CONN_TIMEOUT), including values less than 4. When a SYN is rejected with TIPC_ERR_OVERLOAD and the retry path in tipc_sk_filter_connect() executes: delay %= (tsk->conn_timeout / 4); If conn_timeout is in the range [0, 3], the integer division yields 0, and the modulo operation triggers a divide-by-zero exception, causing a kernel oops/panic. Fix this by clamping conn_timeout to a minimum of 4 at the point of use in tipc_sk_filter_connect(). Oops: divide error: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 119 Comm: poc-F144 Not tainted 7.0.0-rc2+ RIP: 0010:tipc_sk_filter_rcv (net/tipc/socket.c:2236 net/tipc/socket.c:2362) Call Trace: tipc_sk_backlog_rcv (include/linux/instrumented.h:82 include/linux/atomic/atomic-instrumented.h:32 include/net/sock.h:2357 net/tipc/socket.c:2406) __release_sock (include/net/sock.h:1185 net/core/sock.c:3213) release_sock (net/core/sock.c:3797) tipc_connect (net/tipc/socket.c:2570) __sys_connect (include/linux/file.h:62 include/linux/file.h:83 net/socket.c:2098)
In the Linux kernel, the following vulnerability has been resolved: net: correctly handle tunneled traffic on IPV6_CSUM GSO fallback NETIF_F_IPV6_CSUM only advertises support for checksum offload of packets without IPv6 extension headers. Packets with extension headers must fall back onto software checksumming. Since TSO depends on checksum offload, those must revert to GSO. The below commit introduces that fallback. It always checks network header length. For tunneled packets, the inner header length must be checked instead. Extend the check accordingly. A special case is tunneled packets without inner IP protocol. Such as RFC 6951 SCTP in UDP. Those are not standard IPv6 followed by transport header either, so also must revert to the software GSO path.
In the Linux kernel, the following vulnerability has been resolved: drm/amd/display: Add more checks for DSC / HUBP ONO guarantees [WHY] For non-zero DSC instances it's possible that the HUBP domain required to drive it for sequential ONO ASICs isn't met, potentially causing the logic to the tile to enter an undefined state leading to a system hang. [HOW] Add more checks to ensure that the HUBP domain matching the DSC instance is appropriately powered. (cherry picked from commit da63df07112e5a9857a8d2aaa04255c4206754ec)
In the Linux kernel, the following vulnerability has been resolved: IB/IPoIB: Fix legacy IPoIB due to wrong number of queues The cited commit creates child PKEY interfaces over netlink will multiple tx and rx queues, but some devices doesn't support more than 1 tx and 1 rx queues. This causes to a crash when traffic is sent over the PKEY interface due to the parent having a single queue but the child having multiple queues. This patch fixes the number of queues to 1 for legacy IPoIB at the earliest possible point in time. BUG: kernel NULL pointer dereference, address: 000000000000036b PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 4 PID: 209665 Comm: python3 Not tainted 6.1.0_for_upstream_min_debug_2022_12_12_17_02 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:kmem_cache_alloc+0xcb/0x450 Code: ce 7e 49 8b 50 08 49 83 78 10 00 4d 8b 28 0f 84 cb 02 00 00 4d 85 ed 0f 84 c2 02 00 00 41 8b 44 24 28 48 8d 4a 01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 b8 41 8b RSP: 0018:ffff88822acbbab8 EFLAGS: 00010202 RAX: 0000000000000070 RBX: ffff8881c28e3e00 RCX: 00000000064f8dae RDX: 00000000064f8dad RSI: 0000000000000a20 RDI: 0000000000030d00 RBP: 0000000000000a20 R08: ffff8882f5d30d00 R09: ffff888104032f40 R10: ffff88810fade828 R11: 736f6d6570736575 R12: ffff88810081c000 R13: 00000000000002fb R14: ffffffff817fc865 R15: 0000000000000000 FS: 00007f9324ff9700(0000) GS:ffff8882f5d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000036b CR3: 00000001125af004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_clone+0x55/0xd0 ip6_finish_output2+0x3fe/0x690 ip6_finish_output+0xfa/0x310 ip6_send_skb+0x1e/0x60 udp_v6_send_skb+0x1e5/0x420 udpv6_sendmsg+0xb3c/0xe60 ? ip_mc_finish_output+0x180/0x180 ? __switch_to_asm+0x3a/0x60 ? __switch_to_asm+0x34/0x60 sock_sendmsg+0x33/0x40 __sys_sendto+0x103/0x160 ? _copy_to_user+0x21/0x30 ? kvm_clock_get_cycles+0xd/0x10 ? ktime_get_ts64+0x49/0xe0 __x64_sys_sendto+0x25/0x30 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f9374f1ed14 Code: 42 41 f8 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 68 41 f8 ff 48 8b RSP: 002b:00007f9324ff7bd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f9324ff7cc8 RCX: 00007f9374f1ed14 RDX: 00000000000002fb RSI: 00007f93000052f0 RDI: 0000000000000030 RBP: 0000000000000000 R08: 00007f9324ff7d40 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 000000012a05f200 R14: 0000000000000001 R15: 00007f9374d57bdc </TASK>
In the Linux kernel, the following vulnerability has been resolved: erofs: fix incorrect early exits in volume label handling Crafted EROFS images containing valid volume labels can trigger incorrect early returns, leading to folio reference leaks. However, this does not cause system crashes or other severe issues.
In the Linux kernel, the following vulnerability has been resolved: ipmi: ipmb: initialise event handler read bytes IPMB doesn't use i2c reads, but the handler needs to set a value. Otherwise an i2c read will return an uninitialised value from the bus driver.
In the Linux kernel, the following vulnerability has been resolved: io_uring/zcrx: fix sgtable leak on mapping failures In an unlikely case when io_populate_area_dma() fails, which could only happen on a PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA machine, io_zcrx_map_area() will have an initialised and not freed table. It was supposed to be cleaned up in the error path, but !is_mapped prevents that.
In the Linux kernel, the following vulnerability has been resolved: drm/amd/display: Add signal type check for dcn401 get_phyd32clk_src Trying to access link enc on a dpia link will cause a crash otherwise
In the Linux kernel, the following vulnerability has been resolved: powerpc, perf: Check that current->mm is alive before getting user callchain It may happen that mm is already released, which leads to kernel panic. This adds the NULL check for current->mm, similarly to commit 20afc60f892d ("x86, perf: Check that current->mm is alive before getting user callchain"). I was getting this panic when running a profiling BPF program (profile.py from bcc-tools): [26215.051935] Kernel attempted to read user page (588) - exploit attempt? (uid: 0) [26215.051950] BUG: Kernel NULL pointer dereference on read at 0x00000588 [26215.051952] Faulting instruction address: 0xc00000000020fac0 [26215.051957] Oops: Kernel access of bad area, sig: 11 [#1] [...] [26215.052049] Call Trace: [26215.052050] [c000000061da6d30] [c00000000020fc10] perf_callchain_user_64+0x2d0/0x490 (unreliable) [26215.052054] [c000000061da6dc0] [c00000000020f92c] perf_callchain_user+0x1c/0x30 [26215.052057] [c000000061da6de0] [c0000000005ab2a0] get_perf_callchain+0x100/0x360 [26215.052063] [c000000061da6e70] [c000000000573bc8] bpf_get_stackid+0x88/0xf0 [26215.052067] [c000000061da6ea0] [c008000000042258] bpf_prog_16d4ab9ab662f669_do_perf_event+0xf8/0x274 [...] In addition, move storing the top-level stack entry to generic perf_callchain_user to make sure the top-evel entry is always captured, even if current->mm is NULL. [Maddy: fixed message to avoid checkpatch format style error]
In the Linux kernel, the following vulnerability has been resolved: USB: core: Limit the length of unkillable synchronous timeouts The usb_control_msg(), usb_bulk_msg(), and usb_interrupt_msg() APIs in usbcore allow unlimited timeout durations. And since they use uninterruptible waits, this leaves open the possibility of hanging a task for an indefinitely long time, with no way to kill it short of unplugging the target device. To prevent this sort of problem, enforce a maximum limit on the length of these unkillable timeouts. The limit chosen here, somewhat arbitrarily, is 60 seconds. On many systems (although not all) this is short enough to avoid triggering the kernel's hung-task detector. In addition, clear up the ambiguity of negative timeout values by treating them the same as 0, i.e., using the maximum allowed timeout.
In the Linux kernel, the following vulnerability has been resolved: drm/amd/display: Fix dsc eDP issue [why] Need to add function hook check before use
iDrive RemotePC before 4.0.1 on Linux allows denial of service. A remote and unauthenticated attacker can disconnect a valid user session by connecting to an ephemeral port.
In the Linux kernel, the following vulnerability has been resolved: iio: imu: adis: Fix NULL pointer dereference in adis_init The adis_init() function dereferences adis->ops to check if the individual function pointers (write, read, reset) are NULL, but does not first check if adis->ops itself is NULL. Drivers like adis16480, adis16490, adis16545 and others do not set custom ops and rely on adis_init() assigning the defaults. Since struct adis is zero-initialized by devm_iio_device_alloc(), adis->ops is NULL when adis_init() is called, causing a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 pc : adis_init+0xc0/0x118 Call trace: adis_init+0xc0/0x118 adis16480_probe+0xe0/0x670 Fix this by checking if adis->ops is NULL before dereferencing it, falling through to assign the default ops in that case.
In the Linux kernel, the following vulnerability has been resolved: minix: Add required sanity checking to minix_check_superblock() The fs/minix implementation of the minix filesystem does not currently support any other value for s_log_zone_size than 0. This is also the only value supported in util-linux; see mkfs.minix.c line 511. In addition, this patch adds some sanity checking for the other minix superblock fields, and moves the minix_blocks_needed() checks for the zmap and imap also to minix_check_super_block(). This also closes a related syzbot bug report.
In the Linux kernel, the following vulnerability has been resolved: drm/panel: Fix a possible null-pointer dereference in jdi_panel_dsi_remove() In jdi_panel_dsi_remove(), jdi is explicitly checked, indicating that it may be NULL: if (!jdi) mipi_dsi_detach(dsi); However, when jdi is NULL, the function does not return and continues by calling jdi_panel_disable(): err = jdi_panel_disable(&jdi->base); Inside jdi_panel_disable(), jdi is dereferenced unconditionally, which can lead to a NULL-pointer dereference: struct jdi_panel *jdi = to_panel_jdi(panel); backlight_disable(jdi->backlight); To prevent such a potential NULL-pointer dereference, return early from jdi_panel_dsi_remove() when jdi is NULL.