In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu/sdma4: replace BUG_ON with WARN_ON in fence emission sdma_v4_0_ring_emit_fence() contains two BUG_ON(addr & 0x3) assertions that verify fence writeback addresses are dword-aligned. These assertions can be reached from unprivileged userspace via crafted DRM_IOCTL_AMDGPU_CS submissions, causing a fatal kernel panic in a scheduler worker thread. Replace both BUG_ON() calls with WARN_ON() to log the condition without crashing the kernel. A misaligned fence address at this point indicates a driver bug, but crashing the kernel is never the correct response when the assertion is reachable from userspace. The CS IOCTL path is the correct place to filter invalid submissions; the ring emission callback is too late to do anything about it. (cherry picked from commit b90250bd933afd1ba94d86d6b13821997b22b18e)
A NULL pointer dereference issue was discovered in the Linux kernel in io_files_update_with_index_alloc. A local user could use this flaw to potentially crash the system causing a denial of service.
In the Linux kernel, the following vulnerability has been resolved: cpufreq: amd-pstate-ut: Fix kernel panic when loading the driver After loading the amd-pstate-ut driver, amd_pstate_ut_check_perf() and amd_pstate_ut_check_freq() use cpufreq_cpu_get() to get the policy of the CPU and mark it as busy. In these functions, cpufreq_cpu_put() should be used to release the policy, but it is not, so any other entity trying to access the policy is blocked indefinitely. One such scenario is when amd_pstate mode is changed, leading to the following splat: [ 1332.103727] INFO: task bash:2929 blocked for more than 120 seconds. [ 1332.110001] Not tainted 6.5.0-rc2-amd-pstate-ut #5 [ 1332.115315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1332.123140] task:bash state:D stack:0 pid:2929 ppid:2873 flags:0x00004006 [ 1332.123143] Call Trace: [ 1332.123145] <TASK> [ 1332.123148] __schedule+0x3c1/0x16a0 [ 1332.123154] ? _raw_read_lock_irqsave+0x2d/0x70 [ 1332.123157] schedule+0x6f/0x110 [ 1332.123160] schedule_timeout+0x14f/0x160 [ 1332.123162] ? preempt_count_add+0x86/0xd0 [ 1332.123165] __wait_for_common+0x92/0x190 [ 1332.123168] ? __pfx_schedule_timeout+0x10/0x10 [ 1332.123170] wait_for_completion+0x28/0x30 [ 1332.123173] cpufreq_policy_put_kobj+0x4d/0x90 [ 1332.123177] cpufreq_policy_free+0x157/0x1d0 [ 1332.123178] ? preempt_count_add+0x58/0xd0 [ 1332.123180] cpufreq_remove_dev+0xb6/0x100 [ 1332.123182] subsys_interface_unregister+0x114/0x120 [ 1332.123185] ? preempt_count_add+0x58/0xd0 [ 1332.123187] ? __pfx_amd_pstate_change_driver_mode+0x10/0x10 [ 1332.123190] cpufreq_unregister_driver+0x3b/0xd0 [ 1332.123192] amd_pstate_change_driver_mode+0x1e/0x50 [ 1332.123194] store_status+0xe9/0x180 [ 1332.123197] dev_attr_store+0x1b/0x30 [ 1332.123199] sysfs_kf_write+0x42/0x50 [ 1332.123202] kernfs_fop_write_iter+0x143/0x1d0 [ 1332.123204] vfs_write+0x2df/0x400 [ 1332.123208] ksys_write+0x6b/0xf0 [ 1332.123210] __x64_sys_write+0x1d/0x30 [ 1332.123213] do_syscall_64+0x60/0x90 [ 1332.123216] ? fpregs_assert_state_consistent+0x2e/0x50 [ 1332.123219] ? exit_to_user_mode_prepare+0x49/0x1a0 [ 1332.123223] ? irqentry_exit_to_user_mode+0xd/0x20 [ 1332.123225] ? irqentry_exit+0x3f/0x50 [ 1332.123226] ? exc_page_fault+0x8e/0x190 [ 1332.123228] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 1332.123232] RIP: 0033:0x7fa74c514a37 [ 1332.123234] RSP: 002b:00007ffe31dd0788 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 1332.123238] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007fa74c514a37 [ 1332.123239] RDX: 0000000000000008 RSI: 000055e27c447aa0 RDI: 0000000000000001 [ 1332.123241] RBP: 000055e27c447aa0 R08: 00007fa74c5d1460 R09: 000000007fffffff [ 1332.123242] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000008 [ 1332.123244] R13: 00007fa74c61a780 R14: 00007fa74c616600 R15: 00007fa74c615a00 [ 1332.123247] </TASK> Fix this by calling cpufreq_cpu_put() wherever necessary. [ rjw: Subject and changelog edits ]
In the Linux kernel, the following vulnerability has been resolved: hwmon: (coretemp) Simplify platform device handling Coretemp's platform driver is unconventional. All the real work is done globally by the initcall and CPU hotplug notifiers, while the "driver" effectively just wraps an allocation and the registration of the hwmon interface in a long-winded round-trip through the driver core. The whole logic of dynamically creating and destroying platform devices to bring the interfaces up and down is error prone, since it assumes platform_device_add() will synchronously bind the driver and set drvdata before it returns, thus results in a NULL dereference if drivers_autoprobe is turned off for the platform bus. Furthermore, the unusual approach of doing that from within a CPU hotplug notifier, already commented in the code that it deadlocks suspend, also causes lockdep issues for other drivers or subsystems which may want to legitimately register a CPU hotplug notifier from a platform bus notifier. All of these issues can be solved by ripping this unusual behaviour out completely, simply tying the platform devices to the lifetime of the module itself, and directly managing the hwmon interfaces from the hotplug notifiers. There is a slight user-visible change in that /sys/bus/platform/drivers/coretemp will no longer appear, and /sys/devices/platform/coretemp.n will remain present if package n is hotplugged off, but hwmon users should really only be looking for the presence of the hwmon interfaces, whose behaviour remains unchanged.
In the Linux kernel, the following vulnerability has been resolved: media: radio-shark: Add endpoint checks The syzbot fuzzer was able to provoke a WARNING from the radio-shark2 driver: ------------[ cut here ]------------ usb 1-1: BOGUS urb xfer, pipe 1 != type 3 WARNING: CPU: 0 PID: 3271 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed2/0x1880 drivers/usb/core/urb.c:504 Modules linked in: CPU: 0 PID: 3271 Comm: kworker/0:3 Not tainted 6.1.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Workqueue: usb_hub_wq hub_event RIP: 0010:usb_submit_urb+0xed2/0x1880 drivers/usb/core/urb.c:504 Code: 7c 24 18 e8 00 36 ea fb 48 8b 7c 24 18 e8 36 1c 02 ff 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 b6 90 8a e8 9a 29 b8 03 <0f> 0b e9 58 f8 ff ff e8 d2 35 ea fb 48 81 c5 c0 05 00 00 e9 84 f7 RSP: 0018:ffffc90003876dd0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff8880750b0040 RSI: ffffffff816152b8 RDI: fffff5200070edac RBP: ffff8880172d81e0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000080000000 R11: 0000000000000000 R12: 0000000000000001 R13: ffff8880285c5040 R14: 0000000000000002 R15: ffff888017158200 FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe03235b90 CR3: 000000000bc8e000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> usb_start_wait_urb+0x101/0x4b0 drivers/usb/core/message.c:58 usb_bulk_msg+0x226/0x550 drivers/usb/core/message.c:387 shark_write_reg+0x1ff/0x2e0 drivers/media/radio/radio-shark2.c:88 ... The problem was caused by the fact that the driver does not check whether the endpoints it uses are actually present and have the appropriate types. This can be fixed by adding a simple check of these endpoints (and similarly for the radio-shark driver).
In the Linux kernel, the following vulnerability has been resolved: x86: fix clear_user_rep_good() exception handling annotation This code no longer exists in mainline, because it was removed in commit d2c95f9d6802 ("x86: don't use REP_GOOD or ERMS for user memory clearing") upstream. However, rather than backport the full range of x86 memory clearing and copying cleanups, fix the exception table annotation placement for the final 'rep movsb' in clear_user_rep_good(): rather than pointing at the actual instruction that did the user space access, it pointed to the register move just before it. That made sense from a code flow standpoint, but not from an actual usage standpoint: it means that if user access takes an exception, the exception handler won't actually find the instruction in the exception tables. As a result, rather than fixing it up and returning -EFAULT, it would then turn it into a kernel oops report instead, something like: BUG: unable to handle page fault for address: 0000000020081000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page ... RIP: 0010:clear_user_rep_good+0x1c/0x30 arch/x86/lib/clear_page_64.S:147 ... Call Trace: __clear_user arch/x86/include/asm/uaccess_64.h:103 [inline] clear_user arch/x86/include/asm/uaccess_64.h:124 [inline] iov_iter_zero+0x709/0x1290 lib/iov_iter.c:800 iomap_dio_hole_iter fs/iomap/direct-io.c:389 [inline] iomap_dio_iter fs/iomap/direct-io.c:440 [inline] __iomap_dio_rw+0xe3d/0x1cd0 fs/iomap/direct-io.c:601 iomap_dio_rw+0x40/0xa0 fs/iomap/direct-io.c:689 ext4_dio_read_iter fs/ext4/file.c:94 [inline] ext4_file_read_iter+0x4be/0x690 fs/ext4/file.c:145 call_read_iter include/linux/fs.h:2183 [inline] do_iter_readv_writev+0x2e0/0x3b0 fs/read_write.c:733 do_iter_read+0x2f2/0x750 fs/read_write.c:796 vfs_readv+0xe5/0x150 fs/read_write.c:916 do_preadv+0x1b6/0x270 fs/read_write.c:1008 __do_sys_preadv2 fs/read_write.c:1070 [inline] __se_sys_preadv2 fs/read_write.c:1061 [inline] __x64_sys_preadv2+0xef/0x150 fs/read_write.c:1061 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd which then looks like a filesystem bug rather than the incorrect exception annotation that it is. [ The alternative to this one-liner fix is to take the upstream series that cleans this all up: 68674f94ffc9 ("x86: don't use REP_GOOD or ERMS for small memory copies") 20f3337d350c ("x86: don't use REP_GOOD or ERMS for small memory clearing") adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies") * d2c95f9d6802 ("x86: don't use REP_GOOD or ERMS for user memory clearing") 3639a535587d ("x86: move stac/clac from user copy routines into callers") 577e6a7fd50d ("x86: inline the 'rep movs' in user copies for the FSRM case") 8c9b6a88b7e2 ("x86: improve on the non-rep 'clear_user' function") 427fda2c8a49 ("x86: improve on the non-rep 'copy_user' function") * e046fe5a36a9 ("x86: set FSRS automatically on AMD CPUs that have FSRM") e1f2750edc4a ("x86: remove 'zerorest' argument from __copy_user_nocache()") 034ff37d3407 ("x86: rewrite '__copy_user_nocache' function") with either the whole series or at a minimum the two marked commits being needed to fix this issue ]
In the Linux kernel, the following vulnerability has been resolved: net/handshake: fix null-ptr-deref in handshake_nl_done_doit() We should not call trace_handshake_cmd_done_err() if socket lookup has failed. Also we should call trace_handshake_cmd_done_err() before releasing the file, otherwise dereferencing sock->sk can return garbage. This also reverts 7afc6d0a107f ("net/handshake: Fix uninitialized local variable") Unable to handle kernel paging request at virtual address dfff800000000003 KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [dfff800000000003] address between user and kernel address ranges Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 5986 Comm: syz-executor292 Not tainted 6.5.0-rc7-syzkaller-gfe4469582053 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : handshake_nl_done_doit+0x198/0x9c8 net/handshake/netlink.c:193 lr : handshake_nl_done_doit+0x180/0x9c8 sp : ffff800096e37180 x29: ffff800096e37200 x28: 1ffff00012dc6e34 x27: dfff800000000000 x26: ffff800096e373d0 x25: 0000000000000000 x24: 00000000ffffffa8 x23: ffff800096e373f0 x22: 1ffff00012dc6e38 x21: 0000000000000000 x20: ffff800096e371c0 x19: 0000000000000018 x18: 0000000000000000 x17: 0000000000000000 x16: ffff800080516cc4 x15: 0000000000000001 x14: 1fffe0001b14aa3b x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000003 x8 : 0000000000000003 x7 : ffff800080afe47c x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800080a88078 x2 : 0000000000000001 x1 : 00000000ffffffa8 x0 : 0000000000000000 Call trace: handshake_nl_done_doit+0x198/0x9c8 net/handshake/netlink.c:193 genl_family_rcv_msg_doit net/netlink/genetlink.c:970 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1050 [inline] genl_rcv_msg+0x96c/0xc50 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2549 genl_rcv+0x38/0x50 net/netlink/genetlink.c:1078 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x834/0xb18 net/netlink/af_netlink.c:1914 sock_sendmsg_nosec net/socket.c:725 [inline] sock_sendmsg net/socket.c:748 [inline] ____sys_sendmsg+0x56c/0x840 net/socket.c:2494 ___sys_sendmsg net/socket.c:2548 [inline] __sys_sendmsg+0x26c/0x33c net/socket.c:2577 __do_sys_sendmsg net/socket.c:2586 [inline] __se_sys_sendmsg net/socket.c:2584 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2584 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155 el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: 12800108 b90043e8 910062b3 d343fe68 (387b6908)
IBM Spectrum Protect Plus 10.1.0 through 10.1.8 could allow a local user to cause a denial of service due to insecure file permission settings. IBM X-Force ID: 197791.
A vulnerability was found due to missing lock for IOPOLL flaw in io_cqring_event_overflow() in io_uring.c in Linux Kernel. This flaw allows a local attacker with user privilege to trigger a Denial of Service threat.
In the Linux kernel, the following vulnerability has been resolved: netfilter: conntrack: fix wrong ct->timeout value (struct nf_conn)->timeout is an interval before the conntrack confirmed. After confirmed, it becomes a timestamp. It is observed that timeout of an unconfirmed conntrack: - Set by calling ctnetlink_change_timeout(). As a result, `nfct_time_stamp` was wrongly added to `ct->timeout` twice. - Get by calling ctnetlink_dump_timeout(). As a result, `nfct_time_stamp` was wrongly subtracted. Call Trace: <TASK> dump_stack_lvl ctnetlink_dump_timeout __ctnetlink_glue_build ctnetlink_glue_build __nfqnl_enqueue_packet nf_queue nf_hook_slow ip_mc_output ? __pfx_ip_finish_output ip_send_skb ? __pfx_dst_output udp_send_skb udp_sendmsg ? __pfx_ip_generic_getfrag sock_sendmsg Separate the 2 cases in: - Setting `ct->timeout` in __nf_ct_set_timeout(). - Getting `ct->timeout` in ctnetlink_dump_timeout(). Pablo appends: Update ctnetlink to set up the timeout _after_ the IPS_CONFIRMED flag is set on, otherwise conntrack creation via ctnetlink breaks. Note that the problem described in this patch occurs since the introduction of the nfnetlink_queue conntrack support, select a sufficiently old Fixes: tag for -stable kernel to pick up this fix.
In the Linux kernel, the following vulnerability has been resolved: driver core: fix resource leak in device_add() When calling kobject_add() failed in device_add(), it will call cleanup_glue_dir() to free resource. But in kobject_add(), dev->kobj.parent has been set to NULL. This will cause resource leak. The process is as follows: device_add() get_device_parent() class_dir_create_and_add() kobject_add() //kobject_get() ... dev->kobj.parent = kobj; ... kobject_add() //failed, but set dev->kobj.parent = NULL ... glue_dir = get_glue_dir(dev) //glue_dir = NULL, and goto //"Error" label ... cleanup_glue_dir() //becaues glue_dir is NULL, not call //kobject_put() The preceding problem may cause insmod mac80211_hwsim.ko to failed. sysfs: cannot create duplicate filename '/devices/virtual/mac80211_hwsim' Call Trace: <TASK> dump_stack_lvl+0x8e/0xd1 sysfs_warn_dup.cold+0x1c/0x29 sysfs_create_dir_ns+0x224/0x280 kobject_add_internal+0x2aa/0x880 kobject_add+0x135/0x1a0 get_device_parent+0x3d7/0x590 device_add+0x2aa/0x1cb0 device_create_groups_vargs+0x1eb/0x260 device_create+0xdc/0x110 mac80211_hwsim_new_radio+0x31e/0x4790 [mac80211_hwsim] init_mac80211_hwsim+0x48d/0x1000 [mac80211_hwsim] do_one_initcall+0x10f/0x630 do_init_module+0x19f/0x5e0 load_module+0x64b7/0x6eb0 __do_sys_finit_module+0x140/0x200 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> kobject_add_internal failed for mac80211_hwsim with -EEXIST, don't try to register things with the same name in the same directory.
In the Linux kernel, the following vulnerability has been resolved: bpf: Fail verification for sign-extension of packet data/data_end/data_meta syzbot reported a kernel crash due to commit 1f1e864b6555 ("bpf: Handle sign-extenstin ctx member accesses"). The reason is due to sign-extension of 32-bit load for packet data/data_end/data_meta uapi field. The original code looks like: r2 = *(s32 *)(r1 + 76) /* load __sk_buff->data */ r3 = *(u32 *)(r1 + 80) /* load __sk_buff->data_end */ r0 = r2 r0 += 8 if r3 > r0 goto +1 ... Note that __sk_buff->data load has 32-bit sign extension. After verification and convert_ctx_accesses(), the final asm code looks like: r2 = *(u64 *)(r1 +208) r2 = (s32)r2 r3 = *(u64 *)(r1 +80) r0 = r2 r0 += 8 if r3 > r0 goto pc+1 ... Note that 'r2 = (s32)r2' may make the kernel __sk_buff->data address invalid which may cause runtime failure. Currently, in C code, typically we have void *data = (void *)(long)skb->data; void *data_end = (void *)(long)skb->data_end; ... and it will generate r2 = *(u64 *)(r1 +208) r3 = *(u64 *)(r1 +80) r0 = r2 r0 += 8 if r3 > r0 goto pc+1 If we allow sign-extension, void *data = (void *)(long)(int)skb->data; void *data_end = (void *)(long)skb->data_end; ... the generated code looks like r2 = *(u64 *)(r1 +208) r2 <<= 32 r2 s>>= 32 r3 = *(u64 *)(r1 +80) r0 = r2 r0 += 8 if r3 > r0 goto pc+1 and this will cause verification failure since "r2 <<= 32" is not allowed as "r2" is a packet pointer. To fix this issue for case r2 = *(s32 *)(r1 + 76) /* load __sk_buff->data */ this patch added additional checking in is_valid_access() callback function for packet data/data_end/data_meta access. If those accesses are with sign-extenstion, the verification will fail. [1] https://lore.kernel.org/bpf/000000000000c90eee061d236d37@google.com/
In the Linux kernel, the following vulnerability has been resolved: arm64: kexec: initialize kexec_buf struct in load_other_segments() Patch series "kexec: Fix invalid field access". The kexec_buf structure was previously declared without initialization. commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This un-initialized field will contain garbage. This is also triggering a UBSAN warning when the uninitialized data was accessed: ------------[ cut here ]------------ UBSAN: invalid-load in ./include/linux/kexec.h:210:10 load of value 252 is not a valid value for type '_Bool' Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used. An initial fix was already landed for arm64[0], and this patchset fixes the problem on the remaining arm64 code and on riscv, as raised by Mark. Discussions about this problem could be found at[1][2]. This patch (of 3): The kexec_buf structure was previously declared without initialization. commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This un-initialized field will contain garbage. This is also triggering a UBSAN warning when the uninitialized data was accessed: ------------[ cut here ]------------ UBSAN: invalid-load in ./include/linux/kexec.h:210:10 load of value 252 is not a valid value for type '_Bool' Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used.
In the Linux kernel before 6.1.2, kernel/module/decompress.c misinterprets the module_get_next_page return value (expects it to be NULL in the error case, whereas it is actually an error pointer).
In the Linux kernel before 5.17.2, drivers/soc/qcom/qcom_aoss.c does not release an of_find_device_by_node reference after use, e.g., with put_device.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix missing last_unlink_trans update when removing a directory When removing a directory we are not updating its last_unlink_trans field, which can result in incorrect fsync behaviour in case some one fsyncs the directory after it was removed because it's holding a file descriptor on it. Example scenario: mkdir /mnt/dir1 mkdir /mnt/dir1/dir2 mkdir /mnt/dir3 sync -f /mnt # Do some change to the directory and fsync it. chmod 700 /mnt/dir1 xfs_io -c fsync /mnt/dir1 # Move dir2 out of dir1 so that dir1 becomes empty. mv /mnt/dir1/dir2 /mnt/dir3/ open fd on /mnt/dir1 call rmdir(2) on path "/mnt/dir1" fsync fd <trigger power failure> When attempting to mount the filesystem, the log replay will fail with an -EIO error and dmesg/syslog has the following: [445771.626482] BTRFS info (device dm-0): first mount of filesystem 0368bbea-6c5e-44b5-b409-09abe496e650 [445771.626486] BTRFS info (device dm-0): using crc32c checksum algorithm [445771.627912] BTRFS info (device dm-0): start tree-log replay [445771.628335] page: refcount:2 mapcount:0 mapping:0000000061443ddc index:0x1d00 pfn:0x7072a5 [445771.629453] memcg:ffff89f400351b00 [445771.629892] aops:btree_aops [btrfs] ino:1 [445771.630737] flags: 0x17fffc00000402a(uptodate|lru|private|writeback|node=0|zone=2|lastcpupid=0x1ffff) [445771.632359] raw: 017fffc00000402a fffff47284d950c8 fffff472907b7c08 ffff89f458e412b8 [445771.633713] raw: 0000000000001d00 ffff89f6c51d1a90 00000002ffffffff ffff89f400351b00 [445771.635029] page dumped because: eb page dump [445771.635825] BTRFS critical (device dm-0): corrupt leaf: root=5 block=30408704 slot=10 ino=258, invalid nlink: has 2 expect no more than 1 for dir [445771.638088] BTRFS info (device dm-0): leaf 30408704 gen 10 total ptrs 17 free space 14878 owner 5 [445771.638091] BTRFS info (device dm-0): refs 4 lock_owner 0 current 3581087 [445771.638094] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160 [445771.638097] inode generation 3 transid 9 size 16 nbytes 16384 [445771.638098] block group 0 mode 40755 links 1 uid 0 gid 0 [445771.638100] rdev 0 sequence 2 flags 0x0 [445771.638102] atime 1775744884.0 [445771.660056] ctime 1775744885.645502983 [445771.660058] mtime 1775744885.645502983 [445771.660060] otime 1775744884.0 [445771.660062] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12 [445771.660064] index 0 name_len 2 [445771.660066] item 2 key (256 DIR_ITEM 1843588421) itemoff 16077 itemsize 34 [445771.660068] location key (259 1 0) type 2 [445771.660070] transid 9 data_len 0 name_len 4 [445771.660075] item 3 key (256 DIR_ITEM 2363071922) itemoff 16043 itemsize 34 [445771.660076] location key (257 1 0) type 2 [445771.660077] transid 9 data_len 0 name_len 4 [445771.660078] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34 [445771.660079] location key (257 1 0) type 2 [445771.660080] transid 9 data_len 0 name_len 4 [445771.660081] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34 [445771.660082] location key (259 1 0) type 2 [445771.660083] transid 9 data_len 0 name_len 4 [445771.660084] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160 [445771.660086] inode generation 9 transid 9 size 8 nbytes 0 [445771.660087] block group 0 mode 40777 links 1 uid 0 gid 0 [445771.660088] rdev 0 sequence 2 flags 0x0 [445771.660089] atime 1775744885.641174097 [445771.660090] ctime 1775744885.645502983 [445771.660091] mtime 1775744885.645502983 [445771.660105] otime 1775744885.641174097 [445771.660106] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14 [445771.660107] index 2 name_len 4 [445771.660108] item 8 key (257 DIR_ITEM 2676584006) itemoff 15767 itemsize 34 [445771.660109] location key (2 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: drm/i915: Fix memory leaks in i915 selftests This patch fixes memory leaks on error escapes in function fake_get_pages (cherry picked from commit 8bfbdadce85c4c51689da10f39c805a7106d4567)
In the Linux kernel, the following vulnerability has been resolved: wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop() Since '__dev_queue_xmit()' should be called with interrupts enabled, the following backtrace: ieee80211_do_stop() ... spin_lock_irqsave(&local->queue_stop_reason_lock, flags) ... ieee80211_free_txskb() ieee80211_report_used_skb() ieee80211_report_ack_skb() cfg80211_mgmt_tx_status_ext() nl80211_frame_tx_status() genlmsg_multicast_netns() genlmsg_multicast_netns_filtered() nlmsg_multicast_filtered() netlink_broadcast_filtered() do_one_broadcast() netlink_broadcast_deliver() __netlink_sendskb() netlink_deliver_tap() __netlink_deliver_tap_skb() dev_queue_xmit() __dev_queue_xmit() ; with IRQS disabled ... spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags) issues the warning (as reported by syzbot reproducer): WARNING: CPU: 2 PID: 5128 at kernel/softirq.c:362 __local_bh_enable_ip+0xc3/0x120 Fix this by implementing a two-phase skb reclamation in 'ieee80211_do_stop()', where actual work is performed outside of a section with interrupts disabled.
The Linux kernel from v2.3.36 before v2.6.39 allows local unprivileged users to cause a denial of service (memory consumption) by triggering creation of PTE pages.
In the Linux kernel, the following vulnerability has been resolved: cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") tried to avoid 'eiw' initialization errors when ->nr_targets exceeded 16, by just decrementing ->nr_targets when cxl_region_setup_targets() failed. Commit 86987c766276 ("cxl/region: Cleanup target list on attach error") extended that cleanup to also clear cxled->pos and p->targets[pos]. The initialization error was incidentally fixed separately by: Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable warnings") which was merged a few days after 5e42bcbc3fef. But now the original cleanup when cxl_region_setup_targets() fails prevents endpoint and switch decoder resources from being reused: 1) the cleanup does not set the decoder's region to NULL, which results in future dpa_size_store() calls returning -EBUSY 2) the decoder is not properly freed, which results in future commit errors associated with the upstream switch Now that the initialization errors were fixed separately, the proper cleanup for this case is to just return immediately. Then the resources associated with this target get cleanup up as normal when the failed region is deleted. The ->nr_targets decrement in the error case also helped prevent a p->targets[] array overflow, so add a new check to prevent against that overflow. Tested by trying to create an invalid region for a 2 switch * 2 endpoint topology, and then following up with creating a valid region.
In the Linux kernel, the following vulnerability has been resolved: f2fs: split initial and dynamic conditions for extent_cache Let's allocate the extent_cache tree without dynamic conditions to avoid a missing condition causing a panic as below. # create a file w/ a compressed flag # disable the compression # panic while updating extent_cache F2FS-fs (dm-64): Swapfile: last extent is not aligned to section F2FS-fs (dm-64): Swapfile (3) is not align to section: 1) creat(), 2) ioctl(F2FS_IOC_SET_PIN_FILE), 3) fallocate(2097152 * N) Adding 124996k swap on ./swap-file. Priority:0 extents:2 across:17179494468k ================================================================== BUG: KASAN: null-ptr-deref in instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline] BUG: KASAN: null-ptr-deref in atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline] BUG: KASAN: null-ptr-deref in queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline] BUG: KASAN: null-ptr-deref in __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline] BUG: KASAN: null-ptr-deref in _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295 Write of size 4 at addr 0000000000000030 by task syz-executor154/3327 CPU: 0 PID: 3327 Comm: syz-executor154 Tainted: G O 5.10.185 #1 Hardware name: emulation qemu-x86/qemu-x86, BIOS 2023.01-21885-gb3cc1cd24d 01/01/2023 Call Trace: __dump_stack out/common/lib/dump_stack.c:77 [inline] dump_stack_lvl+0x17e/0x1c4 out/common/lib/dump_stack.c:118 __kasan_report+0x16c/0x260 out/common/mm/kasan/report.c:415 kasan_report+0x51/0x70 out/common/mm/kasan/report.c:428 kasan_check_range+0x2f3/0x340 out/common/mm/kasan/generic.c:186 __kasan_check_write+0x14/0x20 out/common/mm/kasan/shadow.c:37 instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline] atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline] queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline] __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline] _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295 __drop_extent_tree+0xdf/0x2f0 out/common/fs/f2fs/extent_cache.c:1155 f2fs_drop_extent_tree+0x17/0x30 out/common/fs/f2fs/extent_cache.c:1172 f2fs_insert_range out/common/fs/f2fs/file.c:1600 [inline] f2fs_fallocate+0x19fd/0x1f40 out/common/fs/f2fs/file.c:1764 vfs_fallocate+0x514/0x9b0 out/common/fs/open.c:310 ksys_fallocate out/common/fs/open.c:333 [inline] __do_sys_fallocate out/common/fs/open.c:341 [inline] __se_sys_fallocate out/common/fs/open.c:339 [inline] __x64_sys_fallocate+0xb8/0x100 out/common/fs/open.c:339 do_syscall_64+0x35/0x50 out/common/arch/x86/entry/common.c:46
In the Linux kernel, the following vulnerability has been resolved: riscv: kprobe: Fixup kernel panic when probing an illegal position The kernel would panic when probed for an illegal position. eg: (CONFIG_RISCV_ISA_C=n) echo 'p:hello kernel_clone+0x16 a0=%a0' >> kprobe_events echo 1 > events/kprobes/hello/enable cat trace Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __do_sys_newfstatat+0xb8/0xb8 CPU: 0 PID: 111 Comm: sh Not tainted 6.2.0-rc1-00027-g2d398fe49a4d #490 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80007268>] dump_backtrace+0x38/0x48 [<ffffffff80c5e83c>] show_stack+0x50/0x68 [<ffffffff80c6da28>] dump_stack_lvl+0x60/0x84 [<ffffffff80c6da6c>] dump_stack+0x20/0x30 [<ffffffff80c5ecf4>] panic+0x160/0x374 [<ffffffff80c6db94>] generic_handle_arch_irq+0x0/0xa8 [<ffffffff802deeb0>] sys_newstat+0x0/0x30 [<ffffffff800158c0>] sys_clone+0x20/0x30 [<ffffffff800039e8>] ret_from_syscall+0x0/0x4 ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __do_sys_newfstatat+0xb8/0xb8 ]--- That is because the kprobe's ebreak instruction broke the kernel's original code. The user should guarantee the correction of the probe position, but it couldn't make the kernel panic. This patch adds arch_check_kprobe in arch_prepare_kprobe to prevent an illegal position (Such as the middle of an instruction).
In the Linux kernel, the following vulnerability has been resolved: iavf: fix hang on reboot with ice When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown().
In the Linux kernel, the following vulnerability has been resolved: netfilter: nf_tables: adapt set backend to use GC transaction API Use the GC transaction API to replace the old and buggy gc API and the busy mark approach. No set elements are removed from async garbage collection anymore, instead the _DEAD bit is set on so the set element is not visible from lookup path anymore. Async GC enqueues transaction work that might be aborted and retried later. rbtree and pipapo set backends does not set on the _DEAD bit from the sync GC path since this runs in control plane path where mutex is held. In this case, set elements are deactivated, removed and then released via RCU callback, sync GC never fails.
In the Linux kernel, the following vulnerability has been resolved: btrfs: free exchange changeset on failures Fstests runs on my VMs have show several kmemleak reports like the following. unreferenced object 0xffff88811ae59080 (size 64): comm "xfs_io", pid 12124, jiffies 4294987392 (age 6.368s) hex dump (first 32 bytes): 00 c0 1c 00 00 00 00 00 ff cf 1c 00 00 00 00 00 ................ 90 97 e5 1a 81 88 ff ff 90 97 e5 1a 81 88 ff ff ................ backtrace: [<00000000ac0176d2>] ulist_add_merge+0x60/0x150 [btrfs] [<0000000076e9f312>] set_state_bits+0x86/0xc0 [btrfs] [<0000000014fe73d6>] set_extent_bit+0x270/0x690 [btrfs] [<000000004f675208>] set_record_extent_bits+0x19/0x20 [btrfs] [<00000000b96137b1>] qgroup_reserve_data+0x274/0x310 [btrfs] [<0000000057e9dcbb>] btrfs_check_data_free_space+0x5c/0xa0 [btrfs] [<0000000019c4511d>] btrfs_delalloc_reserve_space+0x1b/0xa0 [btrfs] [<000000006d37e007>] btrfs_dio_iomap_begin+0x415/0x970 [btrfs] [<00000000fb8a74b8>] iomap_iter+0x161/0x1e0 [<0000000071dff6ff>] __iomap_dio_rw+0x1df/0x700 [<000000002567ba53>] iomap_dio_rw+0x5/0x20 [<0000000072e555f8>] btrfs_file_write_iter+0x290/0x530 [btrfs] [<000000005eb3d845>] new_sync_write+0x106/0x180 [<000000003fb505bf>] vfs_write+0x24d/0x2f0 [<000000009bb57d37>] __x64_sys_pwrite64+0x69/0xa0 [<000000003eba3fdf>] do_syscall_64+0x43/0x90 In case brtfs_qgroup_reserve_data() or btrfs_delalloc_reserve_metadata() fail the allocated extent_changeset will not be freed. So in btrfs_check_data_free_space() and btrfs_delalloc_reserve_space() free the allocated extent_changeset to get rid of the allocated memory. The issue currently only happens in the direct IO write path, but only after 65b3c08606e5 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range"), and also at defrag_one_locked_target(). Every other place is always calling extent_changeset_free() even if its call to btrfs_delalloc_reserve_space() or btrfs_check_data_free_space() has failed.
atm_tc_enqueue in net/sched/sch_atm.c in the Linux kernel through 6.1.4 allows attackers to cause a denial of service because of type confusion (non-negative numbers can sometimes indicate a TC_ACT_SHOT condition rather than valid classification results).
In the Linux kernel before 6.0.3, drivers/gpu/drm/virtio/virtgpu_object.c misinterprets the drm_gem_shmem_get_sg_table return value (expects it to be NULL in the error case, whereas it is actually an error pointer).
In the Linux kernel, the following vulnerability has been resolved: bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails syzbot reported a warning[1] where the bond device itself is a slave and we try to enslave a non-ethernet device as the first slave which fails but then in the error path when ether_setup() restores the bond device it also clears all flags. In my previous fix[2] I restored the IFF_MASTER flag, but I didn't consider the case that the bond device itself might also be a slave with IFF_SLAVE set, so we need to restore that flag as well. Use the bond_ether_setup helper which does the right thing and restores the bond's flags properly. Steps to reproduce using a nlmon dev: $ ip l add nlmon0 type nlmon $ ip l add bond1 type bond $ ip l add bond2 type bond $ ip l set bond1 master bond2 $ ip l set dev nlmon0 master bond1 $ ip -d l sh dev bond1 22: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue master bond2 state DOWN mode DEFAULT group default qlen 1000 (now bond1's IFF_SLAVE flag is gone and we'll hit a warning[3] if we try to delete it) [1] https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef [2] commit 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure") [3] example warning: [ 27.008664] bond1: (slave nlmon0): The slave device specified does not support setting the MAC address [ 27.008692] bond1: (slave nlmon0): Error -95 calling set_mac_address [ 32.464639] bond1 (unregistering): Released all slaves [ 32.464685] ------------[ cut here ]------------ [ 32.464686] WARNING: CPU: 1 PID: 2004 at net/core/dev.c:10829 unregister_netdevice_many+0x72a/0x780 [ 32.464694] Modules linked in: br_netfilter bridge bonding virtio_net [ 32.464699] CPU: 1 PID: 2004 Comm: ip Kdump: loaded Not tainted 5.18.0-rc3+ #47 [ 32.464703] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014 [ 32.464704] RIP: 0010:unregister_netdevice_many+0x72a/0x780 [ 32.464707] Code: 99 fd ff ff ba 90 1a 00 00 48 c7 c6 f4 02 66 96 48 c7 c7 20 4d 35 96 c6 05 fa c7 2b 02 01 e8 be 6f 4a 00 0f 0b e9 73 fd ff ff <0f> 0b e9 5f fd ff ff 80 3d e3 c7 2b 02 00 0f 85 3b fd ff ff ba 59 [ 32.464710] RSP: 0018:ffffa006422d7820 EFLAGS: 00010206 [ 32.464712] RAX: ffff8f6e077140a0 RBX: ffffa006422d7888 RCX: 0000000000000000 [ 32.464714] RDX: ffff8f6e12edbe58 RSI: 0000000000000296 RDI: ffffffff96d4a520 [ 32.464716] RBP: ffff8f6e07714000 R08: ffffffff96d63600 R09: ffffa006422d7728 [ 32.464717] R10: 0000000000000ec0 R11: ffffffff9698c988 R12: ffff8f6e12edb140 [ 32.464719] R13: dead000000000122 R14: dead000000000100 R15: ffff8f6e12edb140 [ 32.464723] FS: 00007f297c2f1740(0000) GS:ffff8f6e5d900000(0000) knlGS:0000000000000000 [ 32.464725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.464726] CR2: 00007f297bf1c800 CR3: 00000000115e8000 CR4: 0000000000350ee0 [ 32.464730] Call Trace: [ 32.464763] <TASK> [ 32.464767] rtnl_dellink+0x13e/0x380 [ 32.464776] ? cred_has_capability.isra.0+0x68/0x100 [ 32.464780] ? __rtnl_unlock+0x33/0x60 [ 32.464783] ? bpf_lsm_capset+0x10/0x10 [ 32.464786] ? security_capable+0x36/0x50 [ 32.464790] rtnetlink_rcv_msg+0x14e/0x3b0 [ 32.464792] ? _copy_to_iter+0xb1/0x790 [ 32.464796] ? post_alloc_hook+0xa0/0x160 [ 32.464799] ? rtnl_calcit.isra.0+0x110/0x110 [ 32.464802] netlink_rcv_skb+0x50/0xf0 [ 32.464806] netlink_unicast+0x216/0x340 [ 32.464809] netlink_sendmsg+0x23f/0x480 [ 32.464812] sock_sendmsg+0x5e/0x60 [ 32.464815] ____sys_sendmsg+0x22c/0x270 [ 32.464818] ? import_iovec+0x17/0x20 [ 32.464821] ? sendmsg_copy_msghdr+0x59/0x90 [ 32.464823] ? do_set_pte+0xa0/0xe0 [ 32.464828] ___sys_sendmsg+0x81/0xc0 [ 32.464832] ? mod_objcg_state+0xc6/0x300 [ 32.464835] ? refill_obj_stock+0xa9/0x160 [ 32.464838] ? memcg_slab_free_hook+0x1a5/0x1f0 [ 32.464842] __sys_sendm ---truncated---
In the Linux kernel, the following vulnerability has been resolved: net: enetc: avoid deadlock in enetc_tx_onestep_tstamp() This lockdep splat says it better than I could: ================================ WARNING: inconsistent lock state 6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5c/0xc0 {IN-SOFTIRQ-W} state was registered at: _raw_spin_lock+0x5c/0xc0 sch_direct_xmit+0x148/0x37c __dev_queue_xmit+0x528/0x111c ip6_finish_output2+0x5ec/0xb7c ip6_finish_output+0x240/0x3f0 ip6_output+0x78/0x360 ndisc_send_skb+0x33c/0x85c ndisc_send_rs+0x54/0x12c addrconf_rs_timer+0x154/0x260 call_timer_fn+0xb8/0x3a0 __run_timers.part.0+0x214/0x26c run_timer_softirq+0x3c/0x74 __do_softirq+0x14c/0x5d8 ____do_softirq+0x10/0x20 call_on_irq_stack+0x2c/0x5c do_softirq_own_stack+0x1c/0x30 __irq_exit_rcu+0x168/0x1a0 irq_exit_rcu+0x10/0x40 el1_interrupt+0x38/0x64 irq event stamp: 7825 hardirqs last enabled at (7825): [<ffffdf1f7200cae4>] exit_to_kernel_mode+0x34/0x130 hardirqs last disabled at (7823): [<ffffdf1f708105f0>] __do_softirq+0x550/0x5d8 softirqs last enabled at (7824): [<ffffdf1f7081050c>] __do_softirq+0x46c/0x5d8 softirqs last disabled at (7811): [<ffffdf1f708166e0>] ____do_softirq+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(_xmit_ETHER#2); <Interrupt> lock(_xmit_ETHER#2); *** DEADLOCK *** 3 locks held by kworker/1:3/179: #0: ffff3ec400004748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #1: ffff80000a0bbdc8 ((work_completion)(&priv->tx_onestep_tstamp)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #2: ffff3ec4036cd438 (&dev->tx_global_lock){+.+.}-{3:3}, at: netif_tx_lock+0x1c/0x34 Workqueue: events enetc_tx_onestep_tstamp Call trace: print_usage_bug.part.0+0x208/0x22c mark_lock+0x7f0/0x8b0 __lock_acquire+0x7c4/0x1ce0 lock_acquire.part.0+0xe0/0x220 lock_acquire+0x68/0x84 _raw_spin_lock+0x5c/0xc0 netif_freeze_queues+0x5c/0xc0 netif_tx_lock+0x24/0x34 enetc_tx_onestep_tstamp+0x20/0x100 process_one_work+0x28c/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x11c but I'll say it anyway: the enetc_tx_onestep_tstamp() work item runs in process context, therefore with softirqs enabled (i.o.w., it can be interrupted by a softirq). If we hold the netif_tx_lock() when there is an interrupt, and the NET_TX softirq then gets scheduled, this will take the netif_tx_lock() a second time and deadlock the kernel. To solve this, use netif_tx_lock_bh(), which blocks softirqs from running.
In the Linux kernel, the following vulnerability has been resolved: ACPICA: fix acpi parse and parseext cache leaks ACPICA commit 8829e70e1360c81e7a5a901b5d4f48330e021ea5 I'm Seunghun Han, and I work for National Security Research Institute of South Korea. I have been doing a research on ACPI and found an ACPI cache leak in ACPI early abort cases. Boot log of ACPI cache leak is as follows: [ 0.352414] ACPI: Added _OSI(Module Device) [ 0.353182] ACPI: Added _OSI(Processor Device) [ 0.353182] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.353182] ACPI: Added _OSI(Processor Aggregator Device) [ 0.356028] ACPI: Unable to start the ACPI Interpreter [ 0.356799] ACPI Error: Could not remove SCI handler (20170303/evmisc-281) [ 0.360215] kmem_cache_destroy Acpi-State: Slab cache still has objects [ 0.360648] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.12.0-rc4-next-20170608+ #10 [ 0.361273] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS virtual_box 12/01/2006 [ 0.361873] Call Trace: [ 0.362243] ? dump_stack+0x5c/0x81 [ 0.362591] ? kmem_cache_destroy+0x1aa/0x1c0 [ 0.362944] ? acpi_sleep_proc_init+0x27/0x27 [ 0.363296] ? acpi_os_delete_cache+0xa/0x10 [ 0.363646] ? acpi_ut_delete_caches+0x6d/0x7b [ 0.364000] ? acpi_terminate+0xa/0x14 [ 0.364000] ? acpi_init+0x2af/0x34f [ 0.364000] ? __class_create+0x4c/0x80 [ 0.364000] ? video_setup+0x7f/0x7f [ 0.364000] ? acpi_sleep_proc_init+0x27/0x27 [ 0.364000] ? do_one_initcall+0x4e/0x1a0 [ 0.364000] ? kernel_init_freeable+0x189/0x20a [ 0.364000] ? rest_init+0xc0/0xc0 [ 0.364000] ? kernel_init+0xa/0x100 [ 0.364000] ? ret_from_fork+0x25/0x30 I analyzed this memory leak in detail. I found that “Acpi-State” cache and “Acpi-Parse” cache were merged because the size of cache objects was same slab cache size. I finally found “Acpi-Parse” cache and “Acpi-parse_ext” cache were leaked using SLAB_NEVER_MERGE flag in kmem_cache_create() function. Real ACPI cache leak point is as follows: [ 0.360101] ACPI: Added _OSI(Module Device) [ 0.360101] ACPI: Added _OSI(Processor Device) [ 0.360101] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.361043] ACPI: Added _OSI(Processor Aggregator Device) [ 0.364016] ACPI: Unable to start the ACPI Interpreter [ 0.365061] ACPI Error: Could not remove SCI handler (20170303/evmisc-281) [ 0.368174] kmem_cache_destroy Acpi-Parse: Slab cache still has objects [ 0.369332] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 4.12.0-rc4-next-20170608+ #8 [ 0.371256] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS virtual_box 12/01/2006 [ 0.372000] Call Trace: [ 0.372000] ? dump_stack+0x5c/0x81 [ 0.372000] ? kmem_cache_destroy+0x1aa/0x1c0 [ 0.372000] ? acpi_sleep_proc_init+0x27/0x27 [ 0.372000] ? acpi_os_delete_cache+0xa/0x10 [ 0.372000] ? acpi_ut_delete_caches+0x56/0x7b [ 0.372000] ? acpi_terminate+0xa/0x14 [ 0.372000] ? acpi_init+0x2af/0x34f [ 0.372000] ? __class_create+0x4c/0x80 [ 0.372000] ? video_setup+0x7f/0x7f [ 0.372000] ? acpi_sleep_proc_init+0x27/0x27 [ 0.372000] ? do_one_initcall+0x4e/0x1a0 [ 0.372000] ? kernel_init_freeable+0x189/0x20a [ 0.372000] ? rest_init+0xc0/0xc0 [ 0.372000] ? kernel_init+0xa/0x100 [ 0.372000] ? ret_from_fork+0x25/0x30 [ 0.388039] kmem_cache_destroy Acpi-parse_ext: Slab cache still has objects [ 0.389063] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 4.12.0-rc4-next-20170608+ #8 [ 0.390557] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS virtual_box 12/01/2006 [ 0.392000] Call Trace: [ 0.392000] ? dump_stack+0x5c/0x81 [ 0.392000] ? kmem_cache_destroy+0x1aa/0x1c0 [ 0.392000] ? acpi_sleep_proc_init+0x27/0x27 [ 0.392000] ? acpi_os_delete_cache+0xa/0x10 [ 0.392000] ? acpi_ut_delete_caches+0x6d/0x7b [ 0.392000] ? acpi_terminate+0xa/0x14 [ 0.392000] ? acpi_init+0x2af/0x3 ---truncated---
In the Linux kernel before 5.16.3, drivers/usb/dwc3/dwc3-qcom.c misinterprets the dwc3_qcom_create_urs_usb_platdev return value (expects it to be NULL in the error case, whereas it is actually an error pointer).
In the Linux kernel, the following vulnerability has been resolved: ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping() ila_xlat_nl_cmd_get_mapping() generates an empty skb, triggerring a recent sanity check [1]. Instead, return an error code, so that user space can get it. [1] skb_assert_len WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 skb_assert_len include/linux/skbuff.h:2527 [inline] WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 Modules linked in: CPU: 0 PID: 5923 Comm: syz-executor269 Not tainted 6.2.0-syzkaller-18300-g2ebd1fbb946d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_assert_len include/linux/skbuff.h:2527 [inline] pc : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 lr : skb_assert_len include/linux/skbuff.h:2527 [inline] lr : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 sp : ffff80001e0d6c40 x29: ffff80001e0d6e60 x28: dfff800000000000 x27: ffff0000c86328c0 x26: dfff800000000000 x25: ffff0000c8632990 x24: ffff0000c8632a00 x23: 0000000000000000 x22: 1fffe000190c6542 x21: ffff0000c8632a10 x20: ffff0000c8632a00 x19: ffff80001856e000 x18: ffff80001e0d5fc0 x17: 0000000000000000 x16: ffff80001235d16c x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001 x11: ff80800008353a30 x10: 0000000000000000 x9 : 21567eaf25bfb600 x8 : 21567eaf25bfb600 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff80001e0d6558 x4 : ffff800015c74760 x3 : ffff800008596744 x2 : 0000000000000001 x1 : 0000000100000000 x0 : 000000000000000e Call trace: skb_assert_len include/linux/skbuff.h:2527 [inline] __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 dev_queue_xmit include/linux/netdevice.h:3033 [inline] __netlink_deliver_tap_skb net/netlink/af_netlink.c:307 [inline] __netlink_deliver_tap+0x45c/0x6f8 net/netlink/af_netlink.c:325 netlink_deliver_tap+0xf4/0x174 net/netlink/af_netlink.c:338 __netlink_sendskb net/netlink/af_netlink.c:1283 [inline] netlink_sendskb+0x6c/0x154 net/netlink/af_netlink.c:1292 netlink_unicast+0x334/0x8d4 net/netlink/af_netlink.c:1380 nlmsg_unicast include/net/netlink.h:1099 [inline] genlmsg_unicast include/net/genetlink.h:433 [inline] genlmsg_reply include/net/genetlink.h:443 [inline] ila_xlat_nl_cmd_get_mapping+0x620/0x7d0 net/ipv6/ila/ila_xlat.c:493 genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline] genl_rcv_msg+0x938/0xc1c net/netlink/genetlink.c:1065 netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2574 genl_rcv+0x38/0x50 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x800/0xae0 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x558/0x844 net/socket.c:2479 ___sys_sendmsg net/socket.c:2533 [inline] __sys_sendmsg+0x26c/0x33c net/socket.c:2562 __do_sys_sendmsg net/socket.c:2571 [inline] __se_sys_sendmsg net/socket.c:2569 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2569 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 irq event stamp: 136484 hardirqs last enabled at (136483): [<ffff800008350244>] __up_console_sem+0x60/0xb4 kernel/printk/printk.c:345 hardirqs last disabled at (136484): [<ffff800012358d60>] el1_dbg+0x24/0x80 arch/arm64/kernel/entry-common.c:405 softirqs last enabled at (136418): [<ffff800008020ea8>] softirq_ha ---truncated---
In the Linux kernel, the following vulnerability has been resolved: ice: xsk: disable txq irq before flushing hw ice_qp_dis() intends to stop a given queue pair that is a target of xsk pool attach/detach. One of the steps is to disable interrupts on these queues. It currently is broken in a way that txq irq is turned off *after* HW flush which in turn takes no effect. ice_qp_dis(): -> ice_qvec_dis_irq() --> disable rxq irq --> flush hw -> ice_vsi_stop_tx_ring() -->disable txq irq Below splat can be triggered by following steps: - start xdpsock WITHOUT loading xdp prog - run xdp_rxq_info with XDP_TX action on this interface - start traffic - terminate xdpsock [ 256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 256.319560] #PF: supervisor read access in kernel mode [ 256.324775] #PF: error_code(0x0000) - not-present page [ 256.329994] PGD 0 P4D 0 [ 256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G OE 6.2.0-rc5+ #51 [ 256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice] [ 256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44 [ 256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206 [ 256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f [ 256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80 [ 256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000 [ 256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000 [ 256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600 [ 256.421990] FS: 0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000 [ 256.430207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0 [ 256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 256.457770] PKRU: 55555554 [ 256.460529] Call Trace: [ 256.463015] <TASK> [ 256.465157] ? ice_xmit_zc+0x6e/0x150 [ice] [ 256.469437] ice_napi_poll+0x46d/0x680 [ice] [ 256.473815] ? _raw_spin_unlock_irqrestore+0x1b/0x40 [ 256.478863] __napi_poll+0x29/0x160 [ 256.482409] net_rx_action+0x136/0x260 [ 256.486222] __do_softirq+0xe8/0x2e5 [ 256.489853] ? smpboot_thread_fn+0x2c/0x270 [ 256.494108] run_ksoftirqd+0x2a/0x50 [ 256.497747] smpboot_thread_fn+0x1c1/0x270 [ 256.501907] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 256.506594] kthread+0xea/0x120 [ 256.509785] ? __pfx_kthread+0x10/0x10 [ 256.513597] ret_from_fork+0x29/0x50 [ 256.517238] </TASK> In fact, irqs were not disabled and napi managed to be scheduled and run while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers was already freed. To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also while at it, remove redundant ice_clean_rx_ring() call - this is handled in ice_qp_clean_rings().
In the Linux kernel, the following vulnerability has been resolved: wifi: rtw88: sdio: Honor the host max_req_size in the RX path Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes with an Amlogic A311D (G12B) SoC and a RTL8822CS SDIO wifi/Bluetooth combo card. The error he observed is identical to what has been fixed in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem. Lukas found that disabling or limiting RX aggregation works around the problem for some time (but does not fully fix it). In the following discussion a few key topics have been discussed which have an impact on this problem: - The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller which prevents DMA transfers. Instead all transfers need to go through the controller SRAM which limits transfers to 1536 bytes - rtw88 chips don't split incoming (RX) packets, so if a big packet is received this is forwarded to the host in it's original form - rtw88 chips can do RX aggregation, meaning more multiple incoming packets can be pulled by the host from the card with one MMC/SDIO transfer. This Depends on settings in the REG_RXDMA_AGG_PG_TH register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation and BIT_EN_PRE_CALC makes the chip honor the limits more effectively) Use multiple consecutive reads in rtw_sdio_read_port() and limit the number of bytes which are copied by the host from the card in one MMC/SDIO transfer. This allows receiving a buffer that's larger than the hosts max_req_size (number of bytes which can be transferred in one MMC/SDIO transfer). As a result of this the skb_over_panic error is gone as the rtw88 driver is now able to receive more than 1536 bytes from the card (either because the incoming packet is larger than that or because multiple packets have been aggregated). In case of an receive errors (-EILSEQ has been observed by Lukas) we need to drain the remaining data from the card's buffer, otherwise the card will return corrupt data for the next rtw_sdio_read_port() call.
In the Linux kernel, the following vulnerability has been resolved: net/sched: sch_taprio: properly cancel timer from taprio_destroy() There is a comment in qdisc_create() about us not calling ops->reset() in some cases. err_out4: /* * Any broken qdiscs that would require a ops->reset() here? * The qdisc was never in action so it shouldn't be necessary. */ As taprio sets a timer before actually receiving a packet, we need to cancel it from ops->destroy, just in case ops->reset has not been called. syzbot reported: ODEBUG: free active (active state 0) object type: hrtimer hint: advance_sched+0x0/0x9a0 arch/x86/include/asm/atomic64_64.h:22 WARNING: CPU: 0 PID: 8441 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Modules linked in: CPU: 0 PID: 8441 Comm: syz-executor813 Not tainted 5.14.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd e0 d3 e3 89 4c 89 ee 48 c7 c7 e0 c7 e3 89 e8 5b 86 11 05 <0f> 0b 83 05 85 03 92 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffc9000130f330 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff88802baeb880 RSI: ffffffff815d87b5 RDI: fffff52000261e58 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815d25ee R11: 0000000000000000 R12: ffffffff898dd020 R13: ffffffff89e3ce20 R14: ffffffff81653630 R15: dffffc0000000000 FS: 0000000000f0d300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffb64b3e000 CR3: 0000000036557000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __debug_check_no_obj_freed lib/debugobjects.c:987 [inline] debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1018 slab_free_hook mm/slub.c:1603 [inline] slab_free_freelist_hook+0x171/0x240 mm/slub.c:1653 slab_free mm/slub.c:3213 [inline] kfree+0xe4/0x540 mm/slub.c:4267 qdisc_create+0xbcf/0x1320 net/sched/sch_api.c:1299 tc_modify_qdisc+0x4c8/0x1a60 net/sched/sch_api.c:1663 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5571 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2403 ___sys_sendmsg+0xf3/0x170 net/socket.c:2457 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
In the Linux kernel, the following vulnerability has been resolved: watchdog: Fix kmemleak in watchdog_cdev_register kmemleak reports memory leaks in watchdog_dev_register, as follows: unreferenced object 0xffff888116233000 (size 2048): comm ""modprobe"", pid 28147, jiffies 4353426116 (age 61.741s) hex dump (first 32 bytes): 80 fa b9 05 81 88 ff ff 08 30 23 16 81 88 ff ff .........0#..... 08 30 23 16 81 88 ff ff 00 00 00 00 00 00 00 00 .0#............. backtrace: [<000000007f001ffd>] __kmem_cache_alloc_node+0x157/0x220 [<000000006a389304>] kmalloc_trace+0x21/0x110 [<000000008d640eea>] watchdog_dev_register+0x4e/0x780 [watchdog] [<0000000053c9f248>] __watchdog_register_device+0x4f0/0x680 [watchdog] [<00000000b2979824>] watchdog_register_device+0xd2/0x110 [watchdog] [<000000001f730178>] 0xffffffffc10880ae [<000000007a1a8bcc>] do_one_initcall+0xcb/0x4d0 [<00000000b98be325>] do_init_module+0x1ca/0x5f0 [<0000000046d08e7c>] load_module+0x6133/0x70f0 ... unreferenced object 0xffff888105b9fa80 (size 16): comm ""modprobe"", pid 28147, jiffies 4353426116 (age 61.741s) hex dump (first 16 bytes): 77 61 74 63 68 64 6f 67 31 00 b9 05 81 88 ff ff watchdog1....... backtrace: [<000000007f001ffd>] __kmem_cache_alloc_node+0x157/0x220 [<00000000486ab89b>] __kmalloc_node_track_caller+0x44/0x1b0 [<000000005a39aab0>] kvasprintf+0xb5/0x140 [<0000000024806f85>] kvasprintf_const+0x55/0x180 [<000000009276cb7f>] kobject_set_name_vargs+0x56/0x150 [<00000000a92e820b>] dev_set_name+0xab/0xe0 [<00000000cec812c6>] watchdog_dev_register+0x285/0x780 [watchdog] [<0000000053c9f248>] __watchdog_register_device+0x4f0/0x680 [watchdog] [<00000000b2979824>] watchdog_register_device+0xd2/0x110 [watchdog] [<000000001f730178>] 0xffffffffc10880ae [<000000007a1a8bcc>] do_one_initcall+0xcb/0x4d0 [<00000000b98be325>] do_init_module+0x1ca/0x5f0 [<0000000046d08e7c>] load_module+0x6133/0x70f0 ... The reason is that put_device is not be called if cdev_device_add fails and wdd->id != 0. watchdog_cdev_register wd_data = kzalloc [1] err = dev_set_name [2] .. err = cdev_device_add if (err) { if (wdd->id == 0) { // wdd->id != 0 .. } return err; // [1],[2] would be leaked To fix it, call put_device in all wdd->id cases.
An issue was discovered in fs/io_uring.c in the Linux kernel through 5.11.8. It allows attackers to cause a denial of service (deadlock) because exit may be waiting to park a SQPOLL thread, but concurrently that SQPOLL thread is waiting for a signal to start, aka CID-3ebba796fa25.
In the Linux kernel, the following vulnerability has been resolved: crypto: stm32/cryp - call finalize with bh disabled The finalize operation in interrupt mode produce a produces a spinlock recursion warning. The reason is the fact that BH must be disabled during this process.
In the Linux kernel, the following vulnerability has been resolved: bcachefs: grab s_umount only if snapshotting When I was testing mongodb over bcachefs with compression, there is a lockdep warning when snapshotting mongodb data volume. $ cat test.sh prog=bcachefs $prog subvolume create /mnt/data $prog subvolume create /mnt/data/snapshots while true;do $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s) sleep 1s done $ cat /etc/mongodb.conf systemLog: destination: file logAppend: true path: /mnt/data/mongod.log storage: dbPath: /mnt/data/ lockdep reports: [ 3437.452330] ====================================================== [ 3437.452750] WARNING: possible circular locking dependency detected [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E [ 3437.453562] ------------------------------------------------------ [ 3437.453981] bcachefs/35533 is trying to acquire lock: [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190 [ 3437.454875] but task is already holding lock: [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.456009] which lock already depends on the new lock. [ 3437.456553] the existing dependency chain (in reverse order) is: [ 3437.457054] -> #3 (&type->s_umount_key#48){.+.+}-{3:3}: [ 3437.457507] down_read+0x3e/0x170 [ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.458206] __x64_sys_ioctl+0x93/0xd0 [ 3437.458498] do_syscall_64+0x42/0xf0 [ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.459155] -> #2 (&c->snapshot_create_lock){++++}-{3:3}: [ 3437.459615] down_read+0x3e/0x170 [ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs] [ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs] [ 3437.460686] notify_change+0x1f1/0x4a0 [ 3437.461283] do_truncate+0x7f/0xd0 [ 3437.461555] path_openat+0xa57/0xce0 [ 3437.461836] do_filp_open+0xb4/0x160 [ 3437.462116] do_sys_openat2+0x91/0xc0 [ 3437.462402] __x64_sys_openat+0x53/0xa0 [ 3437.462701] do_syscall_64+0x42/0xf0 [ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.463359] -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}: [ 3437.463843] down_write+0x3b/0xc0 [ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs] [ 3437.464493] vfs_write+0x21b/0x4c0 [ 3437.464653] ksys_write+0x69/0xf0 [ 3437.464839] do_syscall_64+0x42/0xf0 [ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.465231] -> #0 (sb_writers#10){.+.+}-{0:0}: [ 3437.465471] __lock_acquire+0x1455/0x21b0 [ 3437.465656] lock_acquire+0xc6/0x2b0 [ 3437.465822] mnt_want_write+0x46/0x1a0 [ 3437.465996] filename_create+0x62/0x190 [ 3437.466175] user_path_create+0x2d/0x50 [ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.466617] __x64_sys_ioctl+0x93/0xd0 [ 3437.466791] do_syscall_64+0x42/0xf0 [ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.467180] other info that might help us debug this: [ 3437.469670] 2 locks held by bcachefs/35533: other info that might help us debug this: [ 3437.467507] Chain exists of: sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48 [ 3437.467979] Possible unsafe locking scenario: [ 3437.468223] CPU0 CPU1 [ 3437.468405] ---- ---- [ 3437.468585] rlock(&type->s_umount_key#48); [ 3437.468758] lock(&c->snapshot_create_lock); [ 3437.469030] lock(&type->s_umount_key#48); [ 3437.469291] rlock(sb_writers#10); [ 3437.469434] *** DEADLOCK *** [ 3437.469 ---truncated---
In the Linux kernel before 5.15.13, drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c misinterprets the mlx5_get_uars_page return value (expects it to be NULL in the error case, whereas it is actually an error pointer).
In the Linux kernel, the following vulnerability has been resolved: powerpc/lib: Validate size for vector operations Some of the fp/vmx code in sstep.c assume a certain maximum size for the instructions being emulated. The size of those operations however is determined separately in analyse_instr(). Add a check to validate the assumption on the maximum size of the operations, so as to prevent any unintended kernel stack corruption.
In the Linux kernel, the following vulnerability has been resolved: mptcp: never allow the PM to close a listener subflow Currently, when deleting an endpoint the netlink PM treverses all the local MPTCP sockets, regardless of their status. If an MPTCP listener socket is bound to the IP matching the delete endpoint, the listener TCP socket will be closed. That is unexpected, the PM should only affect data subflows. Additionally, syzbot was able to trigger a NULL ptr dereference due to the above: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 1 PID: 6550 Comm: syz-executor122 Not tainted 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__lock_acquire+0xd7d/0x54a0 kernel/locking/lockdep.c:4897 Code: 0f 0e 41 be 01 00 00 00 0f 86 c8 00 00 00 89 05 69 cc 0f 0e e9 bd 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <80> 3c 02 00 0f 85 f3 2f 00 00 48 81 3b 20 75 17 8f 0f 84 52 f3 ff RSP: 0018:ffffc90001f2f818 EFLAGS: 00010016 RAX: dffffc0000000000 RBX: 0000000000000018 RCX: 0000000000000000 RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 000000000000000a R12: 0000000000000000 R13: ffff88801b98d700 R14: 0000000000000000 R15: 0000000000000001 FS: 00007f177cd3d700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f177cd1b268 CR3: 000000001dd55000 CR4: 0000000000350ee0 Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5637 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5602 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:162 finish_wait+0xc0/0x270 kernel/sched/wait.c:400 inet_csk_wait_for_connect net/ipv4/inet_connection_sock.c:464 [inline] inet_csk_accept+0x7de/0x9d0 net/ipv4/inet_connection_sock.c:497 mptcp_accept+0xe5/0x500 net/mptcp/protocol.c:2865 inet_accept+0xe4/0x7b0 net/ipv4/af_inet.c:739 mptcp_stream_accept+0x2e7/0x10e0 net/mptcp/protocol.c:3345 do_accept+0x382/0x510 net/socket.c:1773 __sys_accept4_file+0x7e/0xe0 net/socket.c:1816 __sys_accept4+0xb0/0x100 net/socket.c:1846 __do_sys_accept net/socket.c:1864 [inline] __se_sys_accept net/socket.c:1861 [inline] __x64_sys_accept+0x71/0xb0 net/socket.c:1861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f177cd8b8e9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f177cd3d308 EFLAGS: 00000246 ORIG_RAX: 000000000000002b RAX: ffffffffffffffda RBX: 00007f177ce13408 RCX: 00007f177cd8b8e9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007f177ce13400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f177ce1340c R13: 00007f177cde1004 R14: 6d705f706374706d R15: 0000000000022000 </TASK> Fix the issue explicitly skipping MPTCP socket in TCP_LISTEN status.
In the Linux kernel, the following vulnerability has been resolved: IB/mlx5: Fix UMR pd cleanup on error flow of driver init The cited commit moves the pd allocation from function mlx5r_umr_resource_cleanup() to a new function mlx5r_umr_cleanup(). So the fix in commit [1] is broken. In error flow, will hit panic [2]. Fix it by checking pd pointer to avoid panic if it is NULL; [1] RDMA/mlx5: Fix UMR cleanup on error flow of driver init [2] [ 347.567063] infiniband mlx5_0: Couldn't register device with driver model [ 347.591382] BUG: kernel NULL pointer dereference, address: 0000000000000020 [ 347.593438] #PF: supervisor read access in kernel mode [ 347.595176] #PF: error_code(0x0000) - not-present page [ 347.596962] PGD 0 P4D 0 [ 347.601361] RIP: 0010:ib_dealloc_pd_user+0x12/0xc0 [ib_core] [ 347.604171] RSP: 0018:ffff888106293b10 EFLAGS: 00010282 [ 347.604834] RAX: 0000000000000000 RBX: 000000000000000e RCX: 0000000000000000 [ 347.605672] RDX: ffff888106293ad0 RSI: 0000000000000000 RDI: 0000000000000000 [ 347.606529] RBP: 0000000000000000 R08: ffff888106293ae0 R09: ffff888106293ae0 [ 347.607379] R10: 0000000000000a06 R11: 0000000000000000 R12: 0000000000000000 [ 347.608224] R13: ffffffffa0704dc0 R14: 0000000000000001 R15: 0000000000000001 [ 347.609067] FS: 00007fdc720cd9c0(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000 [ 347.610094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 347.610727] CR2: 0000000000000020 CR3: 0000000103012003 CR4: 0000000000370eb0 [ 347.611421] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 347.612113] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 347.612804] Call Trace: [ 347.613130] <TASK> [ 347.613417] ? __die+0x20/0x60 [ 347.613793] ? page_fault_oops+0x150/0x3e0 [ 347.614243] ? free_msg+0x68/0x80 [mlx5_core] [ 347.614840] ? cmd_exec+0x48f/0x11d0 [mlx5_core] [ 347.615359] ? exc_page_fault+0x74/0x130 [ 347.615808] ? asm_exc_page_fault+0x22/0x30 [ 347.616273] ? ib_dealloc_pd_user+0x12/0xc0 [ib_core] [ 347.616801] mlx5r_umr_cleanup+0x23/0x90 [mlx5_ib] [ 347.617365] mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x36/0x40 [mlx5_ib] [ 347.618025] __mlx5_ib_add+0x96/0xd0 [mlx5_ib] [ 347.618539] mlx5r_probe+0xe9/0x310 [mlx5_ib] [ 347.619032] ? kernfs_add_one+0x107/0x150 [ 347.619478] ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib] [ 347.619984] auxiliary_bus_probe+0x3e/0x90 [ 347.620448] really_probe+0xc5/0x3a0 [ 347.620857] __driver_probe_device+0x80/0x160 [ 347.621325] driver_probe_device+0x1e/0x90 [ 347.621770] __driver_attach+0xec/0x1c0 [ 347.622213] ? __device_attach_driver+0x100/0x100 [ 347.622724] bus_for_each_dev+0x71/0xc0 [ 347.623151] bus_add_driver+0xed/0x240 [ 347.623570] driver_register+0x58/0x100 [ 347.623998] __auxiliary_driver_register+0x6a/0xc0 [ 347.624499] ? driver_register+0xae/0x100 [ 347.624940] ? 0xffffffffa0893000 [ 347.625329] mlx5_ib_init+0x16a/0x1e0 [mlx5_ib] [ 347.625845] do_one_initcall+0x4a/0x2a0 [ 347.626273] ? gcov_event+0x2e2/0x3a0 [ 347.626706] do_init_module+0x8a/0x260 [ 347.627126] init_module_from_file+0x8b/0xd0 [ 347.627596] __x64_sys_finit_module+0x1ca/0x2f0 [ 347.628089] do_syscall_64+0x4c/0x100
In the Linux kernel, the following vulnerability has been resolved: RDMA/irdma: Fix potential NULL-ptr-dereference in_dev_get() can return NULL which will cause a failure once idev is dereferenced in in_dev_for_each_ifa_rtnl(). This patch adds a check for NULL value in idev beforehand. Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved: scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read() If kzalloc() fails in lpfc_sli4_cgn_params_read(), then we rely on lpfc_read_object()'s routine to NULL check pdata. Currently, an early return error is thrown from lpfc_read_object() to protect us from NULL ptr dereference, but the errno code is -ENODEV. Change the errno code to a more appropriate -ENOMEM.
In the Linux kernel, the following vulnerability has been resolved: dccp/tcp: Unhash sk from ehash for tb2 alloc failure after check_estalblished(). syzkaller reported a warning [0] in inet_csk_destroy_sock() with no repro. WARN_ON(inet_sk(sk)->inet_num && !inet_csk(sk)->icsk_bind_hash); However, the syzkaller's log hinted that connect() failed just before the warning due to FAULT_INJECTION. [1] When connect() is called for an unbound socket, we search for an available ephemeral port. If a bhash bucket exists for the port, we call __inet_check_established() or __inet6_check_established() to check if the bucket is reusable. If reusable, we add the socket into ehash and set inet_sk(sk)->inet_num. Later, we look up the corresponding bhash2 bucket and try to allocate it if it does not exist. Although it rarely occurs in real use, if the allocation fails, we must revert the changes by check_established(). Otherwise, an unconnected socket could illegally occupy an ehash entry. Note that we do not put tw back into ehash because sk might have already responded to a packet for tw and it would be better to free tw earlier under such memory presure. [0]: WARNING: CPU: 0 PID: 350830 at net/ipv4/inet_connection_sock.c:1193 inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193) Modules linked in: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193) Code: 41 5c 41 5d 41 5e e9 2d 4a 3d fd e8 28 4a 3d fd 48 89 ef e8 f0 cd 7d ff 5b 5d 41 5c 41 5d 41 5e e9 13 4a 3d fd e8 0e 4a 3d fd <0f> 0b e9 61 fe ff ff e8 02 4a 3d fd 4c 89 e7 be 03 00 00 00 e8 05 RSP: 0018:ffffc9000b21fd38 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000009e78 RCX: ffffffff840bae40 RDX: ffff88806e46c600 RSI: ffffffff840bb012 RDI: ffff88811755cca8 RBP: ffff88811755c880 R08: 0000000000000003 R09: 0000000000000000 R10: 0000000000009e78 R11: 0000000000000000 R12: ffff88811755c8e0 R13: ffff88811755c892 R14: ffff88811755c918 R15: 0000000000000000 FS: 00007f03e5243800(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32f21000 CR3: 0000000112ffe001 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> ? inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193) dccp_close (net/dccp/proto.c:1078) inet_release (net/ipv4/af_inet.c:434) __sock_release (net/socket.c:660) sock_close (net/socket.c:1423) __fput (fs/file_table.c:377) __fput_sync (fs/file_table.c:462) __x64_sys_close (fs/open.c:1557 fs/open.c:1539 fs/open.c:1539) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) RIP: 0033:0x7f03e53852bb Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 43 c9 f5 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 a1 c9 f5 ff 8b 44 RSP: 002b:00000000005dfba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f03e53852bb RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000167c R10: 0000000008a79680 R11: 0000000000000293 R12: 00007f03e4e43000 R13: 00007f03e4e43170 R14: 00007f03e4e43178 R15: 00007f03e4e43170 </TASK> [1]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 0 PID: 350833 Comm: syz-executor.1 Not tainted 6.7.0-12272-g2121c43f88f5 #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153) should_failslab (mm/slub.c:3748) kmem_cache_alloc (mm/slub.c:3763 mm/slub.c:3842 mm/slub.c:3867) inet_bind2_bucket_create ---truncated---
The Linux kernel before 2.6.39 does not properly create transparent huge pages in response to a MAP_PRIVATE mmap system call on /dev/zero, which allows local users to cause a denial of service (system crash) via a crafted application.
In the Linux kernel, the following vulnerability has been resolved: llc: verify mac len before reading mac header LLC reads the mac header with eth_hdr without verifying that the skb has an Ethernet header. Syzbot was able to enter llc_rcv on a tun device. Tun can insert packets without mac len and with user configurable skb->protocol (passing a tun_pi header when not configuring IFF_NO_PI). BUG: KMSAN: uninit-value in llc_station_ac_send_test_r net/llc/llc_station.c:81 [inline] BUG: KMSAN: uninit-value in llc_station_rcv+0x6fb/0x1290 net/llc/llc_station.c:111 llc_station_ac_send_test_r net/llc/llc_station.c:81 [inline] llc_station_rcv+0x6fb/0x1290 net/llc/llc_station.c:111 llc_rcv+0xc5d/0x14a0 net/llc/llc_input.c:218 __netif_receive_skb_one_core net/core/dev.c:5523 [inline] __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5637 netif_receive_skb_internal net/core/dev.c:5723 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5782 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555 tun_get_user+0x54c5/0x69c0 drivers/net/tun.c:2002 Add a mac_len test before all three eth_hdr(skb) calls under net/llc. There are further uses in include/net/llc_pdu.h. All these are protected by a test skb->protocol == ETH_P_802_2. Which does not protect against this tun scenario. But the mac_len test added in this patch in llc_fixup_skb will indirectly protect those too. That is called from llc_rcv before any other LLC code. It is tempting to just add a blanket mac_len check in llc_rcv, but not sure whether that could break valid LLC paths that do not assume an Ethernet header. 802.2 LLC may be used on top of non-802.3 protocols in principle. The below referenced commit shows that used to, on top of Token Ring. At least one of the three eth_hdr uses goes back to before the start of git history. But the one that syzbot exercises is introduced in this commit. That commit is old enough (2008), that effectively all stable kernels should receive this.
In the Linux kernel, the following vulnerability has been resolved: x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL Baoquan reported that after triggering a crash the subsequent crash-kernel fails to boot about half of the time. It triggers a NULL pointer dereference in the periodic tick code. This happens because the legacy timer interrupt (IRQ0) is resent in software which happens in soft interrupt (tasklet) context. In this context get_irq_regs() returns NULL which leads to the NULL pointer dereference. The reason for the resend is a spurious APIC interrupt on the IRQ0 vector which is captured and leads to a resend when the legacy timer interrupt is enabled. This is wrong because the legacy PIC interrupts are level triggered and therefore should never be resent in software, but nothing ever sets the IRQ_LEVEL flag on those interrupts, so the core code does not know about their trigger type. Ensure that IRQ_LEVEL is set when the legacy PCI interrupts are set up.
In the Linux kernel, the following vulnerability has been resolved: blk-cgroup: Reinit blkg_iostat_set after clearing in blkcg_reset_stats() When blkg_alloc() is called to allocate a blkcg_gq structure with the associated blkg_iostat_set's, there are 2 fields within blkg_iostat_set that requires proper initialization - blkg & sync. The former field was introduced by commit 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") while the later one was introduced by commit f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat"). Unfortunately those fields in the blkg_iostat_set's are not properly re-initialized when they are cleared in v1's blkcg_reset_stats(). This can lead to a kernel panic due to NULL pointer access of the blkg pointer. The missing initialization of sync is less problematic and can be a problem in a debug kernel due to missing lockdep initialization. Fix these problems by re-initializing them after memory clearing.