In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove() Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds to rest of destroy operations. mlx5_core_destroy_cq() could be called again by user and cause additional call of mlx5_debug_cq_remove(). cq->dbg was not nullify in previous call and cause the crash. Fix it by nullify cq->dbg pointer after removal. Also proceed to destroy operations only if FW return 0 for MLX5_CMD_OP_DESTROY_CQ command. general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockref_get+0x1/0x60 Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02 00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17 48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48 RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058 RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000 R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0 FS: 00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0 Call Trace: simple_recursive_removal+0x33/0x2e0 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 mlx5_debug_cq_remove+0x32/0x70 [mlx5_core] mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core] devx_obj_cleanup+0x151/0x330 [mlx5_ib] ? __pollwait+0xd0/0xd0 ? xas_load+0x5/0x70 ? xa_load+0x62/0xa0 destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs] uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs] uobj_destroy+0x54/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs] ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs] ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs] __x64_sys_ioctl+0x3e4/0x8e0
In the Linux kernel, the following vulnerability has been resolved: scsi: hisi_sas: Create all dump files during debugfs initialization For the current debugfs of hisi_sas, after user triggers dump, the driver allocate memory space to save the register information and create debugfs files to display the saved information. In this process, the debugfs files created after each dump. Therefore, when the dump is triggered while the driver is unbind, the following hang occurs: [67840.853907] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [67840.862947] Mem abort info: [67840.865855] ESR = 0x0000000096000004 [67840.869713] EC = 0x25: DABT (current EL), IL = 32 bits [67840.875125] SET = 0, FnV = 0 [67840.878291] EA = 0, S1PTW = 0 [67840.881545] FSC = 0x04: level 0 translation fault [67840.886528] Data abort info: [67840.889524] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [67840.895117] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [67840.900284] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [67840.905709] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002803a1f000 [67840.912263] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000 [67840.919177] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [67840.996435] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [67841.003628] pc : down_write+0x30/0x98 [67841.007546] lr : start_creating.part.0+0x60/0x198 [67841.012495] sp : ffff8000b979ba20 [67841.016046] x29: ffff8000b979ba20 x28: 0000000000000010 x27: 0000000000024b40 [67841.023412] x26: 0000000000000012 x25: ffff20202b355ae8 x24: ffff20202b35a8c8 [67841.030779] x23: ffffa36877928208 x22: ffffa368b4972240 x21: ffff8000b979bb18 [67841.038147] x20: ffff00281dc1e3c0 x19: fffffffffffffffe x18: 0000000000000020 [67841.045515] x17: 0000000000000000 x16: ffffa368b128a530 x15: ffffffffffffffff [67841.052888] x14: ffff8000b979bc18 x13: ffffffffffffffff x12: ffff8000b979bb18 [67841.060263] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa368b1289b18 [67841.067640] x8 : 0000000000000012 x7 : 0000000000000000 x6 : 00000000000003a9 [67841.075014] x5 : 0000000000000000 x4 : ffff002818c5cb00 x3 : 0000000000000001 [67841.082388] x2 : 0000000000000000 x1 : ffff002818c5cb00 x0 : 00000000000000a0 [67841.089759] Call trace: [67841.092456] down_write+0x30/0x98 [67841.096017] start_creating.part.0+0x60/0x198 [67841.100613] debugfs_create_dir+0x48/0x1f8 [67841.104950] debugfs_create_files_v3_hw+0x88/0x348 [hisi_sas_v3_hw] [67841.111447] debugfs_snapshot_regs_v3_hw+0x708/0x798 [hisi_sas_v3_hw] [67841.118111] debugfs_trigger_dump_v3_hw_write+0x9c/0x120 [hisi_sas_v3_hw] [67841.125115] full_proxy_write+0x68/0xc8 [67841.129175] vfs_write+0xd8/0x3f0 [67841.132708] ksys_write+0x70/0x108 [67841.136317] __arm64_sys_write+0x24/0x38 [67841.140440] invoke_syscall+0x50/0x128 [67841.144385] el0_svc_common.constprop.0+0xc8/0xf0 [67841.149273] do_el0_svc+0x24/0x38 [67841.152773] el0_svc+0x38/0xd8 [67841.156009] el0t_64_sync_handler+0xc0/0xc8 [67841.160361] el0t_64_sync+0x1a4/0x1a8 [67841.164189] Code: b9000882 d2800002 d2800023 f9800011 (c85ffc05) [67841.170443] ---[ end trace 0000000000000000 ]--- To fix this issue, create all directories and files during debugfs initialization. In this way, the driver only needs to allocate memory space to save information each time the user triggers dumping.
In the Linux kernel, the following vulnerability has been resolved: scsi: core: Fix bad pointer dereference when ehandler kthread is invalid Commit 66a834d09293 ("scsi: core: Fix error handling of scsi_host_alloc()") changed the allocation logic to call put_device() to perform host cleanup with the assumption that IDA removal and stopping the kthread would properly be performed in scsi_host_dev_release(). However, in the unlikely case that the error handler thread fails to spawn, shost->ehandler is set to ERR_PTR(-ENOMEM). The error handler cleanup code in scsi_host_dev_release() will call kthread_stop() if shost->ehandler != NULL which will always be the case whether the kthread was successfully spawned or not. In the case that it failed to spawn this has the nasty side effect of trying to dereference an invalid pointer when kthread_stop() is called. The following splat provides an example of this behavior in the wild: scsi host11: error handler thread failed to spawn, error = -4 Kernel attempted to read user page (10c) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x0000010c Faulting instruction address: 0xc00000000818e9a8 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvscsi(+) scsi_transport_srp dm_multipath dm_mirror dm_region hash dm_log dm_mod fuse overlay squashfs loop CPU: 12 PID: 274 Comm: systemd-udevd Not tainted 5.13.0-rc7 #1 NIP: c00000000818e9a8 LR: c0000000089846e8 CTR: 0000000000007ee8 REGS: c000000037d12ea0 TRAP: 0300 Not tainted (5.13.0-rc7) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28228228 XER: 20040001 CFAR: c0000000089846e4 DAR: 000000000000010c DSISR: 40000000 IRQMASK: 0 GPR00: c0000000089846e8 c000000037d13140 c000000009cc1100 fffffffffffffffc GPR04: 0000000000000001 0000000000000000 0000000000000000 c000000037dc0000 GPR08: 0000000000000000 c000000037dc0000 0000000000000001 00000000fffff7ff GPR12: 0000000000008000 c00000000a049000 c000000037d13d00 000000011134d5a0 GPR16: 0000000000001740 c0080000190d0000 c0080000190d1740 c000000009129288 GPR20: c000000037d13bc0 0000000000000001 c000000037d13bc0 c0080000190b7898 GPR24: c0080000190b7708 0000000000000000 c000000033bb2c48 0000000000000000 GPR28: c000000046b28280 0000000000000000 000000000000010c fffffffffffffffc NIP [c00000000818e9a8] kthread_stop+0x38/0x230 LR [c0000000089846e8] scsi_host_dev_release+0x98/0x160 Call Trace: [c000000033bb2c48] 0xc000000033bb2c48 (unreliable) [c0000000089846e8] scsi_host_dev_release+0x98/0x160 [c00000000891e960] device_release+0x60/0x100 [c0000000087e55c4] kobject_release+0x84/0x210 [c00000000891ec78] put_device+0x28/0x40 [c000000008984ea4] scsi_host_alloc+0x314/0x430 [c0080000190b38bc] ibmvscsi_probe+0x54/0xad0 [ibmvscsi] [c000000008110104] vio_bus_probe+0xa4/0x4b0 [c00000000892a860] really_probe+0x140/0x680 [c00000000892aefc] driver_probe_device+0x15c/0x200 [c00000000892b63c] device_driver_attach+0xcc/0xe0 [c00000000892b740] __driver_attach+0xf0/0x200 [c000000008926f28] bus_for_each_dev+0xa8/0x130 [c000000008929ce4] driver_attach+0x34/0x50 [c000000008928fc0] bus_add_driver+0x1b0/0x300 [c00000000892c798] driver_register+0x98/0x1a0 [c00000000810eb60] __vio_register_driver+0x80/0xe0 [c0080000190b4a30] ibmvscsi_module_init+0x9c/0xdc [ibmvscsi] [c0000000080121d0] do_one_initcall+0x60/0x2d0 [c000000008261abc] do_init_module+0x7c/0x320 [c000000008265700] load_module+0x2350/0x25b0 [c000000008265cb4] __do_sys_finit_module+0xd4/0x160 [c000000008031110] system_call_exception+0x150/0x2d0 [c00000000800d35c] system_call_common+0xec/0x278 Fix this be nulling shost->ehandler when the kthread fails to spawn.
In the Linux kernel, the following vulnerability has been resolved: jfs: fix GPF in diFree Avoid passing inode with JFS_SBI(inode->i_sb)->ipimap == NULL to diFree()[1]. GFP will appear: struct inode *ipimap = JFS_SBI(ip->i_sb)->ipimap; struct inomap *imap = JFS_IP(ipimap)->i_imap; JFS_IP() will return invalid pointer when ipimap == NULL Call Trace: diFree+0x13d/0x2dc0 fs/jfs/jfs_imap.c:853 [1] jfs_evict_inode+0x2c9/0x370 fs/jfs/inode.c:154 evict+0x2ed/0x750 fs/inode.c:578 iput_final fs/inode.c:1654 [inline] iput.part.0+0x3fe/0x820 fs/inode.c:1680 iput+0x58/0x70 fs/inode.c:1670
In the Linux kernel, the following vulnerability has been resolved: scsi: core: Put LLD module refcnt after SCSI device is released SCSI host release is triggered when SCSI device is freed. We have to make sure that the low-level device driver module won't be unloaded before SCSI host instance is released because shost->hostt is required in the release handler. Make sure to put LLD module refcnt after SCSI device is released. Fixes a kernel panic of 'BUG: unable to handle page fault for address' reported by Changhui and Yi.
An issue was discovered in drivers/media/dvb-core/dvb_frontend.c in the Linux kernel 6.2. There is a blocking operation when a task is in !TASK_RUNNING. In dvb_frontend_get_event, wait_event_interruptible is called; the condition is dvb_frontend_test_event(fepriv,events). In dvb_frontend_test_event, down(&fepriv->sem) is called. However, wait_event_interruptible would put the process to sleep, and down(&fepriv->sem) may block the process.
In the Linux kernel, the following vulnerability has been resolved: KVM: SVM: Use online_vcpus, not created_vcpus, to iterate over vCPUs Use the kvm_for_each_vcpu() helper to iterate over vCPUs when encrypting VMSAs for SEV, which effectively switches to use online_vcpus instead of created_vcpus. This fixes a possible null-pointer dereference as created_vcpus does not guarantee a vCPU exists, since it is updated at the very beginning of KVM_CREATE_VCPU. created_vcpus exists to allow the bulk of vCPU creation to run in parallel, while still correctly restricting the max number of max vCPUs.
NVIDIA GPU Display Driver for Windows and Linux contains a vulnerability in the kernel mode layer, where a NULL-pointer dereference may lead to denial of service.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix re-dirty process of tree-log nodes There is a report of a transaction abort of -EAGAIN with the following script. #!/bin/sh for d in sda sdb; do mkfs.btrfs -d single -m single -f /dev/\${d} done mount /dev/sda /mnt/test mount /dev/sdb /mnt/scratch for dir in test scratch; do echo 3 >/proc/sys/vm/drop_caches fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \ --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \ --group_reporting |& tee /dev/shm/fio.\${dir} echo 3 >/proc/sys/vm/drop_caches done for d in sda sdb; do umount /dev/\${d} done The stack trace is shown in below. [3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction) [3310.968060] BTRFS info (device sda): forced readonly [3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction. [3310.968065] ------------[ cut here ]------------ [3310.968066] BTRFS: Transaction aborted (error -11) [3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8 [3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1 [3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021 [3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8 [3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282 [3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027 [3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00 [3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48 [3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00 [3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58 [3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000 [3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0 [3310.968160] PKRU: 55555554 [3310.968161] Call Trace: [3310.968167] ? dput+0xd4/0x300 [3310.968174] btrfs_sync_file+0x3f1/0x490 [3310.968180] __x64_sys_fsync+0x33/0x60 [3310.968185] do_syscall_64+0x3b/0x90 [3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae [3310.968194] RIP: 0033:0x7efe6557329b [3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b [3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006 [3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010 [3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980 [3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000 [3310.968212] ---[ end trace 1a346f4d3c0d96ba ]--- [3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN. Fix the issue by properly re-dirtying a node on walking up the tree.
In the Linux kernel, the following vulnerability has been resolved: net: lapb: increase LAPB_HEADER_LEN It is unclear if net/lapb code is supposed to be ready for 8021q. We can at least avoid crashes like the following : skbuff: skb_under_panic: text:ffffffff8aabe1f6 len:24 put:20 head:ffff88802824a400 data:ffff88802824a3fe tail:0x16 end:0x140 dev:nr0.2 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5508 Comm: dhcpcd Not tainted 6.12.0-rc7-syzkaller-00144-g66418447d27b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 2e 9e 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 1a 6f 37 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc90002ddf638 EFLAGS: 00010282 RAX: 0000000000000086 RBX: dffffc0000000000 RCX: 7a24750e538ff600 RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000 RBP: ffff888034a86650 R08: ffffffff8174b13c R09: 1ffff920005bbe60 R10: dffffc0000000000 R11: fffff520005bbe61 R12: 0000000000000140 R13: ffff88802824a400 R14: ffff88802824a3fe R15: 0000000000000016 FS: 00007f2a5990d740(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c2631fd CR3: 0000000029504000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 nr_header+0x36/0x320 net/netrom/nr_dev.c:69 dev_hard_header include/linux/netdevice.h:3148 [inline] vlan_dev_hard_header+0x359/0x480 net/8021q/vlan_dev.c:83 dev_hard_header include/linux/netdevice.h:3148 [inline] lapbeth_data_transmit+0x1f6/0x2a0 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x91/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x168/0x1f0 net/lapb/lapb_out.c:149 lapb_establish_data_link+0x84/0xd0 lapb_device_event+0x4e0/0x670 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93 __dev_notify_flags+0x207/0x400 dev_change_flags+0xf0/0x1a0 net/core/dev.c:8922 devinet_ioctl+0xa4e/0x1aa0 net/ipv4/devinet.c:1188 inet_ioctl+0x3d7/0x4f0 net/ipv4/af_inet.c:1003 sock_do_ioctl+0x158/0x460 net/socket.c:1227 sock_ioctl+0x626/0x8e0 net/socket.c:1346 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
An issue was discovered in drivers/mtd/ubi/cdev.c in the Linux kernel 6.2. There is a divide-by-zero error in do_div(sz,mtd->erasesize), used indirectly by ctrl_cdev_ioctl, when mtd->erasesize is 0.
In the Linux kernel, the following vulnerability has been resolved: mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize() If migration succeeded, we called folio_migrate_flags()->mem_cgroup_migrate() to migrate the memcg from the old to the new folio. This will set memcg_data of the old folio to 0. Similarly, if migration failed, memcg_data of the dst folio is left unset. If we call folio_putback_lru() on such folios (memcg_data == 0), we will add the folio to be freed to the LRU, making memcg code unhappy. Running the hmm selftests: # ./hmm-tests ... # RUN hmm.hmm_device_private.migrate ... [ 102.078007][T14893] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff27d200 pfn:0x13cc00 [ 102.079974][T14893] anon flags: 0x17ff00000020018(uptodate|dirty|swapbacked|node=0|zone=2|lastcpupid=0x7ff) [ 102.082037][T14893] raw: 017ff00000020018 dead000000000100 dead000000000122 ffff8881353896c9 [ 102.083687][T14893] raw: 00000007ff27d200 0000000000000000 00000001ffffffff 0000000000000000 [ 102.085331][T14893] page dumped because: VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled()) [ 102.087230][T14893] ------------[ cut here ]------------ [ 102.088279][T14893] WARNING: CPU: 0 PID: 14893 at ./include/linux/memcontrol.h:726 folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.090478][T14893] Modules linked in: [ 102.091244][T14893] CPU: 0 UID: 0 PID: 14893 Comm: hmm-tests Not tainted 6.13.0-09623-g6c216bc522fd #151 [ 102.093089][T14893] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 [ 102.094848][T14893] RIP: 0010:folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.096104][T14893] Code: ... [ 102.099908][T14893] RSP: 0018:ffffc900236c37b0 EFLAGS: 00010293 [ 102.101152][T14893] RAX: 0000000000000000 RBX: ffffea0004f30000 RCX: ffffffff8183f426 [ 102.102684][T14893] RDX: ffff8881063cb880 RSI: ffffffff81b8117f RDI: ffff8881063cb880 [ 102.104227][T14893] RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 [ 102.105757][T14893] R10: 0000000000000001 R11: 0000000000000002 R12: ffffc900236c37d8 [ 102.107296][T14893] R13: ffff888277a2bcb0 R14: 000000000000001f R15: 0000000000000000 [ 102.108830][T14893] FS: 00007ff27dbdd740(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000 [ 102.110643][T14893] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 102.111924][T14893] CR2: 00007ff27d400000 CR3: 000000010866e000 CR4: 0000000000750ef0 [ 102.113478][T14893] PKRU: 55555554 [ 102.114172][T14893] Call Trace: [ 102.114805][T14893] <TASK> [ 102.115397][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.116547][T14893] ? __warn.cold+0x110/0x210 [ 102.117461][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.118667][T14893] ? report_bug+0x1b9/0x320 [ 102.119571][T14893] ? handle_bug+0x54/0x90 [ 102.120494][T14893] ? exc_invalid_op+0x17/0x50 [ 102.121433][T14893] ? asm_exc_invalid_op+0x1a/0x20 [ 102.122435][T14893] ? __wake_up_klogd.part.0+0x76/0xd0 [ 102.123506][T14893] ? dump_page+0x4f/0x60 [ 102.124352][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.125500][T14893] folio_batch_move_lru+0xd4/0x200 [ 102.126577][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.127505][T14893] __folio_batch_add_and_move+0x391/0x720 [ 102.128633][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.129550][T14893] folio_putback_lru+0x16/0x80 [ 102.130564][T14893] migrate_device_finalize+0x9b/0x530 [ 102.131640][T14893] dmirror_migrate_to_device.constprop.0+0x7c5/0xad0 [ 102.133047][T14893] dmirror_fops_unlocked_ioctl+0x89b/0xc80 Likely, nothing else goes wrong: putting the last folio reference will remove the folio from the LRU again. So besides memcg complaining, adding the folio to be freed to the LRU is just an unnecessary step. The new flow resembles what we have in migrate_folio_move(): add the dst to the lru, rem ---truncated---
An issue was discovered in set_con2fb_map in drivers/video/fbdev/core/fbcon.c in the Linux kernel before 6.2.12. Because an assignment occurs only for the first vc, the fbcon_registered_fb and fbcon_display arrays can be desynchronized in fbcon_mode_deleted (the con2fb_map points at the old fb_info).
A flaw was found in the Linux kernel. A null pointer dereference in bond_ipsec_add_sa() may lead to local denial of service.
A flaw was found in the Framebuffer Console (fbcon) in the Linux Kernel. When providing font->width and font->height greater than 32 to fbcon_set_font, since there are no checks in place, a shift-out-of-bounds occurs leading to undefined behavior and possible denial of service.
NVIDIA GPU Driver for Windows and Linux contains a vulnerability in the kernel mode layer, where an unprivileged regular user can cause a NULL-pointer dereference, which may lead to denial of service.
A NULL pointer dereference flaw was found in the Linux kernel's BPF subsystem in the way a user triggers the map_get_next_key function of the BPF bloom filter. This flaw allows a local user to crash the system. This flaw affects Linux kernel versions prior to 5.17-rc1.
The __do_follow_link function in fs/namei.c in the Linux kernel before 2.6.33 does not properly handle the last pathname component during use of certain filesystems, which allows local users to cause a denial of service (incorrect free operations and system crash) via an open system call.
A flaw was found in the Linux kernel. The existing KVM SEV API has a vulnerability that allows a non-root (host) user-level application to crash the host kernel by creating a confidential guest VM instance in AMD CPU that supports Secure Encrypted Virtualization (SEV).
In the Linux kernel, the following vulnerability has been resolved: wifi: brcmfmac: Fix oops due to NULL pointer dereference in brcmf_sdiod_sglist_rw() This patch fixes a NULL pointer dereference bug in brcmfmac that occurs when a high 'sd_sgentry_align' value applies (e.g. 512) and a lot of queued SKBs are sent from the pkt queue. The problem is the number of entries in the pre-allocated sgtable, it is nents = max(rxglom_size, txglom_size) + max(rxglom_size, txglom_size) >> 4 + 1. Given the default [rt]xglom_size=32 it's actually 35 which is too small. Worst case, the pkt queue can end up with 64 SKBs. This occurs when a new SKB is added for each original SKB if tailroom isn't enough to hold tail_pad. At least one sg entry is needed for each SKB. So, eventually the "skb_queue_walk loop" in brcmf_sdiod_sglist_rw may run out of sg entries. This makes sg_next return NULL and this causes the oops. The patch sets nents to max(rxglom_size, txglom_size) * 2 to be able handle the worst-case. Btw. this requires only 64-35=29 * 16 (or 20 if CONFIG_NEED_SG_DMA_LENGTH) = 464 additional bytes of memory.
A flaw was found in the Linux kernel's implementation of GRO in versions before 5.2. This flaw allows an attacker with local access to crash the system.
In the Linux kernel, the following vulnerability has been resolved: rtnetlink: Allocate vfinfo size for VF GUIDs when supported Commit 30aad41721e0 ("net/core: Add support for getting VF GUIDs") added support for getting VF port and node GUIDs in netlink ifinfo messages, but their size was not taken into consideration in the function that allocates the netlink message, causing the following warning when a netlink message is filled with many VF port and node GUIDs: # echo 64 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs # ip link show dev ib0 RTNETLINK answers: Message too long Cannot send link get request: Message too long Kernel warning: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1930 at net/core/rtnetlink.c:4151 rtnl_getlink+0x586/0x5a0 Modules linked in: xt_conntrack xt_MASQUERADE nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay mlx5_ib macsec mlx5_core tls rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm iw_cm ib_ipoib fuse ib_cm ib_core CPU: 2 UID: 0 PID: 1930 Comm: ip Not tainted 6.14.0-rc2+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:rtnl_getlink+0x586/0x5a0 Code: cb 82 e8 3d af 0a 00 4d 85 ff 0f 84 08 ff ff ff 4c 89 ff 41 be ea ff ff ff e8 66 63 5b ff 49 c7 07 80 4f cb 82 e9 36 fc ff ff <0f> 0b e9 16 fe ff ff e8 de a0 56 00 66 66 2e 0f 1f 84 00 00 00 00 RSP: 0018:ffff888113557348 EFLAGS: 00010246 RAX: 00000000ffffffa6 RBX: ffff88817e87aa34 RCX: dffffc0000000000 RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffff88817e87afb8 RBP: 0000000000000009 R08: ffffffff821f44aa R09: 0000000000000000 R10: ffff8881260f79a8 R11: ffff88817e87af00 R12: ffff88817e87aa00 R13: ffffffff8563d300 R14: 00000000ffffffa6 R15: 00000000ffffffff FS: 00007f63a5dbf280(0000) GS:ffff88881ee00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f63a5ba4493 CR3: 00000001700fe002 CR4: 0000000000772eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0xa5/0x230 ? rtnl_getlink+0x586/0x5a0 ? report_bug+0x22d/0x240 ? handle_bug+0x53/0xa0 ? exc_invalid_op+0x14/0x50 ? asm_exc_invalid_op+0x16/0x20 ? skb_trim+0x6a/0x80 ? rtnl_getlink+0x586/0x5a0 ? __pfx_rtnl_getlink+0x10/0x10 ? rtnetlink_rcv_msg+0x1e5/0x860 ? __pfx___mutex_lock+0x10/0x10 ? rcu_is_watching+0x34/0x60 ? __pfx_lock_acquire+0x10/0x10 ? stack_trace_save+0x90/0xd0 ? filter_irq_stacks+0x1d/0x70 ? kasan_save_stack+0x30/0x40 ? kasan_save_stack+0x20/0x40 ? kasan_save_track+0x10/0x30 rtnetlink_rcv_msg+0x21c/0x860 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? __pfx_rtnetlink_rcv_msg+0x10/0x10 ? arch_stack_walk+0x9e/0xf0 ? rcu_is_watching+0x34/0x60 ? lock_acquire+0xd5/0x410 ? rcu_is_watching+0x34/0x60 netlink_rcv_skb+0xe0/0x210 ? __pfx_rtnetlink_rcv_msg+0x10/0x10 ? __pfx_netlink_rcv_skb+0x10/0x10 ? rcu_is_watching+0x34/0x60 ? __pfx___netlink_lookup+0x10/0x10 ? lock_release+0x62/0x200 ? netlink_deliver_tap+0xfd/0x290 ? rcu_is_watching+0x34/0x60 ? lock_release+0x62/0x200 ? netlink_deliver_tap+0x95/0x290 netlink_unicast+0x31f/0x480 ? __pfx_netlink_unicast+0x10/0x10 ? rcu_is_watching+0x34/0x60 ? lock_acquire+0xd5/0x410 netlink_sendmsg+0x369/0x660 ? lock_release+0x62/0x200 ? __pfx_netlink_sendmsg+0x10/0x10 ? import_ubuf+0xb9/0xf0 ? __import_iovec+0x254/0x2b0 ? lock_release+0x62/0x200 ? __pfx_netlink_sendmsg+0x10/0x10 ____sys_sendmsg+0x559/0x5a0 ? __pfx_____sys_sendmsg+0x10/0x10 ? __pfx_copy_msghdr_from_user+0x10/0x10 ? rcu_is_watching+0x34/0x60 ? do_read_fault+0x213/0x4a0 ? rcu_is_watching+0x34/0x60 ___sys_sendmsg+0xe4/0x150 ? __pfx____sys_sendmsg+0x10/0x10 ? do_fault+0x2cc/0x6f0 ? handle_pte_fault+0x2e3/0x3d0 ? __pfx_handle_pte_fault+0x10/0x10 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix memleak in io_init_wq_offload() I got memory leak report when doing fuzz test: BUG: memory leak unreferenced object 0xffff888107310a80 (size 96): comm "syz-executor.6", pid 4610, jiffies 4295140240 (age 20.135s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... backtrace: [<000000001974933b>] kmalloc include/linux/slab.h:591 [inline] [<000000001974933b>] kzalloc include/linux/slab.h:721 [inline] [<000000001974933b>] io_init_wq_offload fs/io_uring.c:7920 [inline] [<000000001974933b>] io_uring_alloc_task_context+0x466/0x640 fs/io_uring.c:7955 [<0000000039d0800d>] __io_uring_add_tctx_node+0x256/0x360 fs/io_uring.c:9016 [<000000008482e78c>] io_uring_add_tctx_node fs/io_uring.c:9052 [inline] [<000000008482e78c>] __do_sys_io_uring_enter fs/io_uring.c:9354 [inline] [<000000008482e78c>] __se_sys_io_uring_enter fs/io_uring.c:9301 [inline] [<000000008482e78c>] __x64_sys_io_uring_enter+0xabc/0xc20 fs/io_uring.c:9301 [<00000000b875f18f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000b875f18f>] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 [<000000006b0a8484>] entry_SYSCALL_64_after_hwframe+0x44/0xae CPU0 CPU1 io_uring_enter io_uring_enter io_uring_add_tctx_node io_uring_add_tctx_node __io_uring_add_tctx_node __io_uring_add_tctx_node io_uring_alloc_task_context io_uring_alloc_task_context io_init_wq_offload io_init_wq_offload hash = kzalloc hash = kzalloc ctx->hash_map = hash ctx->hash_map = hash <- one of the hash is leaked When calling io_uring_enter() in parallel, the 'hash_map' will be leaked, add uring_lock to protect 'hash_map'.
In the Linux kernel, the following vulnerability has been resolved: xhci: Fix command ring pointer corruption while aborting a command The command ring pointer is located at [6:63] bits of the command ring control register (CRCR). All the control bits like command stop, abort are located at [0:3] bits. While aborting a command, we read the CRCR and set the abort bit and write to the CRCR. The read will always give command ring pointer as all zeros. So we essentially write only the control bits. Since we split the 64 bit write into two 32 bit writes, there is a possibility of xHC command ring stopped before the upper dword (all zeros) is written. If that happens, xHC updates the upper dword of its internal command ring pointer with all zeros. Next time, when the command ring is restarted, we see xHC memory access failures. Fix this issue by only writing to the lower dword of CRCR where all control bits are located.
In the Linux kernel, the following vulnerability has been resolved: scsi: qla2xxx: Reserve extra IRQ vectors Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs. That breaks vector allocation assumptions in qla83xx_iospace_config(), qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions computes maximum number of qpairs as: ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default response queue) - 1 (ATIO, in dual or pure target mode) max_qpairs is set to zero in case of two CPUs and initiator mode. The number is then used to allocate ha->queue_pair_map inside qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is left NULL but the driver thinks there are queue pairs available. qla2xxx_queuecommand() tries to find a qpair in the map and crashes: if (ha->mqenable) { uint32_t tag; uint16_t hwq; struct qla_qpair *qpair = NULL; tag = blk_mq_unique_tag(cmd->request); hwq = blk_mq_unique_tag_to_hwq(tag); qpair = ha->queue_pair_map[hwq]; # <- HERE if (qpair) return qla2xxx_mqueuecommand(host, cmd, qpair); } BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G W 5.10.0-rc1+ #25 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc] RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx] Call Trace: scsi_queue_rq+0x58c/0xa60 blk_mq_dispatch_rq_list+0x2b7/0x6f0 ? __sbitmap_get_word+0x2a/0x80 __blk_mq_sched_dispatch_requests+0xb8/0x170 blk_mq_sched_dispatch_requests+0x2b/0x50 __blk_mq_run_hw_queue+0x49/0xb0 __blk_mq_delay_run_hw_queue+0xfb/0x150 blk_mq_sched_insert_request+0xbe/0x110 blk_execute_rq+0x45/0x70 __scsi_execute+0x10e/0x250 scsi_probe_and_add_lun+0x228/0xda0 __scsi_scan_target+0xf4/0x620 ? __pm_runtime_resume+0x4f/0x70 scsi_scan_target+0x100/0x110 fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc] process_one_work+0x1ea/0x3b0 worker_thread+0x28/0x3b0 ? process_one_work+0x3b0/0x3b0 kthread+0x112/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 The driver should allocate enough vectors to provide every CPU it's own HW queue and still handle reserved (MB, RSP, ATIO) interrupts. The change fixes the crash on dual core VM and prevents unbalanced QP allocation where nr_hw_queues is two less than the number of CPUs.
A flaw was found in the IPv6 module of the Linux kernel. The arg.result was not used consistently in fib6_rule_lookup, sometimes holding rt6_info and other times fib6_info. This was not accounted for in other parts of the code where rt6_info was expected unconditionally, potentially leading to a kernel panic in fib6_rule_suppress.
In the Linux kernel, the following vulnerability has been resolved: libbpf: Fix memory leak in strset Free struct strset itself, not just its internal parts.
In the Linux kernel, the following vulnerability has been resolved: ext4: always panic when errors=panic is specified Before commit 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()"), the following series of commands would trigger a panic: 1. mount /dev/sda -o ro,errors=panic test 2. mount /dev/sda -o remount,abort test After commit 014c9caa29d3, remounting a file system using the test mount option "abort" will no longer trigger a panic. This commit will restore the behaviour immediately before commit 014c9caa29d3. (However, note that the Linux kernel's behavior has not been consistent; some previous kernel versions, including 5.4 and 4.19 similarly did not panic after using the mount option "abort".) This also makes a change to long-standing behaviour; namely, the following series commands will now cause a panic, when previously it did not: 1. mount /dev/sda -o ro,errors=panic test 2. echo test > /sys/fs/ext4/sda/trigger_fs_error However, this makes ext4's behaviour much more consistent, so this is a good thing.
In the Linux kernel, the following vulnerability has been resolved: net: fujitsu: fix potential null-ptr-deref In fmvj18x_get_hwinfo(), if ioremap fails there will be NULL pointer deref. To fix this, check the return value of ioremap and return -1 to the caller in case of failure.
In the Linux kernel, the following vulnerability has been resolved: scsi: ufs: core: Improve SCSI abort handling The following has been observed on a test setup: WARNING: CPU: 4 PID: 250 at drivers/scsi/ufs/ufshcd.c:2737 ufshcd_queuecommand+0x468/0x65c Call trace: ufshcd_queuecommand+0x468/0x65c scsi_send_eh_cmnd+0x224/0x6a0 scsi_eh_test_devices+0x248/0x418 scsi_eh_ready_devs+0xc34/0xe58 scsi_error_handler+0x204/0x80c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 That warning is triggered by the following statement: WARN_ON(lrbp->cmd); Fix this warning by clearing lrbp->cmd from the abort handler.
In the Linux kernel, the following vulnerability has been resolved: ubifs: Fix to add refcount once page is set private MM defined the rule [1] very clearly that once page was set with PG_private flag, we should increment the refcount in that page, also main flows like pageout(), migrate_page() will assume there is one additional page reference count if page_has_private() returns true. Otherwise, we may get a BUG in page migration: page:0000000080d05b9d refcount:-1 mapcount:0 mapping:000000005f4d82a8 index:0xe2 pfn:0x14c12 aops:ubifs_file_address_operations [ubifs] ino:8f1 dentry name:"f30e" flags: 0x1fffff80002405(locked|uptodate|owner_priv_1|private|node=0| zone=1|lastcpupid=0x1fffff) page dumped because: VM_BUG_ON_PAGE(page_count(page) != 0) ------------[ cut here ]------------ kernel BUG at include/linux/page_ref.h:184! invalid opcode: 0000 [#1] SMP CPU: 3 PID: 38 Comm: kcompactd0 Not tainted 5.15.0-rc5 RIP: 0010:migrate_page_move_mapping+0xac3/0xe70 Call Trace: ubifs_migrate_page+0x22/0xc0 [ubifs] move_to_new_page+0xb4/0x600 migrate_pages+0x1523/0x1cc0 compact_zone+0x8c5/0x14b0 kcompactd+0x2bc/0x560 kthread+0x18c/0x1e0 ret_from_fork+0x1f/0x30 Before the time, we should make clean a concept, what does refcount means in page gotten from grab_cache_page_write_begin(). There are 2 situations: Situation 1: refcount is 3, page is created by __page_cache_alloc. TYPE_A - the write process is using this page TYPE_B - page is assigned to one certain mapping by calling __add_to_page_cache_locked() TYPE_C - page is added into pagevec list corresponding current cpu by calling lru_cache_add() Situation 2: refcount is 2, page is gotten from the mapping's tree TYPE_B - page has been assigned to one certain mapping TYPE_A - the write process is using this page (by calling page_cache_get_speculative()) Filesystem releases one refcount by calling put_page() in xxx_write_end(), the released refcount corresponds to TYPE_A (write task is using it). If there are any processes using a page, page migration process will skip the page by judging whether expected_page_refs() equals to page refcount. The BUG is caused by following process: PA(cpu 0) kcompactd(cpu 1) compact_zone ubifs_write_begin page_a = grab_cache_page_write_begin add_to_page_cache_lru lru_cache_add pagevec_add // put page into cpu 0's pagevec (refcnf = 3, for page creation process) ubifs_write_end SetPagePrivate(page_a) // doesn't increase page count ! unlock_page(page_a) put_page(page_a) // refcnt = 2 [...] PB(cpu 0) filemap_read filemap_get_pages add_to_page_cache_lru lru_cache_add __pagevec_lru_add // traverse all pages in cpu 0's pagevec __pagevec_lru_add_fn SetPageLRU(page_a) isolate_migratepages isolate_migratepages_block get_page_unless_zero(page_a) // refcnt = 3 list_add(page_a, from_list) migrate_pages(from_list) __unmap_and_move move_to_new_page ubifs_migrate_page(page_a) migrate_page_move_mapping expected_page_refs get 3 (migration[1] + mapping[1] + private[1]) release_pages put_page_testzero(page_a) // refcnt = 3 page_ref_freeze // refcnt = 0 page_ref_dec_and_test(0 - 1 = -1) page_ref_unfreeze VM_BUG_ON_PAGE(-1 != 0, page) UBIFS doesn't increase the page refcount after setting private flag, which leads to page migration task believes the page is not used by any other processes, so the page is migrated. This causes concurrent accessing on page refcount between put_page() called by other process(eg. read process calls lru_cache_add) and page_ref_unfreeze() called by mi ---truncated---
In the Linux kernel, the following vulnerability has been resolved: s390/zcrypt: fix zcard and zqueue hot-unplug memleak Tests with kvm and a kmemdebug kernel showed, that on hot unplug the zcard and zqueue structs for the unplugged card or queue are not properly freed because of a mismatch with get/put for the embedded kref counter. This fix now adjusts the handling of the kref counters. With init the kref counter starts with 1. This initial value needs to drop to zero with the unregister of the card or queue to trigger the release and free the object.
In the Linux kernel, the following vulnerability has been resolved: net: stmmac: Disable Tx queues when reconfiguring the interface The Tx queues were not disabled in situations where the driver needed to stop the interface to apply a new configuration. This could result in a kernel panic when doing any of the 3 following actions: * reconfiguring the number of queues (ethtool -L) * reconfiguring the size of the ring buffers (ethtool -G) * installing/removing an XDP program (ip l set dev ethX xdp) Prevent the panic by making sure netif_tx_disable is called when stopping an interface. Without this patch, the following kernel panic can be observed when doing any of the actions above: Unable to handle kernel paging request at virtual address ffff80001238d040 [....] Call trace: dwmac4_set_addr+0x8/0x10 dev_hard_start_xmit+0xe4/0x1ac sch_direct_xmit+0xe8/0x39c __dev_queue_xmit+0x3ec/0xaf0 dev_queue_xmit+0x14/0x20 [...] [ end trace 0000000000000002 ]---
A use after free flaw was found in hfsplus_put_super in fs/hfsplus/super.c in the Linux Kernel. This flaw could allow a local user to cause a denial of service problem.
In the Linux kernel, the following vulnerability has been resolved: usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL There exists a possible scenario in which dwc3_gadget_init() can fail: during during host -> peripheral mode switch in dwc3_set_mode(), and a pending gadget driver fails to bind. Then, if the DRD undergoes another mode switch from peripheral->host the resulting dwc3_gadget_exit() will attempt to reference an invalid and dangling dwc->gadget pointer as well as call dma_free_coherent() on unmapped DMA pointers. The exact scenario can be reproduced as follows: - Start DWC3 in peripheral mode - Configure ConfigFS gadget with FunctionFS instance (or use g_ffs) - Run FunctionFS userspace application (open EPs, write descriptors, etc) - Bind gadget driver to DWC3's UDC - Switch DWC3 to host mode => dwc3_gadget_exit() is called. usb_del_gadget() will put the ConfigFS driver instance on the gadget_driver_pending_list - Stop FunctionFS application (closes the ep files) - Switch DWC3 to peripheral mode => dwc3_gadget_init() fails as usb_add_gadget() calls check_pending_gadget_drivers() and attempts to rebind the UDC to the ConfigFS gadget but fails with -19 (-ENODEV) because the FFS instance is not in FFS_ACTIVE state (userspace has not re-opened and written the descriptors yet, i.e. desc_ready!=0). - Switch DWC3 back to host mode => dwc3_gadget_exit() is called again, but this time dwc->gadget is invalid. Although it can be argued that userspace should take responsibility for ensuring that the FunctionFS application be ready prior to allowing the composite driver bind to the UDC, failure to do so should not result in a panic from the kernel driver. Fix this by setting dwc->gadget to NULL in the failure path of dwc3_gadget_init() and add a check to dwc3_gadget_exit() to bail out unless the gadget pointer is valid.
In the Linux kernel, the following vulnerability has been resolved: io_uring: fix shared sqpoll cancellation hangs [ 736.982891] INFO: task iou-sqp-4294:4295 blocked for more than 122 seconds. [ 736.982897] Call Trace: [ 736.982901] schedule+0x68/0xe0 [ 736.982903] io_uring_cancel_sqpoll+0xdb/0x110 [ 736.982908] io_sqpoll_cancel_cb+0x24/0x30 [ 736.982911] io_run_task_work_head+0x28/0x50 [ 736.982913] io_sq_thread+0x4e3/0x720 We call io_uring_cancel_sqpoll() one by one for each ctx either in sq_thread() itself or via task works, and it's intended to cancel all requests of a specified context. However the function uses per-task counters to track the number of inflight requests, so it counts more requests than available via currect io_uring ctx and goes to sleep for them to appear (e.g. from IRQ), that will never happen. Cancel a bit more than before, i.e. all ctxs that share sqpoll and continue to use shared counters. Don't forget that we should not remove ctx from the list before running that task_work sqpoll-cancel, otherwise the function wouldn't be able to find the context and will hang.
In the Linux kernel, the following vulnerability has been resolved: net: hns3: do not allow call hns3_nic_net_open repeatedly hns3_nic_net_open() is not allowed to called repeatly, but there is no checking for this. When doing device reset and setup tc concurrently, there is a small oppotunity to call hns3_nic_net_open repeatedly, and cause kernel bug by calling napi_enable twice. The calltrace information is like below: [ 3078.222780] ------------[ cut here ]------------ [ 3078.230255] kernel BUG at net/core/dev.c:6991! [ 3078.236224] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 3078.243431] Modules linked in: hns3 hclgevf hclge hnae3 vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [ 3078.258880] CPU: 0 PID: 295 Comm: kworker/u8:5 Tainted: G O 5.14.0-rc4+ #1 [ 3078.269102] Hardware name: , BIOS KpxxxFPGA 1P B600 V181 08/12/2021 [ 3078.276801] Workqueue: hclge hclge_service_task [hclge] [ 3078.288774] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 3078.296168] pc : napi_enable+0x80/0x84 tc qdisc sho[w 3d0e7v8 .e3t0h218 79] lr : hns3_nic_net_open+0x138/0x510 [hns3] [ 3078.314771] sp : ffff8000108abb20 [ 3078.319099] x29: ffff8000108abb20 x28: 0000000000000000 x27: ffff0820a8490300 [ 3078.329121] x26: 0000000000000001 x25: ffff08209cfc6200 x24: 0000000000000000 [ 3078.339044] x23: ffff0820a8490300 x22: ffff08209cd76000 x21: ffff0820abfe3880 [ 3078.349018] x20: 0000000000000000 x19: ffff08209cd76900 x18: 0000000000000000 [ 3078.358620] x17: 0000000000000000 x16: ffffc816e1727a50 x15: 0000ffff8f4ff930 [ 3078.368895] x14: 0000000000000000 x13: 0000000000000000 x12: 0000259e9dbeb6b4 [ 3078.377987] x11: 0096a8f7e764eb40 x10: 634615ad28d3eab5 x9 : ffffc816ad8885b8 [ 3078.387091] x8 : ffff08209cfc6fb8 x7 : ffff0820ac0da058 x6 : ffff0820a8490344 [ 3078.396356] x5 : 0000000000000140 x4 : 0000000000000003 x3 : ffff08209cd76938 [ 3078.405365] x2 : 0000000000000000 x1 : 0000000000000010 x0 : ffff0820abfe38a0 [ 3078.414657] Call trace: [ 3078.418517] napi_enable+0x80/0x84 [ 3078.424626] hns3_reset_notify_up_enet+0x78/0xd0 [hns3] [ 3078.433469] hns3_reset_notify+0x64/0x80 [hns3] [ 3078.441430] hclge_notify_client+0x68/0xb0 [hclge] [ 3078.450511] hclge_reset_rebuild+0x524/0x884 [hclge] [ 3078.458879] hclge_reset_service_task+0x3c4/0x680 [hclge] [ 3078.467470] hclge_service_task+0xb0/0xb54 [hclge] [ 3078.475675] process_one_work+0x1dc/0x48c [ 3078.481888] worker_thread+0x15c/0x464 [ 3078.487104] kthread+0x160/0x170 [ 3078.492479] ret_from_fork+0x10/0x18 [ 3078.498785] Code: c8027c81 35ffffa2 d50323bf d65f03c0 (d4210000) [ 3078.506889] ---[ end trace 8ebe0340a1b0fb44 ]--- Once hns3_nic_net_open() is excute success, the flag HNS3_NIC_STATE_DOWN will be cleared. So add checking for this flag, directly return when HNS3_NIC_STATE_DOWN is no set.
In the Linux kernel, the following vulnerability has been resolved: x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range() If track_pfn_copy() fails, we already added the dst VMA to the maple tree. As fork() fails, we'll cleanup the maple tree, and stumble over the dst VMA for which we neither performed any reservation nor copied any page tables. Consequently untrack_pfn() will see VM_PAT and try obtaining the PAT information from the page table -- which fails because the page table was not copied. The easiest fix would be to simply clear the VM_PAT flag of the dst VMA if track_pfn_copy() fails. However, the whole thing is about "simply" clearing the VM_PAT flag is shaky as well: if we passed track_pfn_copy() and performed a reservation, but copying the page tables fails, we'll simply clear the VM_PAT flag, not properly undoing the reservation ... which is also wrong. So let's fix it properly: set the VM_PAT flag only if the reservation succeeded (leaving it clear initially), and undo the reservation if anything goes wrong while copying the page tables: clearing the VM_PAT flag after undoing the reservation. Note that any copied page table entries will get zapped when the VMA will get removed later, after copy_page_range() succeeded; as VM_PAT is not set then, we won't try cleaning VM_PAT up once more and untrack_pfn() will be happy. Note that leaving these page tables in place without a reservation is not a problem, as we are aborting fork(); this process will never run. A reproducer can trigger this usually at the first try: https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/pat_fork.c WARNING: CPU: 26 PID: 11650 at arch/x86/mm/pat/memtype.c:983 get_pat_info+0xf6/0x110 Modules linked in: ... CPU: 26 UID: 0 PID: 11650 Comm: repro3 Not tainted 6.12.0-rc5+ #92 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:get_pat_info+0xf6/0x110 ... Call Trace: <TASK> ... untrack_pfn+0x52/0x110 unmap_single_vma+0xa6/0xe0 unmap_vmas+0x105/0x1f0 exit_mmap+0xf6/0x460 __mmput+0x4b/0x120 copy_process+0x1bf6/0x2aa0 kernel_clone+0xab/0x440 __do_sys_clone+0x66/0x90 do_syscall_64+0x95/0x180 Likely this case was missed in: d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed") ... and instead of undoing the reservation we simply cleared the VM_PAT flag. Keep the documentation of these functions in include/linux/pgtable.h, one place is more than sufficient -- we should clean that up for the other functions like track_pfn_remap/untrack_pfn separately.
In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: CT, Fix multiple allocations and memleak of mod acts CT clear action offload adds additional mod hdr actions to the flow's original mod actions in order to clear the registers which hold ct_state. When such flow also includes encap action, a neigh update event can cause the driver to unoffload the flow and then reoffload it. Each time this happens, the ct clear handling adds that same set of mod hdr actions to reset ct_state until the max of mod hdr actions is reached. Also the driver never releases the allocated mod hdr actions and causing a memleak. Fix above two issues by moving CT clear mod acts allocation into the parsing actions phase and only use it when offloading the rule. The release of mod acts will be done in the normal flow_put(). backtrace: [<000000007316e2f3>] krealloc+0x83/0xd0 [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core] [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core] [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core] [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core] [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core] [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core] [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core] [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core] [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core]
In the Linux kernel, the following vulnerability has been resolved: RDMA: Verify port when creating flow rule Validate port value provided by the user and with that remove no longer needed validation by the driver. The missing check in the mlx5_ib driver could cause to the below oops. Call trace: _create_flow_rule+0x2d4/0xf28 [mlx5_ib] mlx5_ib_create_flow+0x2d0/0x5b0 [mlx5_ib] ib_uverbs_ex_create_flow+0x4cc/0x624 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xd4/0x150 [ib_uverbs] ib_uverbs_cmd_verbs.isra.7+0xb28/0xc50 [ib_uverbs] ib_uverbs_ioctl+0x158/0x1d0 [ib_uverbs] do_vfs_ioctl+0xd0/0xaf0 ksys_ioctl+0x84/0xb4 __arm64_sys_ioctl+0x28/0xc4 el0_svc_common.constprop.3+0xa4/0x254 el0_svc_handler+0x84/0xa0 el0_svc+0x10/0x26c Code: b9401260 f9615681 51000400 8b001c20 (f9403c1a)
In the Linux kernel, the following vulnerability has been resolved: scsi: pm80xx: Fix memory leak during rmmod Driver failed to release all memory allocated. This would lead to memory leak during driver removal. Properly free memory when the module is removed.
In the Linux kernel, the following vulnerability has been resolved: bnxt_en: Fix RX consumer index logic in the error path. In bnxt_rx_pkt(), the RX buffers are expected to complete in order. If the RX consumer index indicates an out of order buffer completion, it means we are hitting a hardware bug and the driver will abort all remaining RX packets and reset the RX ring. The RX consumer index that we pass to bnxt_discard_rx() is not correct. We should be passing the current index (tmp_raw_cons) instead of the old index (raw_cons). This bug can cause us to be at the wrong index when trying to abort the next RX packet. It can crash like this: #0 [ffff9bbcdf5c39a8] machine_kexec at ffffffff9b05e007 #1 [ffff9bbcdf5c3a00] __crash_kexec at ffffffff9b111232 #2 [ffff9bbcdf5c3ad0] panic at ffffffff9b07d61e #3 [ffff9bbcdf5c3b50] oops_end at ffffffff9b030978 #4 [ffff9bbcdf5c3b78] no_context at ffffffff9b06aaf0 #5 [ffff9bbcdf5c3bd8] __bad_area_nosemaphore at ffffffff9b06ae2e #6 [ffff9bbcdf5c3c28] bad_area_nosemaphore at ffffffff9b06af24 #7 [ffff9bbcdf5c3c38] __do_page_fault at ffffffff9b06b67e #8 [ffff9bbcdf5c3cb0] do_page_fault at ffffffff9b06bb12 #9 [ffff9bbcdf5c3ce0] page_fault at ffffffff9bc015c5 [exception RIP: bnxt_rx_pkt+237] RIP: ffffffffc0259cdd RSP: ffff9bbcdf5c3d98 RFLAGS: 00010213 RAX: 000000005dd8097f RBX: ffff9ba4cb11b7e0 RCX: ffffa923cf6e9000 RDX: 0000000000000fff RSI: 0000000000000627 RDI: 0000000000001000 RBP: ffff9bbcdf5c3e60 R8: 0000000000420003 R9: 000000000000020d R10: ffffa923cf6ec138 R11: ffff9bbcdf5c3e83 R12: ffff9ba4d6f928c0 R13: ffff9ba4cac28080 R14: ffff9ba4cb11b7f0 R15: ffff9ba4d5a30000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
In the Linux kernel, the following vulnerability has been resolved: netrom: Decrease sock refcount when sock timers expire Commit 63346650c1a9 ("netrom: switch to sock timer API") switched to use sock timer API. It replaces mod_timer() by sk_reset_timer(), and del_timer() by sk_stop_timer(). Function sk_reset_timer() will increase the refcount of sock if it is called on an inactive timer, hence, in case the timer expires, we need to decrease the refcount ourselves in the handler, otherwise, the sock refcount will be unbalanced and the sock will never be freed.
In the Linux kernel, the following vulnerability has been resolved: powerpc/64s: Fix unrecoverable MCE calling async handler from NMI The machine check handler is not considered NMI on 64s. The early handler is the true NMI handler, and then it schedules the machine_check_exception handler to run when interrupts are enabled. This works fine except the case of an unrecoverable MCE, where the true NMI is taken when MSR[RI] is clear, it can not recover, so it calls machine_check_exception directly so something might be done about it. Calling an async handler from NMI context can result in irq state and other things getting corrupted. This can also trigger the BUG at arch/powerpc/include/asm/interrupt.h:168 BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE)); Fix this by making an _async version of the handler which is called in the normal case, and a NMI version that is called for unrecoverable interrupts.
In the Linux kernel, the following vulnerability has been resolved: HID: magicmouse: fix NULL-deref on disconnect Commit 9d7b18668956 ("HID: magicmouse: add support for Apple Magic Trackpad 2") added a sanity check for an Apple trackpad but returned success instead of -ENODEV when the check failed. This means that the remove callback will dereference the never-initialised driver data pointer when the driver is later unbound (e.g. on USB disconnect).
In the Linux kernel, the following vulnerability has been resolved: tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized This commit fixes a bug (found by syzkaller) that could cause spurious double-initializations for congestion control modules, which could cause memory leaks or other problems for congestion control modules (like CDG) that allocate memory in their init functions. The buggy scenario constructed by syzkaller was something like: (1) create a TCP socket (2) initiate a TFO connect via sendto() (3) while socket is in TCP_SYN_SENT, call setsockopt(TCP_CONGESTION), which calls: tcp_set_congestion_control() -> tcp_reinit_congestion_control() -> tcp_init_congestion_control() (4) receive ACK, connection is established, call tcp_init_transfer(), set icsk_ca_initialized=0 (without first calling cc->release()), call tcp_init_congestion_control() again. Note that in this sequence tcp_init_congestion_control() is called twice without a cc->release() call in between. Thus, for CC modules that allocate memory in their init() function, e.g, CDG, a memory leak may occur. The syzkaller tool managed to find a reproducer that triggered such a leak in CDG. The bug was introduced when that commit 8919a9b31eb4 ("tcp: Only init congestion control if not initialized already") introduced icsk_ca_initialized and set icsk_ca_initialized to 0 in tcp_init_transfer(), missing the possibility for a sequence like the one above, where a process could call setsockopt(TCP_CONGESTION) in state TCP_SYN_SENT (i.e. after the connect() or TFO open sendmsg()), which would call tcp_init_congestion_control(). It did not intend to reset any initialization that the user had already explicitly made; it just missed the possibility of that particular sequence (which syzkaller managed to find).
In the Linux kernel, the following vulnerability has been resolved: SUNRPC: Fix null pointer dereference in svc_rqst_free() When alloc_pages_node() returns null in svc_rqst_alloc(), the null rq_scratch_page pointer will be dereferenced when calling put_page() in svc_rqst_free(). Fix it by adding a null check. Addresses-Coverity: ("Dereference after null check")
In the Linux kernel, the following vulnerability has been resolved: bcache: avoid oversized read request in cache missing code path In the cache missing code path of cached device, if a proper location from the internal B+ tree is matched for a cache miss range, function cached_dev_cache_miss() will be called in cache_lookup_fn() in the following code block, [code block 1] 526 unsigned int sectors = KEY_INODE(k) == s->iop.inode 527 ? min_t(uint64_t, INT_MAX, 528 KEY_START(k) - bio->bi_iter.bi_sector) 529 : INT_MAX; 530 int ret = s->d->cache_miss(b, s, bio, sectors); Here s->d->cache_miss() is the call backfunction pointer initialized as cached_dev_cache_miss(), the last parameter 'sectors' is an important hint to calculate the size of read request to backing device of the missing cache data. Current calculation in above code block may generate oversized value of 'sectors', which consequently may trigger 2 different potential kernel panics by BUG() or BUG_ON() as listed below, 1) BUG_ON() inside bch_btree_insert_key(), [code block 2] 886 BUG_ON(b->ops->is_extents && !KEY_SIZE(k)); 2) BUG() inside biovec_slab(), [code block 3] 51 default: 52 BUG(); 53 return NULL; All the above panics are original from cached_dev_cache_miss() by the oversized parameter 'sectors'. Inside cached_dev_cache_miss(), parameter 'sectors' is used to calculate the size of data read from backing device for the cache missing. This size is stored in s->insert_bio_sectors by the following lines of code, [code block 4] 909 s->insert_bio_sectors = min(sectors, bio_sectors(bio) + reada); Then the actual key inserting to the internal B+ tree is generated and stored in s->iop.replace_key by the following lines of code, [code block 5] 911 s->iop.replace_key = KEY(s->iop.inode, 912 bio->bi_iter.bi_sector + s->insert_bio_sectors, 913 s->insert_bio_sectors); The oversized parameter 'sectors' may trigger panic 1) by BUG_ON() from the above code block. And the bio sending to backing device for the missing data is allocated with hint from s->insert_bio_sectors by the following lines of code, [code block 6] 926 cache_bio = bio_alloc_bioset(GFP_NOWAIT, 927 DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS), 928 &dc->disk.bio_split); The oversized parameter 'sectors' may trigger panic 2) by BUG() from the agove code block. Now let me explain how the panics happen with the oversized 'sectors'. In code block 5, replace_key is generated by macro KEY(). From the definition of macro KEY(), [code block 7] 71 #define KEY(inode, offset, size) \ 72 ((struct bkey) { \ 73 .high = (1ULL << 63) | ((__u64) (size) << 20) | (inode), \ 74 .low = (offset) \ 75 }) Here 'size' is 16bits width embedded in 64bits member 'high' of struct bkey. But in code block 1, if "KEY_START(k) - bio->bi_iter.bi_sector" is very probably to be larger than (1<<16) - 1, which makes the bkey size calculation in code block 5 is overflowed. In one bug report the value of parameter 'sectors' is 131072 (= 1 << 17), the overflowed 'sectors' results the overflowed s->insert_bio_sectors in code block 4, then makes size field of s->iop.replace_key to be 0 in code block 5. Then the 0- sized s->iop.replace_key is inserted into the internal B+ tree as cache missing check key (a special key to detect and avoid a racing between normal write request and cache missing read request) as, [code block 8] 915 ret = bch_btree_insert_check_key(b, &s->op, &s->iop.replace_key); Then the 0-sized s->iop.replace_key as 3rd parameter triggers the bkey size check BUG_ON() in code block 2, and causes the kernel panic 1). Another ke ---truncated---
In the Linux kernel, the following vulnerability has been resolved: usb: chipidea: ci_hdrc_imx: Also search for 'phys' phandle When passing 'phys' in the devicetree to describe the USB PHY phandle (which is the recommended way according to Documentation/devicetree/bindings/usb/ci-hdrc-usb2.txt) the following NULL pointer dereference is observed on i.MX7 and i.MX8MM: [ 1.489344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098 [ 1.498170] Mem abort info: [ 1.500966] ESR = 0x96000044 [ 1.504030] EC = 0x25: DABT (current EL), IL = 32 bits [ 1.509356] SET = 0, FnV = 0 [ 1.512416] EA = 0, S1PTW = 0 [ 1.515569] FSC = 0x04: level 0 translation fault [ 1.520458] Data abort info: [ 1.523349] ISV = 0, ISS = 0x00000044 [ 1.527196] CM = 0, WnR = 1 [ 1.530176] [0000000000000098] user address but active_mm is swapper [ 1.536544] Internal error: Oops: 96000044 [#1] PREEMPT SMP [ 1.542125] Modules linked in: [ 1.545190] CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.14.0-dirty #3 [ 1.551901] Hardware name: Kontron i.MX8MM N801X S (DT) [ 1.557133] Workqueue: events_unbound deferred_probe_work_func [ 1.562984] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--) [ 1.568998] pc : imx7d_charger_detection+0x3f0/0x510 [ 1.573973] lr : imx7d_charger_detection+0x22c/0x510 This happens because the charger functions check for the phy presence inside the imx_usbmisc_data structure (data->usb_phy), but the chipidea core populates the usb_phy passed via 'phys' inside 'struct ci_hdrc' (ci->usb_phy) instead. This causes the NULL pointer dereference inside imx7d_charger_detection(). Fix it by also searching for 'phys' in case 'fsl,usbphy' is not found. Tested on a imx7s-warp board.
In the Linux kernel, the following vulnerability has been resolved: usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler Patch fixes the following critical issue caused by deadlock which has been detected during testing NCM class: smp: csd: Detected non-responsive CSD lock (#1) on CPU#0 smp: csd: CSD lock (#1) unresponsive. .... RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0 RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002 RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658 RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658 R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0 Call Trace: <IRQ> do_raw_spin_lock+0xc0/0xd0 _raw_spin_lock_irqsave+0x95/0xa0 cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci] usb_ep_queue+0x35/0x110 eth_start_xmit+0x220/0x3d0 [u_ether] ncm_tx_timeout+0x34/0x40 [usb_f_ncm] ? ncm_free_inst+0x50/0x50 [usb_f_ncm] __hrtimer_run_queues+0xac/0x440 hrtimer_run_softirq+0x8c/0xb0 __do_softirq+0xcf/0x428 asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x61/0x70 irq_exit_rcu+0xc1/0xd0 sysvec_apic_timer_interrupt+0x52/0xb0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0010:do_raw_spin_trylock+0x18/0x40 RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658 RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658 R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000 _raw_spin_lock+0x3d/0x70 ? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci] cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci] ? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci] ? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci] ? irq_thread+0xa0/0x1c0 irq_thread_fn+0x28/0x60 irq_thread+0x105/0x1c0 ? __kthread_parkme+0x42/0x90 ? irq_forced_thread_fn+0x90/0x90 ? wake_threads_waitq+0x30/0x30 ? irq_thread_check_affinity+0xe0/0xe0 kthread+0x12a/0x160 ? kthread_park+0x90/0x90 ret_from_fork+0x22/0x30 The root cause of issue is spin_lock/spin_unlock instruction instead spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler function.