The inode double locking code in fs/ocfs2/file.c in the Linux kernel 2.6.30 before 2.6.30-rc3, 2.6.27 before 2.6.27.24, 2.6.29 before 2.6.29.4, and possibly other versions down to 2.6.19 allows local users to cause a denial of service (prevention of file creation and removal) via a series of splice system calls that trigger a deadlock between the generic_file_splice_write, splice_from_pipe, and ocfs2_file_splice_write functions.
In the Linux kernel through 5.4.6, there is a NULL pointer dereference in drivers/scsi/libsas/sas_discover.c because of mishandling of port disconnection during discovery, related to a PHY down race condition, aka CID-f70267f379b5.
In the Linux kernel, the following vulnerability has been resolved: pds_core: Prevent race issues involving the adminq There are multiple paths that can result in using the pdsc's adminq. [1] pdsc_adminq_isr and the resulting work from queue_work(), i.e. pdsc_work_thread()->pdsc_process_adminq() [2] pdsc_adminq_post() When the device goes through reset via PCIe reset and/or a fw_down/fw_up cycle due to bad PCIe state or bad device state the adminq is destroyed and recreated. A NULL pointer dereference can happen if [1] or [2] happens after the adminq is already destroyed. In order to fix this, add some further state checks and implement reference counting for adminq uses. Reference counting was used because multiple threads can attempt to access the adminq at the same time via [1] or [2]. Additionally, multiple clients (i.e. pds-vfio-pci) can be using [2] at the same time. The adminq_refcnt is initialized to 1 when the adminq has been allocated and is ready to use. Users/clients of the adminq (i.e. [1] and [2]) will increment the refcnt when they are using the adminq. When the driver goes into a fw_down cycle it will set the PDSC_S_FW_DEAD bit and then wait for the adminq_refcnt to hit 1. Setting the PDSC_S_FW_DEAD before waiting will prevent any further adminq_refcnt increments. Waiting for the adminq_refcnt to hit 1 allows for any current users of the adminq to finish before the driver frees the adminq. Once the adminq_refcnt hits 1 the driver clears the refcnt to signify that the adminq is deleted and cannot be used. On the fw_up cycle the driver will once again initialize the adminq_refcnt to 1 allowing the adminq to be used again.
An issue was discovered in the Linux kernel 4.18 through 5.6.11 when unprivileged user namespaces are allowed. A user can create their own PID namespace, and mount a FUSE filesystem. Upon interaction with this FUSE filesystem, if the userspace component is terminated via a kill of the PID namespace's pid 1, it will result in a hung task, and resources being permanently locked up until system reboot. This can result in resource exhaustion.
A memory leak in the mwifiex_pcie_alloc_cmdrsp_buf() function in drivers/net/wireless/marvell/mwifiex/pcie.c in the Linux kernel through 5.3.11 allows attackers to cause a denial of service (memory consumption) by triggering mwifiex_map_pci_memory() failures, aka CID-db8fd2cde932.
A memory leak in the cx23888_ir_probe() function in drivers/media/pci/cx23885/cx23888-ir.c in the Linux kernel through 5.3.11 allows attackers to cause a denial of service (memory consumption) by triggering kfifo_alloc() failures, aka CID-a7b2df76b42b.
A null pointer dereference flaw was found in the Linux kernel API for the cryptographic algorithm scatterwalk functionality. This issue occurs when a user constructs a malicious packet with specific socket configuration, which could allow a local user to crash the system or escalate their privileges on the system.
A memory leak in the alloc_sgtable() function in drivers/net/wireless/intel/iwlwifi/fw/dbg.c in the Linux kernel through 5.3.11 allows attackers to cause a denial of service (memory consumption) by triggering alloc_page() failures, aka CID-b4b814fec1a5.
A memory leak in the crypto_report() function in crypto/crypto_user_base.c in the Linux kernel through 5.3.11 allows attackers to cause a denial of service (memory consumption) by triggering crypto_report_alg() failures, aka CID-ffdde5932042.
Multiple memory leaks in the iwl_pcie_ctxt_info_gen3_init() function in drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c in the Linux kernel through 5.3.11 allow attackers to cause a denial of service (memory consumption) by triggering iwl_pcie_init_fw_sec() or dma_alloc_coherent() failures, aka CID-0f4f199443fa.
In the Linux kernel, the following vulnerability has been resolved: x86/sgx: Resolves SECS reclaim vs. page fault for EAUG race The SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC page for an enclave and set secs.epc_page to NULL. The SECS page is used for EAUG and ELDU in the SGX page fault handler. However, the NULL check for secs.epc_page is only done for ELDU, not EAUG before being used. Fix this by doing the same NULL check and reloading of the SECS page as needed for both EAUG and ELDU. The SECS page holds global enclave metadata. It can only be reclaimed when there are no other enclave pages remaining. At that point, virtually nothing can be done with the enclave until the SECS page is paged back in. An enclave can not run nor generate page faults without a resident SECS page. But it is still possible for a #PF for a non-SECS page to race with paging out the SECS page: when the last resident non-SECS page A triggers a #PF in a non-resident page B, and then page A and the SECS both are paged out before the #PF on B is handled. Hitting this bug requires that race triggered with a #PF for EAUG. Following is a trace when it happens. BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:sgx_encl_eaug_page+0xc7/0x210 Call Trace: ? __kmem_cache_alloc_node+0x16a/0x440 ? xa_load+0x6e/0xa0 sgx_vma_fault+0x119/0x230 __do_fault+0x36/0x140 do_fault+0x12f/0x400 __handle_mm_fault+0x728/0x1110 handle_mm_fault+0x105/0x310 do_user_addr_fault+0x1ee/0x750 ? __this_cpu_preempt_check+0x13/0x20 exc_page_fault+0x76/0x180 asm_exc_page_fault+0x27/0x30
Memory leaks in *create_resource_pool() functions under drivers/gpu/drm/amd/display/dc in the Linux kernel through 5.3.11 allow attackers to cause a denial of service (memory consumption). This affects the dce120_create_resource_pool() function in drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c, the dce110_create_resource_pool() function in drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c, the dce100_create_resource_pool() function in drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c, the dcn10_create_resource_pool() function in drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c, and the dce112_create_resource_pool() function in drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c, aka CID-104c307147ad.
In the Linux kernel, the following vulnerability has been resolved: phy: lynx-28g: serialize concurrent phy_set_mode_ext() calls to shared registers The protocol converter configuration registers PCC8, PCCC, PCCD (implemented by the driver), as well as others, control protocol converters from multiple lanes (each represented as a different struct phy). So, if there are simultaneous calls to phy_set_mode_ext() to lanes sharing the same PCC register (either for the "old" or for the "new" protocol), corruption of the values programmed to hardware is possible, because lynx_28g_rmw() has no locking. Add a spinlock in the struct lynx_28g_priv shared by all lanes, and take the global spinlock from the phy_ops :: set_mode() implementation. There are no other callers which modify PCC registers.
In the Linux kernel, the following vulnerability has been resolved: nfsd: fix handling of cached open files in nfsd4_open codepath Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") added the ability to cache an open fd over a compound. There are a couple of problems with the way this currently works: It's racy, as a newly-created nfsd_file can end up with its PENDING bit cleared while the nf is hashed, and the nf_file pointer is still zeroed out. Other tasks can find it in this state and they expect to see a valid nf_file, and can oops if nf_file is NULL. Also, there is no guarantee that we'll end up creating a new nfsd_file if one is already in the hash. If an extant entry is in the hash with a valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with the value of op_file and the old nf_file will leak. Fix both issues by making a new nfsd_file_acquirei_opened variant that takes an optional file pointer. If one is present when this is called, we'll take a new reference to it instead of trying to open the file. If the nfsd_file already has a valid nf_file, we'll just ignore the optional file and pass the nfsd_file back as-is. Also rework the tracepoints a bit to allow for an "opened" variant and don't try to avoid counting acquisitions in the case where we already have a cached open file.
In the Linux kernel, the following vulnerability has been resolved: HID: logitech-hidpp: Fix kernel crash on receiver USB disconnect hidpp_connect_event() has *four* time-of-check vs time-of-use (TOCTOU) races when it races with itself. hidpp_connect_event() primarily runs from a workqueue but it also runs on probe() and if a "device-connected" packet is received by the hw when the thread running hidpp_connect_event() from probe() is waiting on the hw, then a second thread running hidpp_connect_event() will be started from the workqueue. This opens the following races (note the below code is simplified): 1. Retrieving + printing the protocol (harmless race): if (!hidpp->protocol_major) { hidpp_root_get_protocol_version() hidpp->protocol_major = response.rap.params[0]; } We can actually see this race hit in the dmesg in the abrt output attached to rhbz#2227968: [ 3064.624215] logitech-hidpp-device 0003:046D:4071.0049: HID++ 4.5 device connected. [ 3064.658184] logitech-hidpp-device 0003:046D:4071.0049: HID++ 4.5 device connected. Testing with extra logging added has shown that after this the 2 threads take turn grabbing the hw access mutex (send_mutex) so they ping-pong through all the other TOCTOU cases managing to hit all of them: 2. Updating the name to the HIDPP name (harmless race): if (hidpp->name == hdev->name) { ... hidpp->name = new_name; } 3. Initializing the power_supply class for the battery (problematic!): hidpp_initialize_battery() { if (hidpp->battery.ps) return 0; probe_battery(); /* Blocks, threads take turns executing this */ hidpp->battery.desc.properties = devm_kmemdup(dev, hidpp_battery_props, cnt, GFP_KERNEL); hidpp->battery.ps = devm_power_supply_register(&hidpp->hid_dev->dev, &hidpp->battery.desc, cfg); } 4. Creating delayed input_device (potentially problematic): if (hidpp->delayed_input) return; hidpp->delayed_input = hidpp_allocate_input(hdev); The really big problem here is 3. Hitting the race leads to the following sequence: hidpp->battery.desc.properties = devm_kmemdup(dev, hidpp_battery_props, cnt, GFP_KERNEL); hidpp->battery.ps = devm_power_supply_register(&hidpp->hid_dev->dev, &hidpp->battery.desc, cfg); ... hidpp->battery.desc.properties = devm_kmemdup(dev, hidpp_battery_props, cnt, GFP_KERNEL); hidpp->battery.ps = devm_power_supply_register(&hidpp->hid_dev->dev, &hidpp->battery.desc, cfg); So now we have registered 2 power supplies for the same battery, which looks a bit weird from userspace's pov but this is not even the really big problem. Notice how: 1. This is all devm-maganaged 2. The hidpp->battery.desc struct is shared between the 2 power supplies 3. hidpp->battery.desc.properties points to the result from the second devm_kmemdup() This causes a use after free scenario on USB disconnect of the receiver: 1. The last registered power supply class device gets unregistered 2. The memory from the last devm_kmemdup() call gets freed, hidpp->battery.desc.properties now points to freed memory 3. The first registered power supply class device gets unregistered, this involves sending a remove uevent to userspace which invokes power_supply_uevent() to fill the uevent data 4. power_supply_uevent() uses hidpp->battery.desc.properties which now points to freed memory leading to backtraces like this one: Sep 22 20:01:35 eric kernel: BUG: unable to handle page fault for address: ffffb2140e017f08 ... Sep 22 20:01:35 eric kernel: Workqueue: usb_hub_wq hub_event Sep 22 20:01:35 eric kernel: RIP: 0010:power_supply_uevent+0xee/0x1d0 ... Sep 22 20:01:35 eric kernel: ? asm_exc_page_fault+0x26/0x30 Sep 22 20:01:35 eric kernel: ? power_supply_uevent+0xee/0x1d0 Sep 22 20:01:35 eric kernel: ? power_supply_uevent+0x10d/0x1d0 Sep 22 20:01:35 eric kernel: dev_uevent+0x10f/0x2d0 Sep 22 20:01:35 eric kernel: kobject_uevent_env+0x291/0x680 Sep 22 20:01:35 eric kernel: ---truncated---
In the Linux kernel, the following vulnerability has been resolved: xhci: Fix null pointer dereference when host dies Make sure xhci_free_dev() and xhci_kill_endpoint_urbs() do not race and cause null pointer dereference when host suddenly dies. Usb core may call xhci_free_dev() which frees the xhci->devs[slot_id] virt device at the same time that xhci_kill_endpoint_urbs() tries to loop through all the device's endpoints, checking if there are any cancelled urbs left to give back. hold the xhci spinlock while freeing the virt device
In the Linux kernel before 5.0, a memory leak exists in sit_init_net() in net/ipv6/sit.c when register_netdev() fails to register sitn->fb_tunnel_dev, which may cause denial of service, aka CID-07f12b26e21a.
drivers/gpu/drm/radeon/radeon_display.c in the Linux kernel 5.2.14 does not check the alloc_workqueue return value, leading to a NULL pointer dereference. NOTE: A third-party software maintainer states that the work queue allocation is happening during device initialization, which for a graphics card occurs during boot. It is not attacker controllable and OOM at that time is highly unlikely
An issue was discovered in the Linux kernel before 5.0.6. There is a memory leak issue when idr_alloc() fails in genl_register_family() in net/netlink/genetlink.c.
An issue was discovered in the Linux kernel before 5.0.9. There is a use-after-free in atalk_proc_exit, related to net/appletalk/atalk_proc.c, net/appletalk/ddp.c, and net/appletalk/sysctl_net_atalk.c.
drivers/net/wireless/intel/iwlwifi/pcie/trans.c in the Linux kernel 5.2.14 does not check the alloc_workqueue return value, leading to a NULL pointer dereference.
In the Linux kernel before 5.1.13, there is a memory leak in drivers/scsi/libsas/sas_expander.c when SAS expander discovery fails. This will cause a BUG and denial of service.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix race between quota rescan and disable leading to NULL pointer deref If we have one task trying to start the quota rescan worker while another one is trying to disable quotas, we can end up hitting a race that results in the quota rescan worker doing a NULL pointer dereference. The steps for this are the following: 1) Quotas are enabled; 2) Task A calls the quota rescan ioctl and enters btrfs_qgroup_rescan(). It calls qgroup_rescan_init() which returns 0 (success) and then joins a transaction and commits it; 3) Task B calls the quota disable ioctl and enters btrfs_quota_disable(). It clears the bit BTRFS_FS_QUOTA_ENABLED from fs_info->flags and calls btrfs_qgroup_wait_for_completion(), which returns immediately since the rescan worker is not yet running. Then it starts a transaction and locks fs_info->qgroup_ioctl_lock; 4) Task A queues the rescan worker, by calling btrfs_queue_work(); 5) The rescan worker starts, and calls rescan_should_stop() at the start of its while loop, which results in 0 iterations of the loop, since the flag BTRFS_FS_QUOTA_ENABLED was cleared from fs_info->flags by task B at step 3); 6) Task B sets fs_info->quota_root to NULL; 7) The rescan worker tries to start a transaction and uses fs_info->quota_root as the root argument for btrfs_start_transaction(). This results in a NULL pointer dereference down the call chain of btrfs_start_transaction(). The stack trace is something like the one reported in Link tag below: general protection fault, probably for non-canonical address 0xdffffc0000000041: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000208-0x000000000000020f] CPU: 1 PID: 34 Comm: kworker/u4:2 Not tainted 6.1.0-syzkaller-13872-gb6bb9676f216 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Workqueue: btrfs-qgroup-rescan btrfs_work_helper RIP: 0010:start_transaction+0x48/0x10f0 fs/btrfs/transaction.c:564 Code: 48 89 fb 48 (...) RSP: 0018:ffffc90000ab7ab0 EFLAGS: 00010206 RAX: 0000000000000041 RBX: 0000000000000208 RCX: ffff88801779ba80 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: dffffc0000000000 R08: 0000000000000001 R09: fffff52000156f5d R10: fffff52000156f5d R11: 1ffff92000156f5c R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2bea75b718 CR3: 000000001d0cc000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> btrfs_qgroup_rescan_worker+0x3bb/0x6a0 fs/btrfs/qgroup.c:3402 btrfs_work_helper+0x312/0x850 fs/btrfs/async-thread.c:280 process_one_work+0x877/0xdb0 kernel/workqueue.c:2289 worker_thread+0xb14/0x1330 kernel/workqueue.c:2436 kthread+0x266/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> Modules linked in: So fix this by having the rescan worker function not attempt to start a transaction if it didn't do any rescan work.
In the Linux kernel, the following vulnerability has been resolved: tracing: Have format file honor EVENT_FILE_FL_FREED When eventfs was introduced, special care had to be done to coordinate the freeing of the file meta data with the files that are exposed to user space. The file meta data would have a ref count that is set when the file is created and would be decremented and freed after the last user that opened the file closed it. When the file meta data was to be freed, it would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed, and any new references made (like new opens or reads) would fail as it is marked freed. This allowed other meta data to be freed after this flag was set (under the event_mutex). All the files that were dynamically created in the events directory had a pointer to the file meta data and would call event_release() when the last reference to the user space file was closed. This would be the time that it is safe to free the file meta data. A shortcut was made for the "format" file. It's i_private would point to the "call" entry directly and not point to the file's meta data. This is because all format files are the same for the same "call", so it was thought there was no reason to differentiate them. The other files maintain state (like the "enable", "trigger", etc). But this meant if the file were to disappear, the "format" file would be unaware of it. This caused a race that could be trigger via the user_events test (that would create dynamic events and free them), and running a loop that would read the user_events format files: In one console run: # cd tools/testing/selftests/user_events # while true; do ./ftrace_test; done And in another console run: # cd /sys/kernel/tracing/ # while true; do cat events/user_events/__test_event/format; done 2>/dev/null With KASAN memory checking, it would trigger a use-after-free bug report (which was a real bug). This was because the format file was not checking the file's meta data flag "EVENT_FILE_FL_FREED", so it would access the event that the file meta data pointed to after the event was freed. After inspection, there are other locations that were found to not check the EVENT_FILE_FL_FREED flag when accessing the trace_event_file. Add a new helper function: event_file_file() that will make sure that the event_mutex is held, and will return NULL if the trace_event_file has the EVENT_FILE_FL_FREED flag set. Have the first reference of the struct file pointer use event_file_file() and check for NULL. Later uses can still use the event_file_data() helper function if the event_mutex is still held and was not released since the event_file_file() call.
In the Linux kernel, the following vulnerability has been resolved: ice: Don't process extts if PTP is disabled The ice_ptp_extts_event() function can race with ice_ptp_release() and result in a NULL pointer dereference which leads to a kernel panic. Panic occurs because the ice_ptp_extts_event() function calls ptp_clock_event() with a NULL pointer. The ice driver has already released the PTP clock by the time the interrupt for the next external timestamp event occurs. To fix this, modify the ice_ptp_extts_event() function to check the PTP state and bail early if PTP is not ready.
A use-after-free flaw was found in ndlc_remove in drivers/nfc/st-nci/ndlc.c in the Linux Kernel. This flaw could allow an attacker to crash the system due to a race problem.
A use-after-free flaw was found in io_uring/poll.c in io_poll_check_events in the io_uring subcomponent in the Linux Kernel due to a race condition of poll_refs. This flaw may cause a NULL pointer dereference.
A use-after-free flaw was found in qdisc_graft in net/sched/sch_api.c in the Linux Kernel due to a race problem. This flaw leads to a denial of service issue. If patch ebda44da44f6 ("net: sched: fix race condition in qdisc_graft()") not applied yet, then kernel could be affected.
In the Linux kernel, the following vulnerability has been resolved: i2c: sprd: fix reference leak when pm_runtime_get_sync fails The PM reference count is not expected to be incremented on return in sprd_i2c_master_xfer() and sprd_i2c_remove(). However, pm_runtime_get_sync will increment the PM reference count even failed. Forgetting to putting operation will result in a reference leak here. Replace it with pm_runtime_resume_and_get to keep usage counter balanced.
In the Linux kernel, the following vulnerability has been resolved: drm/virtio: Fix GEM handle creation UAF Userspace can guess the handle value and try to race GEM object creation with handle close, resulting in a use-after-free if we dereference the object after dropping the handle's reference. For that reason, dropping the handle's reference must be done *after* we are done dereferencing the object.
In the Linux kernel, the following vulnerability has been resolved: nbd: call genl_unregister_family() first in nbd_cleanup() Otherwise there may be race between module removal and the handling of netlink command, which can lead to the oops as shown below: BUG: kernel NULL pointer dereference, address: 0000000000000098 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 31299 Comm: nbd-client Tainted: G E 5.14.0-rc4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:down_write+0x1a/0x50 Call Trace: start_creating+0x89/0x130 debugfs_create_dir+0x1b/0x130 nbd_start_device+0x13d/0x390 [nbd] nbd_genl_connect+0x42f/0x748 [nbd] genl_family_rcv_msg_doit.isra.0+0xec/0x150 genl_rcv_msg+0xe5/0x1e0 netlink_rcv_skb+0x55/0x100 genl_rcv+0x29/0x40 netlink_unicast+0x1a8/0x250 netlink_sendmsg+0x21b/0x430 ____sys_sendmsg+0x2a4/0x2d0 ___sys_sendmsg+0x81/0xc0 __sys_sendmsg+0x62/0xb0 __x64_sys_sendmsg+0x1f/0x30 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: nbd(E-)
In the Linux kernel, the following vulnerability has been resolved: char: tpm: Protect tpm_pm_suspend with locks Currently tpm transactions are executed unconditionally in tpm_pm_suspend() function, which may lead to races with other tpm accessors in the system. Specifically, the hw_random tpm driver makes use of tpm_get_random(), and this function is called in a loop from a kthread, which means it's not frozen alongside userspace, and so can race with the work done during system suspend: tpm tpm0: tpm_transmit: tpm_recv: error -52 tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics CPU: 0 PID: 1 Comm: init Not tainted 6.1.0-rc5+ #135 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20220807_005459-localhost 04/01/2014 Call Trace: tpm_tis_status.cold+0x19/0x20 tpm_transmit+0x13b/0x390 tpm_transmit_cmd+0x20/0x80 tpm1_pm_suspend+0xa6/0x110 tpm_pm_suspend+0x53/0x80 __pnp_bus_suspend+0x35/0xe0 __device_suspend+0x10f/0x350 Fix this by calling tpm_try_get_ops(), which itself is a wrapper around tpm_chip_start(), but takes the appropriate mutex. [Jason: reworked commit message, added metadata]
In the Linux kernel before 6.1.13, there is a double free in net/mpls/af_mpls.c upon an allocation failure (for registering the sysctl table under a new location) during the renaming of a device.
In the Linux kernel, the following vulnerability has been resolved: XArray: Fix xas_create_range() when multi-order entry present If there is already an entry present that is of order >= XA_CHUNK_SHIFT when we call xas_create_range(), xas_create_range() will misinterpret that entry as a node and dereference xa_node->parent, generally leading to a crash that looks something like this: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 32 Comm: khugepaged Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13 #0 RIP: 0010:xa_parent_locked include/linux/xarray.h:1207 [inline] RIP: 0010:xas_create_range+0x2d9/0x6e0 lib/xarray.c:725 It's deterministically reproducable once you know what the problem is, but producing it in a live kernel requires khugepaged to hit a race. While the problem has been present since xas_create_range() was introduced, I'm not aware of a way to hit it before the page cache was converted to use multi-index entries.
In the Linux kernel, the following vulnerability has been resolved: net: mana: Fix race on per-CQ variable napi work_done After calling napi_complete_done(), the NAPIF_STATE_SCHED bit may be cleared, and another CPU can start napi thread and access per-CQ variable, cq->work_done. If the other thread (for example, from busy_poll) sets it to a value >= budget, this thread will continue to run when it should stop, and cause memory corruption and panic. To fix this issue, save the per-CQ work_done variable in a local variable before napi_complete_done(), so it won't be corrupted by a possible concurrent thread after napi_complete_done(). Also, add a flag bit to advertise to the NIC firmware: the NAPI work_done variable race is fixed, so the driver is able to reliably support features like busy_poll.
In the Linux kernel, the following vulnerability has been resolved: net/smc: Transitional solution for clcsock race issue We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace: <TASK> __sys_setsockopt+0xfc/0x190 __x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK> This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released before access. In case that a crash of the same reason happens in smc_getsockopt() or smc_switch_to_fallback(), this patch also checkes smc->clcsock in them too. And the caller of smc_switch_to_fallback() will identify whether fallback succeeds according to the return value.
In the Linux kernel, the following vulnerability has been resolved: USB: gadgetfs: Fix race between mounting and unmounting The syzbot fuzzer and Gerald Lee have identified a use-after-free bug in the gadgetfs driver, involving processes concurrently mounting and unmounting the gadgetfs filesystem. In particular, gadgetfs_fill_super() can race with gadgetfs_kill_sb(), causing the latter to deallocate the_device while the former is using it. The output from KASAN says, in part: BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:102 [inline] BUG: KASAN: use-after-free in atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline] BUG: KASAN: use-after-free in __refcount_sub_and_test include/linux/refcount.h:272 [inline] BUG: KASAN: use-after-free in __refcount_dec_and_test include/linux/refcount.h:315 [inline] BUG: KASAN: use-after-free in refcount_dec_and_test include/linux/refcount.h:333 [inline] BUG: KASAN: use-after-free in put_dev drivers/usb/gadget/legacy/inode.c:159 [inline] BUG: KASAN: use-after-free in gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086 Write of size 4 at addr ffff8880276d7840 by task syz-executor126/18689 CPU: 0 PID: 18689 Comm: syz-executor126 Not tainted 6.1.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> ... atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline] __refcount_sub_and_test include/linux/refcount.h:272 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] put_dev drivers/usb/gadget/legacy/inode.c:159 [inline] gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086 deactivate_locked_super+0xa7/0xf0 fs/super.c:332 vfs_get_super fs/super.c:1190 [inline] get_tree_single+0xd0/0x160 fs/super.c:1207 vfs_get_tree+0x88/0x270 fs/super.c:1531 vfs_fsconfig_locked fs/fsopen.c:232 [inline] The simplest solution is to ensure that gadgetfs_fill_super() and gadgetfs_kill_sb() are serialized by making them both acquire a new mutex.
In the Linux kernel, the following vulnerability has been resolved: scsi: qla2xxx: Fix memory leak in __qlt_24xx_handle_abts() Commit 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") made the __qlt_24xx_handle_abts() function return early if tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean up the allocated memory for the management command.
A vulnerability was found in Linux kernel, where a use-after-frees in nouveau's postclose() handler could happen if removing device (that is not common to remove video card physically without power-off, but same happens if "unbind" the driver).
A use-after-free flaw was found in the Linux kernel's Memory Management subsystem when a user wins two races at the same time with a fail in the mas_prev_slot function. This issue could allow a local user to crash the system.
In the Linux kernel, the following vulnerability has been resolved: btrfs: qgroup: do not warn on record without old_roots populated [BUG] There are some reports from the mailing list that since v6.1 kernel, the WARN_ON() inside btrfs_qgroup_account_extent() gets triggered during rescan: WARNING: CPU: 3 PID: 6424 at fs/btrfs/qgroup.c:2756 btrfs_qgroup_account_extents+0x1ae/0x260 [btrfs] CPU: 3 PID: 6424 Comm: snapperd Tainted: P OE 6.1.2-1-default #1 openSUSE Tumbleweed 05c7a1b1b61d5627475528f71f50444637b5aad7 RIP: 0010:btrfs_qgroup_account_extents+0x1ae/0x260 [btrfs] Call Trace: <TASK> btrfs_commit_transaction+0x30c/0xb40 [btrfs c39c9c546c241c593f03bd6d5f39ea1b676250f6] ? start_transaction+0xc3/0x5b0 [btrfs c39c9c546c241c593f03bd6d5f39ea1b676250f6] btrfs_qgroup_rescan+0x42/0xc0 [btrfs c39c9c546c241c593f03bd6d5f39ea1b676250f6] btrfs_ioctl+0x1ab9/0x25c0 [btrfs c39c9c546c241c593f03bd6d5f39ea1b676250f6] ? __rseq_handle_notify_resume+0xa9/0x4a0 ? mntput_no_expire+0x4a/0x240 ? __seccomp_filter+0x319/0x4d0 __x64_sys_ioctl+0x90/0xd0 do_syscall_64+0x5b/0x80 ? syscall_exit_to_user_mode+0x17/0x40 ? do_syscall_64+0x67/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fd9b790d9bf </TASK> [CAUSE] Since commit e15e9f43c7ca ("btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting"), if our qgroup is already in inconsistent state, we will no longer do the time-consuming backref walk. This can leave some qgroup records without a valid old_roots ulist. Normally this is fine, as btrfs_qgroup_account_extents() would also skip those records if we have NO_ACCOUNTING flag set. But there is a small window, if we have NO_ACCOUNTING flag set, and inserted some qgroup_record without a old_roots ulist, but then the user triggered a qgroup rescan. During btrfs_qgroup_rescan(), we firstly clear NO_ACCOUNTING flag, then commit current transaction. And since we have a qgroup_record with old_roots = NULL, we trigger the WARN_ON() during btrfs_qgroup_account_extents(). [FIX] Unfortunately due to the introduction of NO_ACCOUNTING flag, the assumption that every qgroup_record would have its old_roots populated is no longer correct. Fix the false alerts and drop the WARN_ON().
Memory leaks in *clock_source_create() functions under drivers/gpu/drm/amd/display/dc in the Linux kernel before 5.3.8 allow attackers to cause a denial of service (memory consumption). This affects the dce112_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c, the dce100_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c, the dcn10_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c, the dcn20_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c, the dce120_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c, the dce110_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c, and the dce80_clock_source_create() function in drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c, aka CID-055e547478a1.
Arm guests can cause Dom0 DoS via PV devices When mapping pages of guests on Arm, dom0 is using an rbtree to keep track of the foreign mappings. Updating of that rbtree is not always done completely with the related lock held, resulting in a small race window, which can be used by unprivileged guests via PV devices to cause inconsistencies of the rbtree. These inconsistencies can lead to Denial of Service (DoS) of dom0, e.g. by causing crashes or the inability to perform further mappings of other guests' memory pages.
In the Linux kernel, the following vulnerability has been resolved: drm/bridge: sii902x: Fix probing race issue A null pointer dereference crash has been observed rarely on TI platforms using sii9022 bridge: [ 53.271356] sii902x_get_edid+0x34/0x70 [sii902x] [ 53.276066] sii902x_bridge_get_edid+0x14/0x20 [sii902x] [ 53.281381] drm_bridge_get_edid+0x20/0x34 [drm] [ 53.286305] drm_bridge_connector_get_modes+0x8c/0xcc [drm_kms_helper] [ 53.292955] drm_helper_probe_single_connector_modes+0x190/0x538 [drm_kms_helper] [ 53.300510] drm_client_modeset_probe+0x1f0/0xbd4 [drm] [ 53.305958] __drm_fb_helper_initial_config_and_unlock+0x50/0x510 [drm_kms_helper] [ 53.313611] drm_fb_helper_initial_config+0x48/0x58 [drm_kms_helper] [ 53.320039] drm_fbdev_dma_client_hotplug+0x84/0xd4 [drm_dma_helper] [ 53.326401] drm_client_register+0x5c/0xa0 [drm] [ 53.331216] drm_fbdev_dma_setup+0xc8/0x13c [drm_dma_helper] [ 53.336881] tidss_probe+0x128/0x264 [tidss] [ 53.341174] platform_probe+0x68/0xc4 [ 53.344841] really_probe+0x188/0x3c4 [ 53.348501] __driver_probe_device+0x7c/0x16c [ 53.352854] driver_probe_device+0x3c/0x10c [ 53.357033] __device_attach_driver+0xbc/0x158 [ 53.361472] bus_for_each_drv+0x88/0xe8 [ 53.365303] __device_attach+0xa0/0x1b4 [ 53.369135] device_initial_probe+0x14/0x20 [ 53.373314] bus_probe_device+0xb0/0xb4 [ 53.377145] deferred_probe_work_func+0xcc/0x124 [ 53.381757] process_one_work+0x1f0/0x518 [ 53.385770] worker_thread+0x1e8/0x3dc [ 53.389519] kthread+0x11c/0x120 [ 53.392750] ret_from_fork+0x10/0x20 The issue here is as follows: - tidss probes, but is deferred as sii902x is still missing. - sii902x starts probing and enters sii902x_init(). - sii902x calls drm_bridge_add(). Now the sii902x bridge is ready from DRM's perspective. - sii902x calls sii902x_audio_codec_init() and platform_device_register_data() - The registration of the audio platform device causes probing of the deferred devices. - tidss probes, which eventually causes sii902x_bridge_get_edid() to be called. - sii902x_bridge_get_edid() tries to use the i2c to read the edid. However, the sii902x driver has not set up the i2c part yet, leading to the crash. Fix this by moving the drm_bridge_add() to the end of the sii902x_init(), which is also at the very end of sii902x_probe().
In the Linux kernel, the following vulnerability has been resolved: firmware: qcom: uefisecapp: fix efivars registration race Since the conversion to using the TZ allocator, the efivars service is registered before the memory pool has been allocated, something which can lead to a NULL-pointer dereference in case of a racing EFI variable access. Make sure that all resources have been set up before registering the efivars.
In the Linux kernel, the following vulnerability has been resolved: platform/x86: lenovo-yoga-tab2-pro-1380-fastcharger: fix serdev race The yt2_1380_fc_serdev_probe() function calls devm_serdev_device_open() before setting the client ops via serdev_device_set_client_ops(). This ordering can trigger a NULL pointer dereference in the serdev controller's receive_buf handler, as it assumes serdev->ops is valid when SERPORT_ACTIVE is set. This is similar to the issue fixed in commit 5e700b384ec1 ("platform/chrome: cros_ec_uart: properly fix race condition") where devm_serdev_device_open() was called before fully initializing the device. Fix the race by ensuring client ops are set before enabling the port via devm_serdev_device_open(). Note, serdev_device_set_baudrate() and serdev_device_set_flow_control() calls should be after the devm_serdev_device_open() call.
In the Linux kernel, the following vulnerability has been resolved: platform/x86: dell-uart-backlight: fix serdev race The dell_uart_bl_serdev_probe() function calls devm_serdev_device_open() before setting the client ops via serdev_device_set_client_ops(). This ordering can trigger a NULL pointer dereference in the serdev controller's receive_buf handler, as it assumes serdev->ops is valid when SERPORT_ACTIVE is set. This is similar to the issue fixed in commit 5e700b384ec1 ("platform/chrome: cros_ec_uart: properly fix race condition") where devm_serdev_device_open() was called before fully initializing the device. Fix the race by ensuring client ops are set before enabling the port via devm_serdev_device_open(). Note, serdev_device_set_baudrate() and serdev_device_set_flow_control() calls should be after the devm_serdev_device_open() call.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix race between transaction aborts and fsyncs leading to use-after-free There is a race between a task aborting a transaction during a commit, a task doing an fsync and the transaction kthread, which leads to an use-after-free of the log root tree. When this happens, it results in a stack trace like the following: BTRFS info (device dm-0): forced readonly BTRFS warning (device dm-0): Skipping commit of aborted transaction. BTRFS: error (device dm-0) in cleanup_transaction:1958: errno=-5 IO failure BTRFS warning (device dm-0): lost page write due to IO error on /dev/mapper/error-test (-5) BTRFS warning (device dm-0): Skipping commit of aborted transaction. BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0xa4e8 len 4096 err no 10 BTRFS error (device dm-0): error writing primary super block to device 1 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e000 len 4096 err no 10 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e008 len 4096 err no 10 BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e010 len 4096 err no 10 BTRFS: error (device dm-0) in write_all_supers:4110: errno=-5 IO failure (1 errors while writing supers) BTRFS: error (device dm-0) in btrfs_sync_log:3308: errno=-5 IO failure general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b68: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 2 PID: 2458471 Comm: fsstress Not tainted 5.12.0-rc5-btrfs-next-84 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x139/0xa40 Code: c0 74 19 (...) RSP: 0018:ffff9f18830d7b00 EFLAGS: 00010202 RAX: 6b6b6b6b6b6b6b68 RBX: 0000000000000001 RCX: 0000000000000002 RDX: ffffffffb9c54d13 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff9f18830d7bc0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff9f18830d7be0 R11: 0000000000000001 R12: ffff8c6cd199c040 R13: ffff8c6c95821358 R14: 00000000fffffffb R15: ffff8c6cbcf01358 FS: 00007fa9140c2b80(0000) GS:ffff8c6fac600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa913d52000 CR3: 000000013d2b4003 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? __btrfs_handle_fs_error+0xde/0x146 [btrfs] ? btrfs_sync_log+0x7c1/0xf20 [btrfs] ? btrfs_sync_log+0x7c1/0xf20 [btrfs] btrfs_sync_log+0x7c1/0xf20 [btrfs] btrfs_sync_file+0x40c/0x580 [btrfs] do_fsync+0x38/0x70 __x64_sys_fsync+0x10/0x20 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fa9142a55c3 Code: 8b 15 09 (...) RSP: 002b:00007fff26278d48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a RAX: ffffffffffffffda RBX: 0000563c83cb4560 RCX: 00007fa9142a55c3 RDX: 00007fff26278cb0 RSI: 00007fff26278cb0 RDI: 0000000000000005 RBP: 0000000000000005 R08: 0000000000000001 R09: 00007fff26278d5c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000340 R13: 00007fff26278de0 R14: 00007fff26278d96 R15: 0000563c83ca57c0 Modules linked in: btrfs dm_zero dm_snapshot dm_thin_pool (...) ---[ end trace ee2f1b19327d791d ]--- The steps that lead to this crash are the following: 1) We are at transaction N; 2) We have two tasks with a transaction handle attached to transaction N. Task A and Task B. Task B is doing an fsync; 3) Task B is at btrfs_sync_log(), and has saved fs_info->log_root_tree into a local variable named 'log_root_tree' at the top of btrfs_sync_log(). Task B is about to call write_all_supers(), but before that... 4) Task A calls btrfs_commit_transaction(), and after it sets the transaction state to TRANS_STATE_COMMIT_START, an error happens before it w ---truncated---
In the Linux kernel, the following vulnerability has been resolved: netfilter: conntrack: serialize hash resizes and cleanups Syzbot was able to trigger the following warning [1] No repro found by syzbot yet but I was able to trigger similar issue by having 2 scripts running in parallel, changing conntrack hash sizes, and: for j in `seq 1 1000` ; do unshare -n /bin/true >/dev/null ; done It would take more than 5 minutes for net_namespace structures to be cleaned up. This is because nf_ct_iterate_cleanup() has to restart everytime a resize happened. By adding a mutex, we can serialize hash resizes and cleanups and also make get_next_corpse() faster by skipping over empty buckets. Even without resizes in the picture, this patch considerably speeds up network namespace dismantles. [1] INFO: task syz-executor.0:8312 can't die for more than 144 seconds. task:syz-executor.0 state:R running task stack:25672 pid: 8312 ppid: 6573 flags:0x00004006 Call Trace: context_switch kernel/sched/core.c:4955 [inline] __schedule+0x940/0x26f0 kernel/sched/core.c:6236 preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:6408 preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:35 __local_bh_enable_ip+0x109/0x120 kernel/softirq.c:390 local_bh_enable include/linux/bottom_half.h:32 [inline] get_next_corpse net/netfilter/nf_conntrack_core.c:2252 [inline] nf_ct_iterate_cleanup+0x15a/0x450 net/netfilter/nf_conntrack_core.c:2275 nf_conntrack_cleanup_net_list+0x14c/0x4f0 net/netfilter/nf_conntrack_core.c:2469 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:171 setup_net+0x639/0xa30 net/core/net_namespace.c:349 copy_net_ns+0x319/0x760 net/core/net_namespace.c:470 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226 ksys_unshare+0x445/0x920 kernel/fork.c:3128 __do_sys_unshare kernel/fork.c:3202 [inline] __se_sys_unshare kernel/fork.c:3200 [inline] __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3200 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f63da68e739 RSP: 002b:00007f63d7c05188 EFLAGS: 00000246 ORIG_RAX: 0000000000000110 RAX: ffffffffffffffda RBX: 00007f63da792f80 RCX: 00007f63da68e739 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000000 RBP: 00007f63da6e8cc4 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f63da792f80 R13: 00007fff50b75d3f R14: 00007f63d7c05300 R15: 0000000000022000 Showing all locks held in the system: 1 lock held by khungtaskd/27: #0: ffffffff8b980020 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446 2 locks held by kworker/u4:2/153: #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1198 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:634 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x896/0x1690 kernel/workqueue.c:2268 #1: ffffc9000140fdb0 ((kfence_timer).work){+.+.}-{0:0}, at: process_one_work+0x8ca/0x1690 kernel/workqueue.c:2272 1 lock held by systemd-udevd/2970: 1 lock held by in:imklog/6258: #0: ffff88807f970ff0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990 3 locks held by kworker/1:6/8158: 1 lock held by syz-executor.0/8312: 2 locks held by kworker/u4:13/9320: 1 lock held by ---truncated---
In the Linux kernel, the following vulnerability has been resolved: xprtrdma: Fix cwnd update ordering After a reconnect, the reply handler is opening the cwnd (and thus enabling more RPC Calls to be sent) /before/ rpcrdma_post_recvs() can post enough Receive WRs to receive their replies. This causes an RNR and the new connection is lost immediately. The race is most clearly exposed when KASAN and disconnect injection are enabled. This slows down rpcrdma_rep_create() enough to allow the send side to post a bunch of RPC Calls before the Receive completion handler can invoke ib_post_recv().