In the Linux kernel, the following vulnerability has been resolved: IB/ipoib: Fix mcast list locking Releasing the `priv->lock` while iterating the `priv->multicast_list` in `ipoib_mcast_join_task()` opens a window for `ipoib_mcast_dev_flush()` to remove the items while in the middle of iteration. If the mcast is removed while the lock was dropped, the for loop spins forever resulting in a hard lockup (as was reported on RHEL 4.18.0-372.75.1.el8_6 kernel): Task A (kworker/u72:2 below) | Task B (kworker/u72:0 below) -----------------------------------+----------------------------------- ipoib_mcast_join_task(work) | ipoib_ib_dev_flush_light(work) spin_lock_irq(&priv->lock) | __ipoib_ib_dev_flush(priv, ...) list_for_each_entry(mcast, | ipoib_mcast_dev_flush(dev = priv->dev) &priv->multicast_list, list) | ipoib_mcast_join(dev, mcast) | spin_unlock_irq(&priv->lock) | | spin_lock_irqsave(&priv->lock, flags) | list_for_each_entry_safe(mcast, tmcast, | &priv->multicast_list, list) | list_del(&mcast->list); | list_add_tail(&mcast->list, &remove_list) | spin_unlock_irqrestore(&priv->lock, flags) spin_lock_irq(&priv->lock) | | ipoib_mcast_remove_list(&remove_list) (Here, `mcast` is no longer on the | list_for_each_entry_safe(mcast, tmcast, `priv->multicast_list` and we keep | remove_list, list) spinning on the `remove_list` of | >>> wait_for_completion(&mcast->done) the other thread which is blocked | and the list is still valid on | it's stack.) Fix this by keeping the lock held and changing to GFP_ATOMIC to prevent eventual sleeps. Unfortunately we could not reproduce the lockup and confirm this fix but based on the code review I think this fix should address such lockups. crash> bc 31 PID: 747 TASK: ff1c6a1a007e8000 CPU: 31 COMMAND: "kworker/u72:2" -- [exception RIP: ipoib_mcast_join_task+0x1b1] RIP: ffffffffc0944ac1 RSP: ff646f199a8c7e00 RFLAGS: 00000002 RAX: 0000000000000000 RBX: ff1c6a1a04dc82f8 RCX: 0000000000000000 work (&priv->mcast_task{,.work}) RDX: ff1c6a192d60ac68 RSI: 0000000000000286 RDI: ff1c6a1a04dc8000 &mcast->list RBP: ff646f199a8c7e90 R8: ff1c699980019420 R9: ff1c6a1920c9a000 R10: ff646f199a8c7e00 R11: ff1c6a191a7d9800 R12: ff1c6a192d60ac00 mcast R13: ff1c6a1d82200000 R14: ff1c6a1a04dc8000 R15: ff1c6a1a04dc82d8 dev priv (&priv->lock) &priv->multicast_list (aka head) ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- <NMI exception stack> --- #5 [ff646f199a8c7e00] ipoib_mcast_join_task+0x1b1 at ffffffffc0944ac1 [ib_ipoib] #6 [ff646f199a8c7e98] process_one_work+0x1a7 at ffffffff9bf10967 crash> rx ff646f199a8c7e68 ff646f199a8c7e68: ff1c6a1a04dc82f8 <<< work = &priv->mcast_task.work crash> list -hO ipoib_dev_priv.multicast_list ff1c6a1a04dc8000 (empty) crash> ipoib_dev_priv.mcast_task.work.func,mcast_mutex.owner.counter ff1c6a1a04dc8000 mcast_task.work.func = 0xffffffffc0944910 <ipoib_mcast_join_task>, mcast_mutex.owner.counter = 0xff1c69998efec000 crash> b 8 PID: 8 TASK: ff1c69998efec000 CPU: 33 COMMAND: "kworker/u72:0" -- #3 [ff646f1980153d50] wait_for_completion+0x96 at ffffffff9c7d7646 #4 [ff646f1980153d90] ipoib_mcast_remove_list+0x56 at ffffffffc0944dc6 [ib_ipoib] #5 [ff646f1980153de8] ipoib_mcast_dev_flush+0x1a7 at ffffffffc09455a7 [ib_ipoib] #6 [ff646f1980153e58] __ipoib_ib_dev_flush+0x1a4 at ffffffffc09431a4 [ib_ipoib] #7 [ff ---truncated---
In the Linux kernel, the following vulnerability has been resolved: dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock __dma_entry_alloc_check_leak() calls into printk -> serial console output (qcom geni) and grabs port->lock under free_entries_lock spin lock, which is a reverse locking dependency chain as qcom_geni IRQ handler can call into dma-debug code and grab free_entries_lock under port->lock. Move __dma_entry_alloc_check_leak() call out of free_entries_lock scope so that we don't acquire serial console's port->lock under it. Trimmed-down lockdep splat: The existing dependency chain (in reverse order) is: -> #2 (free_entries_lock){-.-.}-{2:2}: _raw_spin_lock_irqsave+0x60/0x80 dma_entry_alloc+0x38/0x110 debug_dma_map_page+0x60/0xf8 dma_map_page_attrs+0x1e0/0x230 dma_map_single_attrs.constprop.0+0x6c/0xc8 geni_se_rx_dma_prep+0x40/0xcc qcom_geni_serial_isr+0x310/0x510 __handle_irq_event_percpu+0x110/0x244 handle_irq_event_percpu+0x20/0x54 handle_irq_event+0x50/0x88 handle_fasteoi_irq+0xa4/0xcc handle_irq_desc+0x28/0x40 generic_handle_domain_irq+0x24/0x30 gic_handle_irq+0xc4/0x148 do_interrupt_handler+0xa4/0xb0 el1_interrupt+0x34/0x64 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x64/0x68 arch_local_irq_enable+0x4/0x8 ____do_softirq+0x18/0x24 ... -> #1 (&port_lock_key){-.-.}-{2:2}: _raw_spin_lock_irqsave+0x60/0x80 qcom_geni_serial_console_write+0x184/0x1dc console_flush_all+0x344/0x454 console_unlock+0x94/0xf0 vprintk_emit+0x238/0x24c vprintk_default+0x3c/0x48 vprintk+0xb4/0xbc _printk+0x68/0x90 register_console+0x230/0x38c uart_add_one_port+0x338/0x494 qcom_geni_serial_probe+0x390/0x424 platform_probe+0x70/0xc0 really_probe+0x148/0x280 __driver_probe_device+0xfc/0x114 driver_probe_device+0x44/0x100 __device_attach_driver+0x64/0xdc bus_for_each_drv+0xb0/0xd8 __device_attach+0xe4/0x140 device_initial_probe+0x1c/0x28 bus_probe_device+0x44/0xb0 device_add+0x538/0x668 of_device_add+0x44/0x50 of_platform_device_create_pdata+0x94/0xc8 of_platform_bus_create+0x270/0x304 of_platform_populate+0xac/0xc4 devm_of_platform_populate+0x60/0xac geni_se_probe+0x154/0x160 platform_probe+0x70/0xc0 ... -> #0 (console_owner){-...}-{0:0}: __lock_acquire+0xdf8/0x109c lock_acquire+0x234/0x284 console_flush_all+0x330/0x454 console_unlock+0x94/0xf0 vprintk_emit+0x238/0x24c vprintk_default+0x3c/0x48 vprintk+0xb4/0xbc _printk+0x68/0x90 dma_entry_alloc+0xb4/0x110 debug_dma_map_sg+0xdc/0x2f8 __dma_map_sg_attrs+0xac/0xe4 dma_map_sgtable+0x30/0x4c get_pages+0x1d4/0x1e4 [msm] msm_gem_pin_pages_locked+0x38/0xac [msm] msm_gem_pin_vma_locked+0x58/0x88 [msm] msm_ioctl_gem_submit+0xde4/0x13ac [msm] drm_ioctl_kernel+0xe0/0x15c drm_ioctl+0x2e8/0x3f4 vfs_ioctl+0x30/0x50 ... Chain exists of: console_owner --> &port_lock_key --> free_entries_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(free_entries_lock); lock(&port_lock_key); lock(free_entries_lock); lock(console_owner); *** DEADLOCK *** Call trace: dump_backtrace+0xb4/0xf0 show_stack+0x20/0x30 dump_stack_lvl+0x60/0x84 dump_stack+0x18/0x24 print_circular_bug+0x1cc/0x234 check_noncircular+0x78/0xac __lock_acquire+0xdf8/0x109c lock_acquire+0x234/0x284 console_flush_all+0x330/0x454 consol ---truncated---
In the Linux kernel, the following vulnerability has been resolved: io_uring: lock overflowing for IOPOLL syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace: io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773 io_fill_cqe_req io_uring/io_uring.h:168 [inline] io_do_iopoll+0x474/0x62c io_uring/rw.c:1065 io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513 io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056 io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869 process_one_work+0x2d8/0x504 kernel/workqueue.c:2289 worker_thread+0x340/0x610 kernel/workqueue.c:2436 kthread+0x12c/0x158 kernel/kthread.c:376 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path.
In the Linux kernel, the following vulnerability has been resolved: scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg() In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard lockup call trace hangs the system. Call Trace: _raw_spin_lock_irqsave+0x32/0x40 lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc] lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc] lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc] lpfc_els_flush_cmd+0x43c/0x670 [lpfc] lpfc_els_flush_all_cmd+0x37/0x60 [lpfc] lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc] lpfc_do_work+0x1485/0x1d70 [lpfc] kthread+0x112/0x130 ret_from_fork+0x1f/0x40 Kernel panic - not syncing: Hard LOCKUP The same CPU tries to claim the phba->port_list_lock twice. Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need to take the phba->port_list_lock within lpfc_dmp_dbg().
In the Linux kernel, the following vulnerability has been resolved: cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all() syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at cpuset_attach() [1], for commit 4f7e7236435ca0ab ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that cpuset_attach() is also called from cgroup_attach_task_all(). Add cpus_read_lock() like what cgroup_procs_write_start() does.
An issue was discovered in the Linux kernel through 5.6.11. btree_gc_coalesce in drivers/md/bcache/btree.c has a deadlock if a coalescing operation fails.
In the Linux kernel, the following vulnerability has been resolved: ocfs2: Avoid touching renamed directory if parent does not change The VFS will not be locking moved directory if its parent does not change. Change ocfs2 rename code to avoid touching renamed directory if its parent does not change as without locking that can corrupt the filesystem.
In the Linux kernel, the following vulnerability has been resolved: serial: imx: fix tx statemachine deadlock When using the serial port as RS485 port, the tx statemachine is used to control the RTS pin to drive the RS485 transceiver TX_EN pin. When the TTY port is closed in the middle of a transmission (for instance during userland application crash), imx_uart_shutdown disables the interface and disables the Transmission Complete interrupt. afer that, imx_uart_stop_tx bails on an incomplete transmission, to be retriggered by the TC interrupt. This interrupt is disabled and therefore the tx statemachine never transitions out of SEND. The statemachine is in deadlock now, and the TX_EN remains low, making the interface useless. imx_uart_stop_tx now checks for incomplete transmission AND whether TC interrupts are enabled before bailing to be retriggered. This makes sure the state machine handling is reached, and is properly set to WAIT_AFTER_SEND.
In the Linux kernel, the following vulnerability has been resolved: gpio: xilinx: Convert gpio_lock to raw spinlock irq_chip functions may be called in raw spinlock context. Therefore, we must also use a raw spinlock for our own internal locking. This fixes the following lockdep splat: [ 5.349336] ============================= [ 5.353349] [ BUG: Invalid wait context ] [ 5.357361] 6.13.0-rc5+ #69 Tainted: G W [ 5.363031] ----------------------------- [ 5.367045] kworker/u17:1/44 is trying to lock: [ 5.371587] ffffff88018b02c0 (&chip->gpio_lock){....}-{3:3}, at: xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.380079] other info that might help us debug this: [ 5.385138] context-{5:5} [ 5.387762] 5 locks held by kworker/u17:1/44: [ 5.392123] #0: ffffff8800014958 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3204) [ 5.402260] #1: ffffffc082fcbdd8 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3205) [ 5.411528] #2: ffffff880172c900 (&dev->mutex){....}-{4:4}, at: __device_attach (drivers/base/dd.c:1006) [ 5.419929] #3: ffffff88039c8268 (request_class#2){+.+.}-{4:4}, at: __setup_irq (kernel/irq/internals.h:156 kernel/irq/manage.c:1596) [ 5.428331] #4: ffffff88039c80c8 (lock_class#2){....}-{2:2}, at: __setup_irq (kernel/irq/manage.c:1614) [ 5.436472] stack backtrace: [ 5.439359] CPU: 2 UID: 0 PID: 44 Comm: kworker/u17:1 Tainted: G W 6.13.0-rc5+ #69 [ 5.448690] Tainted: [W]=WARN [ 5.451656] Hardware name: xlnx,zynqmp (DT) [ 5.455845] Workqueue: events_unbound deferred_probe_work_func [ 5.461699] Call trace: [ 5.464147] show_stack+0x18/0x24 C [ 5.467821] dump_stack_lvl (lib/dump_stack.c:123) [ 5.471501] dump_stack (lib/dump_stack.c:130) [ 5.474824] __lock_acquire (kernel/locking/lockdep.c:4828 kernel/locking/lockdep.c:4898 kernel/locking/lockdep.c:5176) [ 5.478758] lock_acquire (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814) [ 5.482429] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 5.486797] xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8)) [ 5.490737] irq_enable (kernel/irq/internals.h:236 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345) [ 5.494060] __irq_startup (kernel/irq/internals.h:241 kernel/irq/chip.c:180 kernel/irq/chip.c:250) [ 5.497645] irq_startup (kernel/irq/chip.c:270) [ 5.501143] __setup_irq (kernel/irq/manage.c:1807) [ 5.504728] request_threaded_irq (kernel/irq/manage.c:2208)
The handle_stop_signal function in signal.c in Linux kernel 2.6.11 up to other versions before 2.6.13 and 2.6.12.6 allows local users to cause a denial of service (deadlock) by sending a SIGKILL to a real-time threaded process while it is performing a core dump.
In the Linux kernel, the following vulnerability has been resolved: bcache: avoid journal no-space deadlock by reserving 1 journal bucket The journal no-space deadlock was reported time to time. Such deadlock can happen in the following situation. When all journal buckets are fully filled by active jset with heavy write I/O load, the cache set registration (after a reboot) will load all active jsets and inserting them into the btree again (which is called journal replay). If a journaled bkey is inserted into a btree node and results btree node split, new journal request might be triggered. For example, the btree grows one more level after the node split, then the root node record in cache device super block will be upgrade by bch_journal_meta() from bch_btree_set_root(). But there is no space in journal buckets, the journal replay has to wait for new journal bucket to be reclaimed after at least one journal bucket replayed. This is one example that how the journal no-space deadlock happens. The solution to avoid the deadlock is to reserve 1 journal bucket in run time, and only permit the reserved journal bucket to be used during cache set registration procedure for things like journal replay. Then the journal space will never be fully filled, there is no chance for journal no-space deadlock to happen anymore. This patch adds a new member "bool do_reserve" in struct journal, it is inititalized to 0 (false) when struct journal is allocated, and set to 1 (true) by bch_journal_space_reserve() when all initialization done in run_cache_set(). In the run time when journal_reclaim() tries to allocate a new journal bucket, free_journal_buckets() is called to check whether there are enough free journal buckets to use. If there is only 1 free journal bucket and journal->do_reserve is 1 (true), the last bucket is reserved and free_journal_buckets() will return 0 to indicate no free journal bucket. Then journal_reclaim() will give up, and try next time to see whetheer there is free journal bucket to allocate. By this method, there is always 1 jouranl bucket reserved in run time. During the cache set registration, journal->do_reserve is 0 (false), so the reserved journal bucket can be used to avoid the no-space deadlock.
In the Linux kernel, the following vulnerability has been resolved: nvdimm: Fix firmware activation deadlock scenarios Lockdep reports the following deadlock scenarios for CXL root device power-management, device_prepare(), operations, and device_shutdown() operations for 'nd_region' devices: Chain exists of: &nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(system_transition_mutex); lock(&nvdimm_bus->reconfig_mutex); lock(system_transition_mutex); lock(&nvdimm_region_key); Chain exists of: &cxl_nvdimm_bridge_key --> acpi_scan_lock --> &cxl_root_key Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&cxl_root_key); lock(acpi_scan_lock); lock(&cxl_root_key); lock(&cxl_nvdimm_bridge_key); These stem from holding nvdimm_bus_lock() over hibernate_quiet_exec() which walks the entire system device topology taking device_lock() along the way. The nvdimm_bus_lock() is protecting against unregistration, multiple simultaneous ops callers, and preventing activate_show() from racing activate_store(). For the first 2, the lock is redundant. Unregistration already flushes all ops users, and sysfs already prevents multiple threads to be active in an ops handler at the same time. For the last userspace should already be waiting for its last activate_store() to complete, and does not need activate_show() to flush the write side, so this lock usage can be deleted in these attributes.
Array index overflow in the xfrm_sk_policy_insert function in xfrm_user.c in Linux kernel 2.6 allows local users to cause a denial of service (oops or deadlock) and possibly execute arbitrary code via a p->dir value that is larger than XFRM_POLICY_OUT, which is used as an index in the sock->sk_policy array.
In the Linux kernel, the following vulnerability has been resolved: RDMA/hns: Fix soft lockup under heavy CEQE load CEQEs are handled in interrupt handler currently. This may cause the CPU core staying in interrupt context too long and lead to soft lockup under heavy load. Handle CEQEs in BH workqueue and set an upper limit for the number of CEQE handled by a single call of work handler.
In the Linux kernel, the following vulnerability has been resolved: gpio: rcar: Use raw_spinlock to protect register access Use raw_spinlock in order to fix spurious messages about invalid context when spinlock debugging is enabled. The lock is only used to serialize register access. [ 4.239592] ============================= [ 4.239595] [ BUG: Invalid wait context ] [ 4.239599] 6.13.0-rc7-arm64-renesas-05496-gd088502a519f #35 Not tainted [ 4.239603] ----------------------------- [ 4.239606] kworker/u8:5/76 is trying to lock: [ 4.239609] ffff0000091898a0 (&p->lock){....}-{3:3}, at: gpio_rcar_config_interrupt_input_mode+0x34/0x164 [ 4.239641] other info that might help us debug this: [ 4.239643] context-{5:5} [ 4.239646] 5 locks held by kworker/u8:5/76: [ 4.239651] #0: ffff0000080fb148 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x190/0x62c [ 4.250180] OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'frame-master' with a value. [ 4.254094] #1: ffff80008299bd80 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1b8/0x62c [ 4.254109] #2: ffff00000920c8f8 [ 4.258345] OF: /soc/sound@ec500000/ports/port@1/endpoint: Read of boolean property 'bitclock-master' with a value. [ 4.264803] (&dev->mutex){....}-{4:4}, at: __device_attach_async_helper+0x3c/0xdc [ 4.264820] #3: ffff00000a50ca40 (request_class#2){+.+.}-{4:4}, at: __setup_irq+0xa0/0x690 [ 4.264840] #4: [ 4.268872] OF: /soc/sound@ec500000/ports/port@1/endpoint: Read of boolean property 'frame-master' with a value. [ 4.273275] ffff00000a50c8c8 (lock_class){....}-{2:2}, at: __setup_irq+0xc4/0x690 [ 4.296130] renesas_sdhi_internal_dmac ee100000.mmc: mmc1 base at 0x00000000ee100000, max clock rate 200 MHz [ 4.304082] stack backtrace: [ 4.304086] CPU: 1 UID: 0 PID: 76 Comm: kworker/u8:5 Not tainted 6.13.0-rc7-arm64-renesas-05496-gd088502a519f #35 [ 4.304092] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT) [ 4.304097] Workqueue: async async_run_entry_fn [ 4.304106] Call trace: [ 4.304110] show_stack+0x14/0x20 (C) [ 4.304122] dump_stack_lvl+0x6c/0x90 [ 4.304131] dump_stack+0x14/0x1c [ 4.304138] __lock_acquire+0xdfc/0x1584 [ 4.426274] lock_acquire+0x1c4/0x33c [ 4.429942] _raw_spin_lock_irqsave+0x5c/0x80 [ 4.434307] gpio_rcar_config_interrupt_input_mode+0x34/0x164 [ 4.440061] gpio_rcar_irq_set_type+0xd4/0xd8 [ 4.444422] __irq_set_trigger+0x5c/0x178 [ 4.448435] __setup_irq+0x2e4/0x690 [ 4.452012] request_threaded_irq+0xc4/0x190 [ 4.456285] devm_request_threaded_irq+0x7c/0xf4 [ 4.459398] ata1: link resume succeeded after 1 retries [ 4.460902] mmc_gpiod_request_cd_irq+0x68/0xe0 [ 4.470660] mmc_start_host+0x50/0xac [ 4.474327] mmc_add_host+0x80/0xe4 [ 4.477817] tmio_mmc_host_probe+0x2b0/0x440 [ 4.482094] renesas_sdhi_probe+0x488/0x6f4 [ 4.486281] renesas_sdhi_internal_dmac_probe+0x60/0x78 [ 4.491509] platform_probe+0x64/0xd8 [ 4.495178] really_probe+0xb8/0x2a8 [ 4.498756] __driver_probe_device+0x74/0x118 [ 4.503116] driver_probe_device+0x3c/0x154 [ 4.507303] __device_attach_driver+0xd4/0x160 [ 4.511750] bus_for_each_drv+0x84/0xe0 [ 4.515588] __device_attach_async_helper+0xb0/0xdc [ 4.520470] async_run_entry_fn+0x30/0xd8 [ 4.524481] process_one_work+0x210/0x62c [ 4.528494] worker_thread+0x1ac/0x340 [ 4.532245] kthread+0x10c/0x110 [ 4.535476] ret_from_fork+0x10/0x20
In the Linux kernel, the following vulnerability has been resolved: ksmbd: fix WARNING "do not call blocking ops when !TASK_RUNNING" wait_event_timeout() will set the state of the current task to TASK_UNINTERRUPTIBLE, before doing the condition check. This means that ksmbd_durable_scavenger_alive() will try to acquire the mutex while already in a sleeping state. The scheduler warns us by giving the following warning: do not call blocking ops when !TASK_RUNNING; state=2 set at [<0000000061515a6f>] prepare_to_wait_event+0x9f/0x6c0 WARNING: CPU: 2 PID: 4147 at kernel/sched/core.c:10099 __might_sleep+0x12f/0x160 mutex lock is not needed in ksmbd_durable_scavenger_alive().
In the Linux kernel, the following vulnerability has been resolved: nfs_common: must not hold RCU while calling nfsd_file_put_local Move holding the RCU from nfs_to_nfsd_file_put_local to nfs_to_nfsd_net_put. It is the call to nfs_to->nfsd_serv_put that requires the RCU anyway (the puts for nfsd_file and netns were combined to avoid an extra indirect reference but that micro-optimization isn't possible now). This fixes xfstests generic/013 and it triggering: "Voluntary context switch within RCU read-side critical section!" [ 143.545738] Call Trace: [ 143.546206] <TASK> [ 143.546625] ? show_regs+0x6d/0x80 [ 143.547267] ? __warn+0x91/0x140 [ 143.547951] ? rcu_note_context_switch+0x496/0x5d0 [ 143.548856] ? report_bug+0x193/0x1a0 [ 143.549557] ? handle_bug+0x63/0xa0 [ 143.550214] ? exc_invalid_op+0x1d/0x80 [ 143.550938] ? asm_exc_invalid_op+0x1f/0x30 [ 143.551736] ? rcu_note_context_switch+0x496/0x5d0 [ 143.552634] ? wakeup_preempt+0x62/0x70 [ 143.553358] __schedule+0xaa/0x1380 [ 143.554025] ? _raw_spin_unlock_irqrestore+0x12/0x40 [ 143.554958] ? try_to_wake_up+0x1fe/0x6b0 [ 143.555715] ? wake_up_process+0x19/0x20 [ 143.556452] schedule+0x2e/0x120 [ 143.557066] schedule_preempt_disabled+0x19/0x30 [ 143.557933] rwsem_down_read_slowpath+0x24d/0x4a0 [ 143.558818] ? xfs_efi_item_format+0x50/0xc0 [xfs] [ 143.559894] down_read+0x4e/0xb0 [ 143.560519] xlog_cil_commit+0x1b2/0xbc0 [xfs] [ 143.561460] ? _raw_spin_unlock+0x12/0x30 [ 143.562212] ? xfs_inode_item_precommit+0xc7/0x220 [xfs] [ 143.563309] ? xfs_trans_run_precommits+0x69/0xd0 [xfs] [ 143.564394] __xfs_trans_commit+0xb5/0x330 [xfs] [ 143.565367] xfs_trans_roll+0x48/0xc0 [xfs] [ 143.566262] xfs_defer_trans_roll+0x57/0x100 [xfs] [ 143.567278] xfs_defer_finish_noroll+0x27a/0x490 [xfs] [ 143.568342] xfs_defer_finish+0x1a/0x80 [xfs] [ 143.569267] xfs_bunmapi_range+0x4d/0xb0 [xfs] [ 143.570208] xfs_itruncate_extents_flags+0x13d/0x230 [xfs] [ 143.571353] xfs_free_eofblocks+0x12e/0x190 [xfs] [ 143.572359] xfs_file_release+0x12d/0x140 [xfs] [ 143.573324] __fput+0xe8/0x2d0 [ 143.573922] __fput_sync+0x1d/0x30 [ 143.574574] nfsd_filp_close+0x33/0x60 [nfsd] [ 143.575430] nfsd_file_free+0x96/0x150 [nfsd] [ 143.576274] nfsd_file_put+0xf7/0x1a0 [nfsd] [ 143.577104] nfsd_file_put_local+0x18/0x30 [nfsd] [ 143.578070] nfs_close_local_fh+0x101/0x110 [nfs_localio] [ 143.579079] __put_nfs_open_context+0xc9/0x180 [nfs] [ 143.580031] nfs_file_clear_open_context+0x4a/0x60 [nfs] [ 143.581038] nfs_file_release+0x3e/0x60 [nfs] [ 143.581879] __fput+0xe8/0x2d0 [ 143.582464] __fput_sync+0x1d/0x30 [ 143.583108] __x64_sys_close+0x41/0x80 [ 143.583823] x64_sys_call+0x189a/0x20d0 [ 143.584552] do_syscall_64+0x64/0x170 [ 143.585240] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 143.586185] RIP: 0033:0x7f3c5153efd7
In the Linux kernel, the following vulnerability has been resolved: drm/xe/guc_submit: add missing locking in wedged_fini Any non-wedged queue can have a zero refcount here and can be running concurrently with an async queue destroy, therefore dereferencing the queue ptr to check wedge status after the lookup can trigger UAF if queue is not wedged. Fix this by keeping the submission_state lock held around the check to postpone the free and make the check safe, before dropping again around the put() to avoid the deadlock. (cherry picked from commit d28af0b6b9580b9f90c265a7da0315b0ad20bbfd)
In the Linux kernel, the following vulnerability has been resolved: ALSA: timer: Don't take register_mutex with copy_from/to_user() The infamous mmap_lock taken in copy_from/to_user() can be often problematic when it's called inside another mutex, as they might lead to deadlocks. In the case of ALSA timer code, the bad pattern is with guard(mutex)(®ister_mutex) that covers copy_from/to_user() -- which was mistakenly introduced at converting to guard(), and it had been carefully worked around in the past. This patch fixes those pieces simply by moving copy_from/to_user() out of the register mutex lock again.
In the Linux kernel, the following vulnerability has been resolved: USB: gadget: f_midi: f_midi_complete to call queue_work When using USB MIDI, a lock is attempted to be acquired twice through a re-entrant call to f_midi_transmit, causing a deadlock. Fix it by using queue_work() to schedule the inner f_midi_transmit() via a high priority work queue from the completion handler.
In the Linux kernel, the following vulnerability has been resolved: drm/imagination: avoid deadlock on fence release Do scheduler queue fence release processing on a workqueue, rather than in the release function itself. Fixes deadlock issues such as the following: [ 607.400437] ============================================ [ 607.405755] WARNING: possible recursive locking detected [ 607.415500] -------------------------------------------- [ 607.420817] weston:zfq0/24149 is trying to acquire lock: [ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr] [ 607.436728] but task is already holding lock: [ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554 [ 607.451727] other info that might help us debug this: [ 607.458245] Possible unsafe locking scenario: [ 607.464155] CPU0 [ 607.466601] ---- [ 607.469044] lock(reservation_ww_class_mutex); [ 607.473584] lock(reservation_ww_class_mutex); [ 607.478114] *** DEADLOCK ***
In the Linux kernel, the following vulnerability has been resolved: cgroup/cpuset: remove kernfs active break A warning was found: WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828 CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0 RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202 RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000 RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04 RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180 R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08 R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0 FS: 00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: kernfs_drain+0x15e/0x2f0 __kernfs_remove+0x165/0x300 kernfs_remove_by_name_ns+0x7b/0xc0 cgroup_rm_file+0x154/0x1c0 cgroup_addrm_files+0x1c2/0x1f0 css_clear_dir+0x77/0x110 kill_css+0x4c/0x1b0 cgroup_destroy_locked+0x194/0x380 cgroup_rmdir+0x2a/0x140 It can be explained by: rmdir echo 1 > cpuset.cpus kernfs_fop_write_iter // active=0 cgroup_rm_file kernfs_remove_by_name_ns kernfs_get_active // active=1 __kernfs_remove // active=0x80000002 kernfs_drain cpuset_write_resmask wait_event //waiting (active == 0x80000001) kernfs_break_active_protection // active = 0x80000001 // continue kernfs_unbreak_active_protection // active = 0x80000002 ... kernfs_should_drain_open_files // warning occurs kernfs_put_active This warning is caused by 'kernfs_break_active_protection' when it is writing to cpuset.cpus, and the cgroup is removed concurrently. The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change involves calling flush_work(), which can create a multiple processes circular locking dependency that involve cgroup_mutex, potentially leading to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()") added 'kernfs_break_active_protection' in the cpuset_write_resmask. This could lead to this warning. After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug processing synchronous"), the cpuset_write_resmask no longer needs to wait the hotplug to finish, which means that concurrent hotplug and cpuset operations are no longer possible. Therefore, the deadlock doesn't exist anymore and it does not have to 'break active protection' now. To fix this warning, just remove kernfs_break_active_protection operation in the 'cpuset_write_resmask'.
In the Linux kernel, the following vulnerability has been resolved: hwrng: core - Fix page fault dead lock on mmap-ed hwrng There is a dead-lock in the hwrng device read path. This triggers when the user reads from /dev/hwrng into memory also mmap-ed from /dev/hwrng. The resulting page fault triggers a recursive read which then dead-locks. Fix this by using a stack buffer when calling copy_to_user.
In the Linux kernel, the following vulnerability has been resolved: net: rose: lock the socket in rose_bind() syzbot reported a soft lockup in rose_loopback_timer(), with a repro calling bind() from multiple threads. rose_bind() must lock the socket to avoid this issue.
In the Linux kernel, the following vulnerability has been resolved: drm/i915/gt: Use spin_lock_irqsave() in interruptible context spin_lock/unlock() functions used in interrupt contexts could result in a deadlock, as seen in GitLab issue #13399, which occurs when interrupt comes in while holding a lock. Try to remedy the problem by saving irq state before spin lock acquisition. v2: add irqs' state save/restore calls to all locks/unlocks in signal_irq_work() execution (Maciej) v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead of other lock/unlock calls and add Fixes and Cc tags (Tvrtko); change title and commit message (cherry picked from commit c088387ddd6482b40f21ccf23db1125e8fa4af7e)
In the Linux kernel, the following vulnerability has been resolved: tty: xilinx_uartps: split sysrq handling lockdep detects the following circular locking dependency: CPU 0 CPU 1 ========================== ============================ cdns_uart_isr() printk() uart_port_lock(port) console_lock() cdns_uart_console_write() if (!port->sysrq) uart_port_lock(port) uart_handle_break() port->sysrq = ... uart_handle_sysrq_char() printk() console_lock() The fixed commit attempts to avoid this situation by only taking the port lock in cdns_uart_console_write if port->sysrq unset. However, if (as shown above) cdns_uart_console_write runs before port->sysrq is set, then it will try to take the port lock anyway. This may result in a deadlock. Fix this by splitting sysrq handling into two parts. We use the prepare helper under the port lock and defer handling until we release the lock.
In the Linux kernel, the following vulnerability has been resolved: afs: Fix merge preference rule failure condition syzbot reported a lock held when returning to userspace[1]. This is because if argc is less than 0 and the function returns directly, the held inode lock is not released. Fix this by store the error in ret and jump to done to clean up instead of returning directly. [dh: Modified Lizhi Xu's original patch to make it honour the error code from afs_split_string()] [1] WARNING: lock held when returning to user space! 6.13.0-rc3-syzkaller-00209-g499551201b5f #0 Not tainted ------------------------------------------------ syz-executor133/5823 is leaving the kernel with locks still held! 1 lock held by syz-executor133/5823: #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: inode_lock include/linux/fs.h:818 [inline] #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: afs_proc_addr_prefs_write+0x2bb/0x14e0 fs/afs/addr_prefs.c:388
In the Linux kernel, the following vulnerability has been resolved: bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock There are multiple places from where the recovery work gets scheduled asynchronously. Also, there are multiple places where the caller waits synchronously for the recovery to be completed. One such place is during the PM shutdown() callback. If the device is not alive during recovery_work, it will try to reset the device using pci_reset_function(). This function internally will take the device_lock() first before resetting the device. By this time, if the lock has already been acquired, then recovery_work will get stalled while waiting for the lock. And if the lock was already acquired by the caller which waits for the recovery_work to be completed, it will lead to deadlock. This is what happened on the X1E80100 CRD device when the device died before shutdown() callback. Driver core calls the driver's shutdown() callback while holding the device_lock() leading to deadlock. And this deadlock scenario can occur on other paths as well, like during the PM suspend() callback, where the driver core would hold the device_lock() before calling driver's suspend() callback. And if the recovery_work was already started, it could lead to deadlock. This is also observed on the X1E80100 CRD. So to fix both issues, use pci_try_reset_function() in recovery_work. This function first checks for the availability of the device_lock() before trying to reset the device. If the lock is available, it will acquire it and reset the device. Otherwise, it will return -EAGAIN. If that happens, recovery_work will fail with the error message "Recovery failed" as not much could be done.
In the Linux kernel, the following vulnerability has been resolved: NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback Add PF_KCOMPACTD flag and current_is_kcompactd() helper to check for it so nfs_release_folio() can skip calling nfs_wb_folio() from kcompactd. Otherwise NFS can deadlock waiting for kcompactd enduced writeback which recurses back to NFS (which triggers writeback to NFSD via NFS loopback mount on the same host, NFSD blocks waiting for XFS's call to __filemap_get_folio): 6070.550357] INFO: task kcompactd0:58 blocked for more than 4435 seconds. {--- [58] "kcompactd0" [<0>] folio_wait_bit+0xe8/0x200 [<0>] folio_wait_writeback+0x2b/0x80 [<0>] nfs_wb_folio+0x80/0x1b0 [nfs] [<0>] nfs_release_folio+0x68/0x130 [nfs] [<0>] split_huge_page_to_list_to_order+0x362/0x840 [<0>] migrate_pages_batch+0x43d/0xb90 [<0>] migrate_pages_sync+0x9a/0x240 [<0>] migrate_pages+0x93c/0x9f0 [<0>] compact_zone+0x8e2/0x1030 [<0>] compact_node+0xdb/0x120 [<0>] kcompactd+0x121/0x2e0 [<0>] kthread+0xcf/0x100 [<0>] ret_from_fork+0x31/0x40 [<0>] ret_from_fork_asm+0x1a/0x30 ---} [akpm@linux-foundation.org: fix build]
In the Linux kernel, the following vulnerability has been resolved: NFSv4: Fix a deadlock when recovering state on a sillyrenamed file If the file is sillyrenamed, and slated for delete on close, it is possible for a server reboot to triggeer an open reclaim, with can again race with the application call to close(). When that happens, the call to put_nfs_open_context() can trigger a synchronous delegreturn call which deadlocks because it is not marked as privileged. Instead, ensure that the call to nfs4_inode_return_delegation_on_close() catches the delegreturn, and schedules it asynchronously.
In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel Attempt to enable IPsec packet offload in tunnel mode in debug kernel generates the following kernel panic, which is happening due to two issues: 1. In SA add section, the should be _bh() variant when marking SA mode. 2. There is not needed flush_workqueue in SA delete routine. It is not needed as at this stage as it is removed from SADB and the running work will be canceled later in SA free. ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 6.12.0+ #4 Not tainted ----------------------------------------------------- charon/1337 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire: ffff88810f365020 (&xa->xa_lock#24){+.+.}-{3:3}, at: mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core] and this task is already holding: ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30 which would create a new lock dependency: (&x->lock){+.-.}-{3:3} -> (&xa->xa_lock#24){+.+.}-{3:3} but this new dependency connects a SOFTIRQ-irq-safe lock: (&x->lock){+.-.}-{3:3} ... which became SOFTIRQ-irq-safe at: lock_acquire+0x1be/0x520 _raw_spin_lock_bh+0x34/0x40 xfrm_timer_handler+0x91/0xd70 __hrtimer_run_queues+0x1dd/0xa60 hrtimer_run_softirq+0x146/0x2e0 handle_softirqs+0x266/0x860 irq_exit_rcu+0x115/0x1a0 sysvec_apic_timer_interrupt+0x6e/0x90 asm_sysvec_apic_timer_interrupt+0x16/0x20 default_idle+0x13/0x20 default_idle_call+0x67/0xa0 do_idle+0x2da/0x320 cpu_startup_entry+0x50/0x60 start_secondary+0x213/0x2a0 common_startup_64+0x129/0x138 to a SOFTIRQ-irq-unsafe lock: (&xa->xa_lock#24){+.+.}-{3:3} ... which became SOFTIRQ-irq-unsafe at: ... lock_acquire+0x1be/0x520 _raw_spin_lock+0x2c/0x40 xa_set_mark+0x70/0x110 mlx5e_xfrm_add_state+0xe48/0x2290 [mlx5_core] xfrm_dev_state_add+0x3bb/0xd70 xfrm_add_sa+0x2451/0x4a90 xfrm_user_rcv_msg+0x493/0x880 netlink_rcv_skb+0x12e/0x380 xfrm_netlink_rcv+0x6d/0x90 netlink_unicast+0x42f/0x740 netlink_sendmsg+0x745/0xbe0 __sock_sendmsg+0xc5/0x190 __sys_sendto+0x1fe/0x2c0 __x64_sys_sendto+0xdc/0x1b0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&xa->xa_lock#24); local_irq_disable(); lock(&x->lock); lock(&xa->xa_lock#24); <Interrupt> lock(&x->lock); *** DEADLOCK *** 2 locks held by charon/1337: #0: ffffffff87f8f858 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x5e/0x90 #1: ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30 the dependencies between SOFTIRQ-irq-safe lock and the holding lock: -> (&x->lock){+.-.}-{3:3} ops: 29 { HARDIRQ-ON-W at: lock_acquire+0x1be/0x520 _raw_spin_lock_bh+0x34/0x40 xfrm_alloc_spi+0xc0/0xe60 xfrm_alloc_userspi+0x5f6/0xbc0 xfrm_user_rcv_msg+0x493/0x880 netlink_rcv_skb+0x12e/0x380 xfrm_netlink_rcv+0x6d/0x90 netlink_unicast+0x42f/0x740 netlink_sendmsg+0x745/0xbe0 __sock_sendmsg+0xc5/0x190 __sys_sendto+0x1fe/0x2c0 __x64_sys_sendto+0xdc/0x1b0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 IN-SOFTIRQ-W at: lock_acquire+0x1be/0x520 _raw_spin_lock_bh+0x34/0x40 xfrm_timer_handler+0x91/0xd70 __hrtimer_run_queues+0x1dd/0xa60 ---truncated---
In the Linux kernel before 4.16.4, a double-locking error in drivers/usb/dwc3/gadget.c may potentially cause a deadlock with f_hid.
In the Linux kernel, the following vulnerability has been resolved: mm/kmemleak: avoid scanning potential huge holes When using devm_request_free_mem_region() and devm_memremap_pages() to add ZONE_DEVICE memory, if requested free mem region's end pfn were huge(e.g., 0x400000000), the node_end_pfn() will be also huge (see move_pfn_range_to_zone()). Thus it creates a huge hole between node_start_pfn() and node_end_pfn(). We found on some AMD APUs, amdkfd requested such a free mem region and created a huge hole. In such a case, following code snippet was just doing busy test_bit() looping on the huge hole. for (pfn = start_pfn; pfn < end_pfn; pfn++) { struct page *page = pfn_to_online_page(pfn); if (!page) continue; ... } So we got a soft lockup: watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221] CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1 RIP: 0010:pfn_to_online_page+0x5/0xd0 Call Trace: ? kmemleak_scan+0x16a/0x440 kmemleak_write+0x306/0x3a0 ? common_file_perm+0x72/0x170 full_proxy_write+0x5c/0x90 vfs_write+0xb9/0x260 ksys_write+0x67/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae I did some tests with the patch. (1) amdgpu module unloaded before the patch: real 0m0.976s user 0m0.000s sys 0m0.968s after the patch: real 0m0.981s user 0m0.000s sys 0m0.973s (2) amdgpu module loaded before the patch: real 0m35.365s user 0m0.000s sys 0m35.354s after the patch: real 0m1.049s user 0m0.000s sys 0m1.042s
In the Linux kernel, the following vulnerability has been resolved: xen: privcmd: Switch from mutex to spinlock for irqfds irqfd_wakeup() gets EPOLLHUP, when it is called by eventfd_release() by way of wake_up_poll(&ctx->wqh, EPOLLHUP), which gets called under spin_lock_irqsave(). We can't use a mutex here as it will lead to a deadlock. Fix it by switching over to a spin lock.
In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: Take state lock during tx timeout reporter mlx5e_safe_reopen_channels() requires the state lock taken. The referenced changed in the Fixes tag removed the lock to fix another issue. This patch adds it back but at a later point (when calling mlx5e_safe_reopen_channels()) to avoid the deadlock referenced in the Fixes tag.
The inotify_read function in the Linux kernel 2.6.27 to 2.6.27.13, 2.6.28 to 2.6.28.2, and 2.6.29-rc3 allows local users to cause a denial of service (OOPS) via a read with an invalid address to an inotify instance, which causes the device's event list mutex to be unlocked twice and prevents proper synchronization of a data structure for the inotify instance.
In the Linux kernel, the following vulnerability has been resolved: i2c: tegra: Do not mark ACPI devices as irq safe On ACPI machines, the tegra i2c module encounters an issue due to a mutex being called inside a spinlock. This leads to the following bug: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 ... Call trace: __might_sleep __mutex_lock_common mutex_lock_nested acpi_subsys_runtime_resume rpm_resume tegra_i2c_xfer The problem arises because during __pm_runtime_resume(), the spinlock &dev->power.lock is acquired before rpm_resume() is called. Later, rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on mutexes, triggering the error. To address this issue, devices on ACPI are now marked as not IRQ-safe, considering the dependency of acpi_subsys_runtime_resume() on mutexes.
In the Linux kernel, the following vulnerability has been resolved: virtio-blk: don't keep queue frozen during system suspend Commit 4ce6e2db00de ("virtio-blk: Ensure no requests in virtqueues before deleting vqs.") replaces queue quiesce with queue freeze in virtio-blk's PM callbacks. And the motivation is to drain inflight IOs before suspending. block layer's queue freeze looks very handy, but it is also easy to cause deadlock, such as, any attempt to call into bio_queue_enter() may run into deadlock if the queue is frozen in current context. There are all kinds of ->suspend() called in suspend context, so keeping queue frozen in the whole suspend context isn't one good idea. And Marek reported lockdep warning[1] caused by virtio-blk's freeze queue in virtblk_freeze(). [1] https://lore.kernel.org/linux-block/ca16370e-d646-4eee-b9cc-87277c89c43c@samsung.com/ Given the motivation is to drain in-flight IOs, it can be done by calling freeze & unfreeze, meantime restore to previous behavior by keeping queue quiesced during suspend.
In the Linux kernel, the following vulnerability has been resolved: bpf: bpf_local_storage: Always use bpf_mem_alloc in PREEMPT_RT In PREEMPT_RT, kmalloc(GFP_ATOMIC) is still not safe in non preemptible context. bpf_mem_alloc must be used in PREEMPT_RT. This patch is to enforce bpf_mem_alloc in the bpf_local_storage when CONFIG_PREEMPT_RT is enabled. [ 35.118559] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 35.118566] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1832, name: test_progs [ 35.118569] preempt_count: 1, expected: 0 [ 35.118571] RCU nest depth: 1, expected: 1 [ 35.118577] INFO: lockdep is turned off. ... [ 35.118647] __might_resched+0x433/0x5b0 [ 35.118677] rt_spin_lock+0xc3/0x290 [ 35.118700] ___slab_alloc+0x72/0xc40 [ 35.118723] __kmalloc_noprof+0x13f/0x4e0 [ 35.118732] bpf_map_kzalloc+0xe5/0x220 [ 35.118740] bpf_selem_alloc+0x1d2/0x7b0 [ 35.118755] bpf_local_storage_update+0x2fa/0x8b0 [ 35.118784] bpf_sk_storage_get_tracing+0x15a/0x1d0 [ 35.118791] bpf_prog_9a118d86fca78ebb_trace_inet_sock_set_state+0x44/0x66 [ 35.118795] bpf_trace_run3+0x222/0x400 [ 35.118820] __bpf_trace_inet_sock_set_state+0x11/0x20 [ 35.118824] trace_inet_sock_set_state+0x112/0x130 [ 35.118830] inet_sk_state_store+0x41/0x90 [ 35.118836] tcp_set_state+0x3b3/0x640 There is no need to adjust the gfp_flags passing to the bpf_mem_cache_alloc_flags() which only honors the GFP_KERNEL. The verifier has ensured GFP_KERNEL is passed only in sleepable context. It has been an old issue since the first introduction of the bpf_local_storage ~5 years ago, so this patch targets the bpf-next. bpf_mem_alloc is needed to solve it, so the Fixes tag is set to the commit when bpf_mem_alloc was first used in the bpf_local_storage.
In the Linux kernel, the following vulnerability has been resolved: team: prevent adding a device which is already a team device lower Prevent adding a device which is already a team device lower, e.g. adding veth0 if vlan1 was already added and veth0 is a lower of vlan1. This is not useful in practice and can lead to recursive locking: $ ip link add veth0 type veth peer name veth1 $ ip link set veth0 up $ ip link set veth1 up $ ip link add link veth0 name veth0.1 type vlan protocol 802.1Q id 1 $ ip link add team0 type team $ ip link set veth0.1 down $ ip link set veth0.1 master team0 team0: Port device veth0.1 added $ ip link set veth0 down $ ip link set veth0 master team0 ============================================ WARNING: possible recursive locking detected 6.13.0-rc2-virtme-00441-ga14a429069bb #46 Not tainted -------------------------------------------- ip/7684 is trying to acquire lock: ffff888016848e00 (team->team_lock_key){+.+.}-{4:4}, at: team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) but task is already holding lock: ffff888016848e00 (team->team_lock_key){+.+.}-{4:4}, at: team_add_slave (drivers/net/team/team_core.c:1147 drivers/net/team/team_core.c:1977) other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(team->team_lock_key); lock(team->team_lock_key); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by ip/7684: stack backtrace: CPU: 3 UID: 0 PID: 7684 Comm: ip Not tainted 6.13.0-rc2-virtme-00441-ga14a429069bb #46 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:122) print_deadlock_bug.cold (kernel/locking/lockdep.c:3040) __lock_acquire (kernel/locking/lockdep.c:3893 kernel/locking/lockdep.c:5226) ? netlink_broadcast_filtered (net/netlink/af_netlink.c:1548) lock_acquire.part.0 (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? trace_lock_acquire (./include/trace/events/lock.h:24 (discriminator 2)) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? lock_acquire (kernel/locking/lockdep.c:5822) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:735) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? fib_sync_up (net/ipv4/fib_semantics.c:2167) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) notifier_call_chain (kernel/notifier.c:85) call_netdevice_notifiers_info (net/core/dev.c:1996) __dev_notify_flags (net/core/dev.c:8993) ? __dev_change_flags (net/core/dev.c:8975) dev_change_flags (net/core/dev.c:9027) vlan_device_event (net/8021q/vlan.c:85 net/8021q/vlan.c:470) ? br_device_event (net/bridge/br.c:143) notifier_call_chain (kernel/notifier.c:85) call_netdevice_notifiers_info (net/core/dev.c:1996) dev_open (net/core/dev.c:1519 net/core/dev.c:1505) team_add_slave (drivers/net/team/team_core.c:1219 drivers/net/team/team_core.c:1977) ? __pfx_team_add_slave (drivers/net/team/team_core.c:1972) do_set_master (net/core/rtnetlink.c:2917) do_setlink.isra.0 (net/core/rtnetlink.c:3117)
In the Linux kernel, the following vulnerability has been resolved: media: uvcvideo: Fix deadlock during uvc_probe If uvc_probe() fails, it can end up calling uvc_status_unregister() before uvc_status_init() is called. Fix this by checking if dev->status is NULL or not in uvc_status_unregister().
In the Linux kernel, the following vulnerability has been resolved: scsi: megaraid_sas: Fix for a potential deadlock This fixes a 'possible circular locking dependency detected' warning CPU0 CPU1 ---- ---- lock(&instance->reset_mutex); lock(&shost->scan_mutex); lock(&instance->reset_mutex); lock(&shost->scan_mutex); Fix this by temporarily releasing the reset_mutex.
In the Linux kernel, the following vulnerability has been resolved: usb: musb: Fix hardware lockup on first Rx endpoint request There is a possibility that a request's callback could be invoked from usb_ep_queue() (call trace below, supplemented with missing calls): req->complete from usb_gadget_giveback_request (drivers/usb/gadget/udc/core.c:999) usb_gadget_giveback_request from musb_g_giveback (drivers/usb/musb/musb_gadget.c:147) musb_g_giveback from rxstate (drivers/usb/musb/musb_gadget.c:784) rxstate from musb_ep_restart (drivers/usb/musb/musb_gadget.c:1169) musb_ep_restart from musb_ep_restart_resume_work (drivers/usb/musb/musb_gadget.c:1176) musb_ep_restart_resume_work from musb_queue_resume_work (drivers/usb/musb/musb_core.c:2279) musb_queue_resume_work from musb_gadget_queue (drivers/usb/musb/musb_gadget.c:1241) musb_gadget_queue from usb_ep_queue (drivers/usb/gadget/udc/core.c:300) According to the docstring of usb_ep_queue(), this should not happen: "Note that @req's ->complete() callback must never be called from within usb_ep_queue() as that can create deadlock situations." In fact, a hardware lockup might occur in the following sequence: 1. The gadget is initialized using musb_gadget_enable(). 2. Meanwhile, a packet arrives, and the RXPKTRDY flag is set, raising an interrupt. 3. If IRQs are enabled, the interrupt is handled, but musb_g_rx() finds an empty queue (next_request() returns NULL). The interrupt flag has already been cleared by the glue layer handler, but the RXPKTRDY flag remains set. 4. The first request is enqueued using usb_ep_queue(), leading to the call of req->complete(), as shown in the call trace above. 5. If the callback enables IRQs and another packet is waiting, step (3) repeats. The request queue is empty because usb_g_giveback() removes the request before invoking the callback. 6. The endpoint remains locked up, as the interrupt triggered by hardware setting the RXPKTRDY flag has been handled, but the flag itself remains set. For this scenario to occur, it is only necessary for IRQs to be enabled at some point during the complete callback. This happens with the USB Ethernet gadget, whose rx_complete() callback calls netif_rx(). If called in the task context, netif_rx() disables the bottom halves (BHs). When the BHs are re-enabled, IRQs are also enabled to allow soft IRQs to be processed. The gadget itself is initialized at module load (or at boot if built-in), but the first request is enqueued when the network interface is brought up, triggering rx_complete() in the task context via ioctl(). If a packet arrives while the interface is down, it can prevent the interface from receiving any further packets from the USB host. The situation is quite complicated with many parties involved. This particular issue can be resolved in several possible ways: 1. Ensure that callbacks never enable IRQs. This would be difficult to enforce, as discovering how netif_rx() interacts with interrupts was already quite challenging and u_ether is not the only function driver. Similar "bugs" could be hidden in other drivers as well. 2. Disable MUSB interrupts in musb_g_giveback() before calling the callback and re-enable them afterwars (by calling musb_{dis,en}able_interrupts(), for example). This would ensure that MUSB interrupts are not handled during the callback, even if IRQs are enabled. In fact, it would allow IRQs to be enabled when releasing the lock. However, this feels like an inelegant hack. 3. Modify the interrupt handler to clear the RXPKTRDY flag if the request queue is empty. While this approach also feels like a hack, it wastes CPU time by attempting to handle incoming packets when the software is not ready to process them. 4. Flush the Rx FIFO instead of calling rxstate() in musb_ep_restart(). This ensures that the hardware can receive packets when there is at least one request in the queue. Once I ---truncated---
In the Linux kernel, the following vulnerability has been resolved: bpf: fix recursive lock when verdict program return SK_PASS When the stream_verdict program returns SK_PASS, it places the received skb into its own receive queue, but a recursive lock eventually occurs, leading to an operating system deadlock. This issue has been present since v6.9. ''' sk_psock_strp_data_ready write_lock_bh(&sk->sk_callback_lock) strp_data_ready strp_read_sock read_sock -> tcp_read_sock strp_recv cb.rcv_msg -> sk_psock_strp_read # now stream_verdict return SK_PASS without peer sock assign __SK_PASS = sk_psock_map_verd(SK_PASS, NULL) sk_psock_verdict_apply sk_psock_skb_ingress_self sk_psock_skb_ingress_enqueue sk_psock_data_ready read_lock_bh(&sk->sk_callback_lock) <= dead lock ''' This topic has been discussed before, but it has not been fixed. Previous discussion: https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch
fs/splice.c in the splice subsystem in the Linux kernel before 2.6.22.2 does not properly handle a failure of the add_to_page_cache_lru function, and subsequently attempts to unlock a page that was not locked, which allows local users to cause a denial of service (kernel BUG and system crash), as demonstrated by the fio I/O tool.
In the Linux kernel, the following vulnerability has been resolved: mptcp: init: protect sched with rcu_read_lock Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ #11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning.
In the Linux kernel, the following vulnerability has been resolved: tracing/timerlat: Drop interface_lock in stop_kthread() stop_kthread() is the offline callback for "trace/osnoise:online", since commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()"), the following ABBA deadlock scenario is introduced: T1 | T2 [BP] | T3 [AP] osnoise_hotplug_workfn() | work_for_cpu_fn() | cpuhp_thread_fun() | _cpu_down() | osnoise_cpu_die() mutex_lock(&interface_lock) | | stop_kthread() | cpus_write_lock() | mutex_lock(&interface_lock) cpus_read_lock() | cpuhp_kick_ap() | As the interface_lock here in just for protecting the "kthread" field of the osn_var, use xchg() instead to fix this issue. Also use for_each_online_cpu() back in stop_per_cpu_kthreads() as it can take cpu_read_lock() again.
In the Linux kernel, the following vulnerability has been resolved: btrfs: don't take dev_replace rwsem on task already holding it Running fstests btrfs/011 with MKFS_OPTIONS="-O rst" to force the usage of the RAID stripe-tree, we get the following splat from lockdep: BTRFS info (device sdd): dev_replace from /dev/sdd (devid 1) to /dev/sdb started ============================================ WARNING: possible recursive locking detected 6.11.0-rc3-btrfs-for-next #599 Not tainted -------------------------------------------- btrfs/2326 is trying to acquire lock: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 but task is already holding lock: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&fs_info->dev_replace.rwsem); lock(&fs_info->dev_replace.rwsem); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by btrfs/2326: #0: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 stack backtrace: CPU: 1 UID: 0 PID: 2326 Comm: btrfs Not tainted 6.11.0-rc3-btrfs-for-next #599 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x5b/0x80 __lock_acquire+0x2798/0x69d0 ? __pfx___lock_acquire+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 lock_acquire+0x19d/0x4a0 ? btrfs_map_block+0x39f/0x2250 ? __pfx_lock_acquire+0x10/0x10 ? find_held_lock+0x2d/0x110 ? lock_is_held_type+0x8f/0x100 down_read+0x8e/0x440 ? btrfs_map_block+0x39f/0x2250 ? __pfx_down_read+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 btrfs_map_block+0x39f/0x2250 ? btrfs_dev_replace_by_ioctl+0xd69/0x1d00 ? btrfs_bio_counter_inc_blocked+0xd9/0x2e0 ? __kasan_slab_alloc+0x6e/0x70 ? __pfx_btrfs_map_block+0x10/0x10 ? __pfx_btrfs_bio_counter_inc_blocked+0x10/0x10 ? kmem_cache_alloc_noprof+0x1f2/0x300 ? mempool_alloc_noprof+0xed/0x2b0 btrfs_submit_chunk+0x28d/0x17e0 ? __pfx_btrfs_submit_chunk+0x10/0x10 ? bvec_alloc+0xd7/0x1b0 ? bio_add_folio+0x171/0x270 ? __pfx_bio_add_folio+0x10/0x10 ? __kasan_check_read+0x20/0x20 btrfs_submit_bio+0x37/0x80 read_extent_buffer_pages+0x3df/0x6c0 btrfs_read_extent_buffer+0x13e/0x5f0 read_tree_block+0x81/0xe0 read_block_for_search+0x4bd/0x7a0 ? __pfx_read_block_for_search+0x10/0x10 btrfs_search_slot+0x78d/0x2720 ? __pfx_btrfs_search_slot+0x10/0x10 ? lock_is_held_type+0x8f/0x100 ? kasan_save_track+0x14/0x30 ? __kasan_slab_alloc+0x6e/0x70 ? kmem_cache_alloc_noprof+0x1f2/0x300 btrfs_get_raid_extent_offset+0x181/0x820 ? __pfx_lock_acquire+0x10/0x10 ? __pfx_btrfs_get_raid_extent_offset+0x10/0x10 ? down_read+0x194/0x440 ? __pfx_down_read+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 btrfs_map_block+0x5b5/0x2250 ? __pfx_btrfs_map_block+0x10/0x10 scrub_submit_initial_read+0x8fe/0x11b0 ? __pfx_scrub_submit_initial_read+0x10/0x10 submit_initial_group_read+0x161/0x3a0 ? lock_release+0x20e/0x710 ? __pfx_submit_initial_group_read+0x10/0x10 ? __pfx_lock_release+0x10/0x10 scrub_simple_mirror.isra.0+0x3eb/0x580 scrub_stripe+0xe4d/0x1440 ? lock_release+0x20e/0x710 ? __pfx_scrub_stripe+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 scrub_chunk+0x257/0x4a0 scrub_enumerate_chunks+0x64c/0xf70 ? __mutex_unlock_slowpath+0x147/0x5f0 ? __pfx_scrub_enumerate_chunks+0x10/0x10 ? bit_wait_timeout+0xb0/0x170 ? __up_read+0x189/0x700 ? scrub_workers_get+0x231/0x300 ? up_write+0x490/0x4f0 btrfs_scrub_dev+0x52e/0xcd0 ? create_pending_snapshots+0x230/0x250 ? __pfx_btrfs_scrub_dev+0x10/0x10 btrfs_dev_replace_by_ioctl+0xd69/0x1d00 ? lock_acquire+0x19d/0x4a0 ? __pfx_btrfs_dev_replace_by_ioctl+0x10/0x10 ? ---truncated---
In the Linux kernel, the following vulnerability has been resolved: RDMA/hns: Fix spin_unlock_irqrestore() called with IRQs enabled Fix missuse of spin_lock_irq()/spin_unlock_irq() when spin_lock_irqsave()/spin_lock_irqrestore() was hold. This was discovered through the lock debugging, and the corresponding log is as follows: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 96 PID: 2074 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x30/0x40 ... Call trace: warn_bogus_irq_restore+0x30/0x40 _raw_spin_unlock_irqrestore+0x84/0xc8 add_qp_to_list+0x11c/0x148 [hns_roce_hw_v2] hns_roce_create_qp_common.constprop.0+0x240/0x780 [hns_roce_hw_v2] hns_roce_create_qp+0x98/0x160 [hns_roce_hw_v2] create_qp+0x138/0x258 ib_create_qp_kernel+0x50/0xe8 create_mad_qp+0xa8/0x128 ib_mad_port_open+0x218/0x448 ib_mad_init_device+0x70/0x1f8 add_client_context+0xfc/0x220 enable_device_and_get+0xd0/0x140 ib_register_device.part.0+0xf4/0x1c8 ib_register_device+0x34/0x50 hns_roce_register_device+0x174/0x3d0 [hns_roce_hw_v2] hns_roce_init+0xfc/0x2c0 [hns_roce_hw_v2] __hns_roce_hw_v2_init_instance+0x7c/0x1d0 [hns_roce_hw_v2] hns_roce_hw_v2_init_instance+0x9c/0x180 [hns_roce_hw_v2]
In the Linux kernel, the following vulnerability has been resolved: pinmux: Use sequential access to access desc->pinmux data When two client of the same gpio call pinctrl_select_state() for the same functionality, we are seeing NULL pointer issue while accessing desc->mux_owner. Let's say two processes A, B executing in pin_request() for the same pin and process A updates the desc->mux_usecount but not yet updated the desc->mux_owner while process B see the desc->mux_usecount which got updated by A path and further executes strcmp and while accessing desc->mux_owner it crashes with NULL pointer. Serialize the access to mux related setting with a mutex lock. cpu0 (process A) cpu1(process B) pinctrl_select_state() { pinctrl_select_state() { pin_request() { pin_request() { ... .... } else { desc->mux_usecount++; desc->mux_usecount && strcmp(desc->mux_owner, owner)) { if (desc->mux_usecount > 1) return 0; desc->mux_owner = owner; } }