In the Linux kernel, the following vulnerability has been resolved: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc() kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and always allocate memory using the hardcoded GFP_KERNEL flag. This makes them inconsistent with vmalloc(), which was recently extended to support GFP_NOFS and GFP_NOIO allocations. Page table allocations performed during shadow population also ignore the external gfp_mask. To preserve the intended semantics of GFP_NOFS and GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate memalloc scope. xfs calls vmalloc with GFP_NOFS, so this bug could lead to deadlock. There was a report here https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com This patch: - Extends kasan_populate_vmalloc() and helpers to take gfp_mask; - Passes gfp_mask down to alloc_pages_bulk() and __get_free_page(); - Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore() around apply_to_page_range(); - Updates vmalloc.c and percpu allocator call sites accordingly.
In the Linux kernel, the following vulnerability has been resolved: net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work syzkaller was able to trigger a deadlock for NTF_MANAGED entries [0]: kworker/0:16/14617 is trying to acquire lock: ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652 [...] but task is already holding lock: ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: neigh_managed_work+0x35/0x250 net/core/neighbour.c:1572 The neighbor entry turned to NUD_FAILED state, where __neigh_event_send() triggered an immediate probe as per commit cd28ca0a3dd1 ("neigh: reduce arp latency") via neigh_probe() given table lock was held. One option to fix this situation is to defer the neigh_probe() back to the neigh_timer_handler() similarly as pre cd28ca0a3dd1. For the case of NTF_MANAGED, this deferral is acceptable given this only happens on actual failure state and regular / expected state is NUD_VALID with the entry already present. The fix adds a parameter to __neigh_event_send() in order to communicate whether immediate probe is allowed or disallowed. Existing call-sites of neigh_event_send() default as-is to immediate probe. However, the neigh_managed_work() disables it via use of neigh_event_send_probe(). [0] <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2956 [inline] check_deadlock kernel/locking/lockdep.c:2999 [inline] validate_chain kernel/locking/lockdep.c:3788 [inline] __lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5027 lock_acquire kernel/locking/lockdep.c:5639 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604 __raw_write_lock_bh include/linux/rwlock_api_smp.h:202 [inline] _raw_write_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:334 ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652 ip6_finish_output2+0x1070/0x14f0 net/ipv6/ip6_output.c:123 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:451 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0xa99/0x17f0 net/ipv6/ndisc.c:508 ndisc_send_ns+0x3a9/0x840 net/ipv6/ndisc.c:650 ndisc_solicit+0x2cd/0x4f0 net/ipv6/ndisc.c:742 neigh_probe+0xc2/0x110 net/core/neighbour.c:1040 __neigh_event_send+0x37d/0x1570 net/core/neighbour.c:1201 neigh_event_send include/net/neighbour.h:470 [inline] neigh_managed_work+0x162/0x250 net/core/neighbour.c:1574 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307 worker_thread+0x657/0x1110 kernel/workqueue.c:2454 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK>
In the Linux kernel, the following vulnerability has been resolved: IB/core: Fix a nested dead lock as part of ODP flow Fix a nested dead lock as part of ODP flow by using mmput_async(). From the below call trace [1] can see that calling mmput() once we have the umem_odp->umem_mutex locked as required by ib_umem_odp_map_dma_and_lock() might trigger in the same task the exit_mmap()->__mmu_notifier_release()->mlx5_ib_invalidate_range() which may dead lock when trying to lock the same mutex. Moving to use mmput_async() will solve the problem as the above exit_mmap() flow will be called in other task and will be executed once the lock will be available. [1] [64843.077665] task:kworker/u133:2 state:D stack: 0 pid:80906 ppid: 2 flags:0x00004000 [64843.077672] Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib] [64843.077719] Call Trace: [64843.077722] <TASK> [64843.077724] __schedule+0x23d/0x590 [64843.077729] schedule+0x4e/0xb0 [64843.077735] schedule_preempt_disabled+0xe/0x10 [64843.077740] __mutex_lock.constprop.0+0x263/0x490 [64843.077747] __mutex_lock_slowpath+0x13/0x20 [64843.077752] mutex_lock+0x34/0x40 [64843.077758] mlx5_ib_invalidate_range+0x48/0x270 [mlx5_ib] [64843.077808] __mmu_notifier_release+0x1a4/0x200 [64843.077816] exit_mmap+0x1bc/0x200 [64843.077822] ? walk_page_range+0x9c/0x120 [64843.077828] ? __cond_resched+0x1a/0x50 [64843.077833] ? mutex_lock+0x13/0x40 [64843.077839] ? uprobe_clear_state+0xac/0x120 [64843.077860] mmput+0x5f/0x140 [64843.077867] ib_umem_odp_map_dma_and_lock+0x21b/0x580 [ib_core] [64843.077931] pagefault_real_mr+0x9a/0x140 [mlx5_ib] [64843.077962] pagefault_mr+0xb4/0x550 [mlx5_ib] [64843.077992] pagefault_single_data_segment.constprop.0+0x2ac/0x560 [mlx5_ib] [64843.078022] mlx5_ib_eqe_pf_action+0x528/0x780 [mlx5_ib] [64843.078051] process_one_work+0x22b/0x3d0 [64843.078059] worker_thread+0x53/0x410 [64843.078065] ? process_one_work+0x3d0/0x3d0 [64843.078073] kthread+0x12a/0x150 [64843.078079] ? set_kthread_struct+0x50/0x50 [64843.078085] ret_from_fork+0x22/0x30 [64843.078093] </TASK>
In the Linux kernel, the following vulnerability has been resolved: drm/vc4: Fix deadlock on DSI device attach error DSI device attach to DSI host will be done with host device's lock held. Un-registering host in "device attach" error path (ex: probe retry) will result in deadlock with below call trace and non operational DSI display. Startup Call trace: [ 35.043036] rt_mutex_slowlock.constprop.21+0x184/0x1b8 [ 35.043048] mutex_lock_nested+0x7c/0xc8 [ 35.043060] device_del+0x4c/0x3e8 [ 35.043075] device_unregister+0x20/0x40 [ 35.043082] mipi_dsi_remove_device_fn+0x18/0x28 [ 35.043093] device_for_each_child+0x68/0xb0 [ 35.043105] mipi_dsi_host_unregister+0x40/0x90 [ 35.043115] vc4_dsi_host_attach+0xf0/0x120 [vc4] [ 35.043199] mipi_dsi_attach+0x30/0x48 [ 35.043209] tc358762_probe+0x128/0x164 [tc358762] [ 35.043225] mipi_dsi_drv_probe+0x28/0x38 [ 35.043234] really_probe+0xc0/0x318 [ 35.043244] __driver_probe_device+0x80/0xe8 [ 35.043254] driver_probe_device+0xb8/0x118 [ 35.043263] __device_attach_driver+0x98/0xe8 [ 35.043273] bus_for_each_drv+0x84/0xd8 [ 35.043281] __device_attach+0xf0/0x150 [ 35.043290] device_initial_probe+0x1c/0x28 [ 35.043300] bus_probe_device+0xa4/0xb0 [ 35.043308] deferred_probe_work_func+0xa0/0xe0 [ 35.043318] process_one_work+0x254/0x700 [ 35.043330] worker_thread+0x4c/0x448 [ 35.043339] kthread+0x19c/0x1a8 [ 35.043348] ret_from_fork+0x10/0x20 Shutdown Call trace: [ 365.565417] Call trace: [ 365.565423] __switch_to+0x148/0x200 [ 365.565452] __schedule+0x340/0x9c8 [ 365.565467] schedule+0x48/0x110 [ 365.565479] schedule_timeout+0x3b0/0x448 [ 365.565496] wait_for_completion+0xac/0x138 [ 365.565509] __flush_work+0x218/0x4e0 [ 365.565523] flush_work+0x1c/0x28 [ 365.565536] wait_for_device_probe+0x68/0x158 [ 365.565550] device_shutdown+0x24/0x348 [ 365.565561] kernel_restart_prepare+0x40/0x50 [ 365.565578] kernel_restart+0x20/0x70 [ 365.565591] __do_sys_reboot+0x10c/0x220 [ 365.565605] __arm64_sys_reboot+0x2c/0x38 [ 365.565619] invoke_syscall+0x4c/0x110 [ 365.565634] el0_svc_common.constprop.3+0xfc/0x120 [ 365.565648] do_el0_svc+0x2c/0x90 [ 365.565661] el0_svc+0x4c/0xf0 [ 365.565671] el0t_64_sync_handler+0x90/0xb8 [ 365.565682] el0t_64_sync+0x180/0x184
In the Linux kernel, the following vulnerability has been resolved: ALSA: timer: Don't take register_mutex with copy_from/to_user() The infamous mmap_lock taken in copy_from/to_user() can be often problematic when it's called inside another mutex, as they might lead to deadlocks. In the case of ALSA timer code, the bad pattern is with guard(mutex)(®ister_mutex) that covers copy_from/to_user() -- which was mistakenly introduced at converting to guard(), and it had been carefully worked around in the past. This patch fixes those pieces simply by moving copy_from/to_user() out of the register mutex lock again.
In the Linux kernel, the following vulnerability has been resolved: net: vlan: don't propagate flags on open With the device instance lock, there is now a possibility of a deadlock: [ 1.211455] ============================================ [ 1.211571] WARNING: possible recursive locking detected [ 1.211687] 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 Not tainted [ 1.211823] -------------------------------------------- [ 1.211936] ip/184 is trying to acquire lock: [ 1.212032] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_set_allmulti+0x4e/0xb0 [ 1.212207] [ 1.212207] but task is already holding lock: [ 1.212332] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0 [ 1.212487] [ 1.212487] other info that might help us debug this: [ 1.212626] Possible unsafe locking scenario: [ 1.212626] [ 1.212751] CPU0 [ 1.212815] ---- [ 1.212871] lock(&dev->lock); [ 1.212944] lock(&dev->lock); [ 1.213016] [ 1.213016] *** DEADLOCK *** [ 1.213016] [ 1.213143] May be due to missing lock nesting notation [ 1.213143] [ 1.213294] 3 locks held by ip/184: [ 1.213371] #0: ffffffff838b53e0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x1b/0xa0 [ 1.213543] #1: ffffffff84e5fc70 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x37/0xa0 [ 1.213727] #2: ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0 [ 1.213895] [ 1.213895] stack backtrace: [ 1.213991] CPU: 0 UID: 0 PID: 184 Comm: ip Not tainted 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 [ 1.213993] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 1.213994] Call Trace: [ 1.213995] <TASK> [ 1.213996] dump_stack_lvl+0x8e/0xd0 [ 1.214000] print_deadlock_bug+0x28b/0x2a0 [ 1.214020] lock_acquire+0xea/0x2a0 [ 1.214027] __mutex_lock+0xbf/0xd40 [ 1.214038] dev_set_allmulti+0x4e/0xb0 # real_dev->flags & IFF_ALLMULTI [ 1.214040] vlan_dev_open+0xa5/0x170 # ndo_open on vlandev [ 1.214042] __dev_open+0x145/0x270 [ 1.214046] __dev_change_flags+0xb0/0x1e0 [ 1.214051] netif_change_flags+0x22/0x60 # IFF_UP vlandev [ 1.214053] dev_change_flags+0x61/0xb0 # for each device in group from dev->vlan_info [ 1.214055] vlan_device_event+0x766/0x7c0 # on netdevsim0 [ 1.214058] notifier_call_chain+0x78/0x120 [ 1.214062] netif_open+0x6d/0x90 [ 1.214064] dev_open+0x5b/0xb0 # locks netdevsim0 [ 1.214066] bond_enslave+0x64c/0x1230 [ 1.214075] do_set_master+0x175/0x1e0 # on netdevsim0 [ 1.214077] do_setlink+0x516/0x13b0 [ 1.214094] rtnl_newlink+0xaba/0xb80 [ 1.214132] rtnetlink_rcv_msg+0x440/0x490 [ 1.214144] netlink_rcv_skb+0xeb/0x120 [ 1.214150] netlink_unicast+0x1f9/0x320 [ 1.214153] netlink_sendmsg+0x346/0x3f0 [ 1.214157] __sock_sendmsg+0x86/0xb0 [ 1.214160] ____sys_sendmsg+0x1c8/0x220 [ 1.214164] ___sys_sendmsg+0x28f/0x2d0 [ 1.214179] __x64_sys_sendmsg+0xef/0x140 [ 1.214184] do_syscall_64+0xec/0x1d0 [ 1.214190] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 1.214191] RIP: 0033:0x7f2d1b4a7e56 Device setup: netdevsim0 (down) ^ ^ bond netdevsim1.100@netdevsim1 allmulticast=on (down) When we enslave the lower device (netdevsim0) which has a vlan, we propagate vlan's allmuti/promisc flags during ndo_open. This causes (re)locking on of the real_dev. Propagate allmulti/promisc on flags change, not on the open. There is a slight semantics change that vlans that are down now propagate the flags, but this seems unlikely to result in the real issues. Reproducer: echo 0 1 > /sys/bus/netdevsim/new_device dev_path=$(ls -d /sys/bus/netdevsim/devices/netdevsim0/net/*) dev=$(echo $dev_path | rev | cut -d/ -f1 | rev) ip link set dev $dev name netdevsim0 ip link set dev netdevsim0 up ip link add link netdevsim0 name netdevsim0.100 type vlan id 100 ip link set dev netdevsim0.100 allm ---truncated---
In the Linux kernel, the following vulnerability has been resolved: net: switchdev: Convert blocking notification chain to a raw one A blocking notification chain uses a read-write semaphore to protect the integrity of the chain. The semaphore is acquired for writing when adding / removing notifiers to / from the chain and acquired for reading when traversing the chain and informing notifiers about an event. In case of the blocking switchdev notification chain, recursive notifications are possible which leads to the semaphore being acquired twice for reading and to lockdep warnings being generated [1]. Specifically, this can happen when the bridge driver processes a SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications about deferred events when calling switchdev_deferred_process(). Fix this by converting the notification chain to a raw notification chain in a similar fashion to the netdev notification chain. Protect the chain using the RTNL mutex by acquiring it when modifying the chain. Events are always informed under the RTNL mutex, but add an assertion in call_switchdev_blocking_notifiers() to make sure this is not violated in the future. Maintain the "blocking" prefix as events are always emitted from process context and listeners are allowed to block. [1]: WARNING: possible recursive locking detected 6.14.0-rc4-custom-g079270089484 #1 Not tainted -------------------------------------------- ip/52731 is trying to acquire lock: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0 but task is already holding lock: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((switchdev_blocking_notif_chain).rwsem); lock((switchdev_blocking_notif_chain).rwsem); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by ip/52731: #0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0 #1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0 #2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0 stack backtrace: ... ? __pfx_down_read+0x10/0x10 ? __pfx_mark_lock+0x10/0x10 ? __pfx_switchdev_port_attr_set_deferred+0x10/0x10 blocking_notifier_call_chain+0x58/0xa0 switchdev_port_attr_notify.constprop.0+0xb3/0x1b0 ? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10 ? mark_held_locks+0x94/0xe0 ? switchdev_deferred_process+0x11a/0x340 switchdev_port_attr_set_deferred+0x27/0xd0 switchdev_deferred_process+0x164/0x340 br_switchdev_port_unoffload+0xc8/0x100 [bridge] br_switchdev_blocking_event+0x29f/0x580 [bridge] notifier_call_chain+0xa2/0x440 blocking_notifier_call_chain+0x6e/0xa0 switchdev_bridge_port_unoffload+0xde/0x1a0 ...
In the Linux kernel, the following vulnerability has been resolved: sched_ext: Fix pick_task_scx() picking non-queued tasks when it's called without balance() a6250aa251ea ("sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx()") added a workaround to handle the cases where pick_task_scx() is called without prececing balance_scx() which is due to a fair class bug where pick_taks_fair() may return NULL after a true return from balance_fair(). The workaround detects when pick_task_scx() is called without preceding balance_scx() and emulates SCX_RQ_BAL_KEEP and triggers kicking to avoid stalling. Unfortunately, the workaround code was testing whether @prev was on SCX to decide whether to keep the task running. This is incorrect as the task may be on SCX but no longer runnable. This could lead to a non-runnable task to be returned from pick_task_scx() which cause interesting confusions and failures. e.g. A common failure mode is the task ending up with (!on_rq && on_cpu) state which can cause potential wakers to busy loop, which can easily lead to deadlocks. Fix it by testing whether @prev has SCX_TASK_QUEUED set. This makes @prev_on_scx only used in one place. Open code the usage and improve the comment while at it.
In the Linux kernel, the following vulnerability has been resolved: clocksource: Use migrate_disable() to avoid calling get_random_u32() in atomic context The following bug report happened with a PREEMPT_RT kernel: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 get_random_u32+0x4f/0x110 clocksource_verify_choose_cpus+0xab/0x1a0 clocksource_verify_percpu.part.0+0x6b/0x330 clocksource_watchdog_kthread+0x193/0x1a0 It is due to the fact that clocksource_verify_choose_cpus() is invoked with preemption disabled. This function invokes get_random_u32() to obtain random numbers for choosing CPUs. The batched_entropy_32 local lock and/or the base_crng.lock spinlock in driver/char/random.c will be acquired during the call. In PREEMPT_RT kernel, they are both sleeping locks and so cannot be acquired in atomic context. Fix this problem by using migrate_disable() to allow smp_processor_id() to be reliably used without introducing atomic context. preempt_disable() is then called after clocksource_verify_choose_cpus() but before the clocksource measurement is being run to avoid introducing unexpected latency.
In the Linux kernel, the following vulnerability has been resolved: drm/i915/gt: Use spin_lock_irqsave() in interruptible context spin_lock/unlock() functions used in interrupt contexts could result in a deadlock, as seen in GitLab issue #13399, which occurs when interrupt comes in while holding a lock. Try to remedy the problem by saving irq state before spin lock acquisition. v2: add irqs' state save/restore calls to all locks/unlocks in signal_irq_work() execution (Maciej) v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead of other lock/unlock calls and add Fixes and Cc tags (Tvrtko); change title and commit message (cherry picked from commit c088387ddd6482b40f21ccf23db1125e8fa4af7e)
In the Linux kernel, the following vulnerability has been resolved: hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined) add page poison checks in do_migrate_range in order to make offline hwpoisoned page possible by introducing isolate_lru_page and try_to_unmap for hwpoisoned page. However folio lock must be held before calling try_to_unmap. Add it to fix this problem. Warning will be produced if folio is not locked during unmap: ------------[ cut here ]------------ kernel BUG at ./include/linux/swapops.h:400! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41 Tainted: [W]=WARN Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : try_to_unmap_one+0xb08/0xd3c lr : try_to_unmap_one+0x3dc/0xd3c Call trace: try_to_unmap_one+0xb08/0xd3c (P) try_to_unmap_one+0x3dc/0xd3c (L) rmap_walk_anon+0xdc/0x1f8 rmap_walk+0x3c/0x58 try_to_unmap+0x88/0x90 unmap_poisoned_folio+0x30/0xa8 do_migrate_range+0x4a0/0x568 offline_pages+0x5a4/0x670 memory_block_action+0x17c/0x374 memory_subsys_offline+0x3c/0x78 device_offline+0xa4/0xd0 state_store+0x8c/0xf0 dev_attr_store+0x18/0x2c sysfs_kf_write+0x44/0x54 kernfs_fop_write_iter+0x118/0x1a8 vfs_write+0x3a8/0x4bc ksys_write+0x6c/0xf8 __arm64_sys_write+0x1c/0x28 invoke_syscall+0x44/0x100 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x30/0xd0 el0t_64_sync_handler+0xc8/0xcc el0t_64_sync+0x198/0x19c Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000) ---[ end trace 0000000000000000 ]---
In the Linux kernel, the following vulnerability has been resolved: drm/imagination: avoid deadlock on fence release Do scheduler queue fence release processing on a workqueue, rather than in the release function itself. Fixes deadlock issues such as the following: [ 607.400437] ============================================ [ 607.405755] WARNING: possible recursive locking detected [ 607.415500] -------------------------------------------- [ 607.420817] weston:zfq0/24149 is trying to acquire lock: [ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr] [ 607.436728] but task is already holding lock: [ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554 [ 607.451727] other info that might help us debug this: [ 607.458245] Possible unsafe locking scenario: [ 607.464155] CPU0 [ 607.466601] ---- [ 607.469044] lock(reservation_ww_class_mutex); [ 607.473584] lock(reservation_ww_class_mutex); [ 607.478114] *** DEADLOCK ***
In the Linux kernel, the following vulnerability has been resolved: bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock There are multiple places from where the recovery work gets scheduled asynchronously. Also, there are multiple places where the caller waits synchronously for the recovery to be completed. One such place is during the PM shutdown() callback. If the device is not alive during recovery_work, it will try to reset the device using pci_reset_function(). This function internally will take the device_lock() first before resetting the device. By this time, if the lock has already been acquired, then recovery_work will get stalled while waiting for the lock. And if the lock was already acquired by the caller which waits for the recovery_work to be completed, it will lead to deadlock. This is what happened on the X1E80100 CRD device when the device died before shutdown() callback. Driver core calls the driver's shutdown() callback while holding the device_lock() leading to deadlock. And this deadlock scenario can occur on other paths as well, like during the PM suspend() callback, where the driver core would hold the device_lock() before calling driver's suspend() callback. And if the recovery_work was already started, it could lead to deadlock. This is also observed on the X1E80100 CRD. So to fix both issues, use pci_try_reset_function() in recovery_work. This function first checks for the availability of the device_lock() before trying to reset the device. If the lock is available, it will acquire it and reset the device. Otherwise, it will return -EAGAIN. If that happens, recovery_work will fail with the error message "Recovery failed" as not much could be done.
In the Linux kernel, the following vulnerability has been resolved: rxrpc, afs: Fix peer hash locking vs RCU callback In its address list, afs now retains pointers to and refs on one or more rxrpc_peer objects. The address list is freed under RCU and at this time, it puts the refs on those peers. Now, when an rxrpc_peer object runs out of refs, it gets removed from the peer hash table and, for that, rxrpc has to take a spinlock. However, it is now being called from afs's RCU cleanup, which takes place in BH context - but it is just taking an ordinary spinlock. The put may also be called from non-BH context, and so there exists the possibility of deadlock if the BH-based RCU cleanup happens whilst the hash spinlock is held. This led to the attached lockdep complaint. Fix this by changing spinlocks of rxnet->peer_hash_lock back to BH-disabling locks. ================================ WARNING: inconsistent lock state 6.13.0-rc5-build2+ #1223 Tainted: G E -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ffff88810babe228 (&rxnet->peer_hash_lock){+.?.}-{3:3}, at: rxrpc_put_peer+0xcb/0x180 {SOFTIRQ-ON-W} state was registered at: mark_usage+0x164/0x180 __lock_acquire+0x544/0x990 lock_acquire.part.0+0x103/0x280 _raw_spin_lock+0x2f/0x40 rxrpc_peer_keepalive_worker+0x144/0x440 process_one_work+0x486/0x7c0 process_scheduled_works+0x73/0x90 worker_thread+0x1c8/0x2a0 kthread+0x19b/0x1b0 ret_from_fork+0x24/0x40 ret_from_fork_asm+0x1a/0x30 irq event stamp: 972402 hardirqs last enabled at (972402): [<ffffffff8244360e>] _raw_spin_unlock_irqrestore+0x2e/0x50 hardirqs last disabled at (972401): [<ffffffff82443328>] _raw_spin_lock_irqsave+0x18/0x60 softirqs last enabled at (972300): [<ffffffff810ffbbe>] handle_softirqs+0x3ee/0x430 softirqs last disabled at (972313): [<ffffffff810ffc54>] __irq_exit_rcu+0x44/0x110 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&rxnet->peer_hash_lock); <Interrupt> lock(&rxnet->peer_hash_lock); *** DEADLOCK *** 1 lock held by swapper/1/0: #0: ffffffff83576be0 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire+0x7/0x30 stack backtrace: CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G E 6.13.0-rc5-build2+ #1223 Tainted: [E]=UNSIGNED_MODULE Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <IRQ> dump_stack_lvl+0x57/0x80 print_usage_bug.part.0+0x227/0x240 valid_state+0x53/0x70 mark_lock_irq+0xa5/0x2f0 mark_lock+0xf7/0x170 mark_usage+0xe1/0x180 __lock_acquire+0x544/0x990 lock_acquire.part.0+0x103/0x280 _raw_spin_lock+0x2f/0x40 rxrpc_put_peer+0xcb/0x180 afs_free_addrlist+0x46/0x90 [kafs] rcu_do_batch+0x2d2/0x640 rcu_core+0x2f7/0x350 handle_softirqs+0x1ee/0x430 __irq_exit_rcu+0x44/0x110 irq_exit_rcu+0xa/0x30 sysvec_apic_timer_interrupt+0x7f/0xa0 </IRQ>
In the Linux kernel, the following vulnerability has been resolved: USB: dummy-hcd: Fix locking/synchronization error Syzbot testing was able to provoke an addressing exception and crash in the usb_gadget_udc_reset() routine in drivers/usb/gadgets/udc/core.c, resulting from the fact that the routine was called with a second ("driver") argument of NULL. The bad caller was set_link_state() in dummy_hcd.c, and the problem arose because of a race between a USB reset and driver unbind. These sorts of races were not supposed to be possible; commit 7dbd8f4cabd9 ("USB: dummy-hcd: Fix erroneous synchronization change"), along with a few followup commits, was written specifically to prevent them. As it turns out, there are (at least) two errors remaining in the code. Another patch will address the second error; this one is concerned with the first. The error responsible for the syzbot crash occurred because the stop_activity() routine will sometimes drop and then re-acquire the dum->lock spinlock. A call to stop_activity() occurs in set_link_state() when handling an emulated USB reset, after the test of dum->ints_enabled and before the increment of dum->callback_usage. This allowed another thread (doing a driver unbind) to sneak in and grab the spinlock, and then clear dum->ints_enabled and dum->driver. Normally this other thread would have to wait for dum->callback_usage to go down to 0 before it would clear dum->driver, but in this case it didn't have to wait since dum->callback_usage had not yet been incremented. The fix is to increment dum->callback_usage _before_ calling stop_activity() instead of after. Then the thread doing the unbind will not clear dum->driver until after the call to usb_gadget_udc_reset() safely returns and dum->callback_usage has been decremented again.
In the Linux kernel, the following vulnerability has been resolved: dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue drain_workqueue() cannot be called safely in a spinlocked context due to possible task rescheduling. In the multi-task scenario, calling queue_work() while drain_workqueue() will lead to a Call Trace as pushing a work on a draining workqueue is not permitted in spinlocked context. Call Trace: <TASK> ? __warn+0x7d/0x140 ? __queue_work+0x2b2/0x440 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __queue_work+0x2b2/0x440 queue_work_on+0x28/0x30 idxd_misc_thread+0x303/0x5a0 [idxd] ? __schedule+0x369/0xb40 ? __pfx_irq_thread_fn+0x10/0x10 ? irq_thread+0xbc/0x1b0 irq_thread_fn+0x21/0x70 irq_thread+0x102/0x1b0 ? preempt_count_add+0x74/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0x103/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> The current implementation uses a spinlock to protect event log workqueue and will lead to the Call Trace due to potential task rescheduling. To address the locking issue, convert the spinlock to mutex, allowing the drain_workqueue() to be called in a safe mutex-locked context. This change ensures proper synchronization when accessing the event log workqueue, preventing potential Call Trace and improving the overall robustness of the code.
In the Linux kernel, the following vulnerability has been resolved: PCI/ASPM: Fix deadlock when enabling ASPM A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice.
In the Linux kernel, the following vulnerability has been resolved: hfsplus: remove mutex_lock check in hfsplus_free_extents Syzbot reported an issue in hfsplus filesystem: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 4400 at fs/hfsplus/extents.c:346 hfsplus_free_extents+0x700/0xad0 Call Trace: <TASK> hfsplus_file_truncate+0x768/0xbb0 fs/hfsplus/extents.c:606 hfsplus_write_begin+0xc2/0xd0 fs/hfsplus/inode.c:56 cont_expand_zero fs/buffer.c:2383 [inline] cont_write_begin+0x2cf/0x860 fs/buffer.c:2446 hfsplus_write_begin+0x86/0xd0 fs/hfsplus/inode.c:52 generic_cont_expand_simple+0x151/0x250 fs/buffer.c:2347 hfsplus_setattr+0x168/0x280 fs/hfsplus/inode.c:263 notify_change+0xe38/0x10f0 fs/attr.c:420 do_truncate+0x1fb/0x2e0 fs/open.c:65 do_sys_ftruncate+0x2eb/0x380 fs/open.c:193 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd To avoid deadlock, Commit 31651c607151 ("hfsplus: avoid deadlock on file truncation") unlock extree before hfsplus_free_extents(), and add check wheather extree is locked in hfsplus_free_extents(). However, when operations such as hfsplus_file_release, hfsplus_setattr, hfsplus_unlink, and hfsplus_get_block are executed concurrently in different files, it is very likely to trigger the WARN_ON, which will lead syzbot and xfstest to consider it as an abnormality. The comment above this warning also describes one of the easy triggering situations, which can easily trigger and cause xfstest&syzbot to report errors. [task A] [task B] ->hfsplus_file_release ->hfsplus_file_truncate ->hfs_find_init ->mutex_lock ->mutex_unlock ->hfsplus_write_begin ->hfsplus_get_block ->hfsplus_file_extend ->hfsplus_ext_read_extent ->hfs_find_init ->mutex_lock ->hfsplus_free_extents WARN_ON(mutex_is_locked) !!! Several threads could try to lock the shared extents tree. And warning can be triggered in one thread when another thread has locked the tree. This is the wrong behavior of the code and we need to remove the warning.
In the Linux kernel, the following vulnerability has been resolved: powerpc/imc-pmu: Fix use of mutex in IRQs disabled section Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event. Command to trigger the warning: # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5 Performance counter stats for 'sleep 5': 0 thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ 5.002117947 seconds time elapsed 0.000131000 seconds user 0.001063000 seconds sys Below is snippet of the warning in dmesg: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec preempt_count: 2, expected: 0 4 locks held by perf-exec/2869: #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90 #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0 #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510 #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510 irq event stamp: 4806 hardirqs last enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0 hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510 softirqs last enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61 Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV Call Trace: dump_stack_lvl+0x98/0xe0 (unreliable) __might_resched+0x2f8/0x310 __mutex_lock+0x6c/0x13f0 thread_imc_event_add+0xf4/0x1b0 event_sched_in+0xe0/0x210 merge_sched_in+0x1f0/0x600 visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0 ctx_flexible_sched_in+0xcc/0x140 ctx_sched_in+0x20c/0x2a0 ctx_resched+0x104/0x1c0 perf_event_exec+0x340/0x510 begin_new_exec+0x730/0xef0 load_elf_binary+0x3f8/0x1e10 ... do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0 WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0 CPU: 36 PID: 2869 Comm: sleep Tainted: G W 6.2.0-rc2-00011-g1247637727f2 #61 Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV NIP: c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670 REGS: c00000004d2134e0 TRAP: 0700 Tainted: G W (6.2.0-rc2-00011-g1247637727f2) MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 48002824 XER: 00000000 CFAR: c00000000013fb64 IRQMASK: 1 The above warning triggered because the current imc-pmu code uses mutex lock in interrupt disabled sections. The function mutex_lock() internally calls __might_resched(), which will check if IRQs are disabled and in case IRQs are disabled, it will trigger the warning. Fix the issue by changing the mutex lock to spinlock. [mpe: Fix comments, trim oops in change log, add reported-by tags]
In the Linux kernel before 4.16.4, a double-locking error in drivers/usb/dwc3/gadget.c may potentially cause a deadlock with f_hid.
A denial of service vulnerability due to a deadlock was found in sctp_auto_asconf_init in net/sctp/socket.c in the Linux kernel’s SCTP subsystem. This flaw allows guests with local user privileges to trigger a deadlock and potentially crash the system.
In the Linux kernel, the following vulnerability has been resolved: usb: dwc2: gadget: Fix spin_lock/unlock mismatch in dwc2_hsotg_udc_stop() dwc2_gadget_exit_clock_gating() internally calls call_gadget() macro, which expects hsotg->lock to be held since it does spin_unlock/spin_lock around the gadget driver callback invocation. However, dwc2_hsotg_udc_stop() calls dwc2_gadget_exit_clock_gating() without holding the lock. This leads to: - spin_unlock on a lock that is not held (undefined behavior) - The lock remaining held after dwc2_gadget_exit_clock_gating() returns, causing a deadlock when spin_lock_irqsave() is called later in the same function. Fix this by acquiring hsotg->lock before calling dwc2_gadget_exit_clock_gating() and releasing it afterwards, which satisfies the locking requirement of the call_gadget() macro.
A denial of service vulnerability was found in tipc_crypto_key_revoke in net/tipc/crypto.c in the Linux kernel’s TIPC subsystem. This flaw allows guests with local user privileges to trigger a deadlock and potentially crash the system.
In the Linux kernel, the following vulnerability has been resolved: RDMA/irdma: Fix deadlock during netdev reset with active connections Resolve deadlock that occurs when user executes netdev reset while RDMA applications (e.g., rping) are active. The netdev reset causes ice driver to remove irdma auxiliary driver, triggering device_delete and subsequent client removal. During client removal, uverbs_client waits for QP reference count to reach zero while cma_client holds the final reference, creating circular dependency and indefinite wait in iWARP mode. Skip QP reference count wait during device reset to prevent deadlock.
In the Linux kernel, the following vulnerability has been resolved: gpio: omap: do not register driver in probe() Commit 11a78b794496 ("ARM: OMAP: MPUIO wake updates") registers the omap_mpuio_driver from omap_mpuio_init(), which is called from omap_gpio_probe(). However, it neither makes sense to register drivers from probe() callbacks of other drivers, nor does the driver core allow registering drivers with a device lock already being held. The latter was revealed by commit dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") leading to a potential deadlock condition described in [1]. Additionally, the omap_mpuio_driver is never unregistered from the driver core, even if the module is unloaded. Hence, register the omap_mpuio_driver from the module initcall and unregister it in module_exit().
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix exception exit lock checking for subprogs process_bpf_exit_full() passes check_lock = !curframe to check_resource_leak(), which is false in cases when bpf_throw() is called from a static subprog. This makes check_resource_leak() to skip validation of active_rcu_locks, active_preempt_locks, and active_irq_id on exception exits from subprogs. At runtime bpf_throw() unwinds the stack via ORC without releasing any user-acquired locks, which may cause various issues as the result. Fix by setting check_lock = true for exception exits regardless of curframe, since exceptions bypass all intermediate frame cleanup. Update the error message prefix to "bpf_throw" for exception exits to distinguish them from normal BPF_EXIT. Fix reject_subprog_with_rcu_read_lock test which was previously passing for the wrong reason. Test program returned directly from the subprog call without closing the RCU section, so the error was triggered by the unclosed RCU lock on normal exit, not by bpf_throw. Update __msg annotations for affected tests to match the new "bpf_throw" error prefix. The spin_lock case is not affected because they are already checked [1] at the call site in do_check_insn() before bpf_throw can run. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v7.0-rc4#n21098
In the Linux kernel, the following vulnerability has been resolved: spi: use generic driver_override infrastructure When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Also note that we do not enable the driver_override feature of struct bus_type, as SPI - in contrast to most other buses - passes "" to sysfs_emit() when the driver_override pointer is NULL. Thus, printing "\n" instead of "(null)\n".
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del() l2cap_conn_del() calls cancel_delayed_work_sync() for both info_timer and id_addr_timer while holding conn->lock. However, the work functions l2cap_info_timeout() and l2cap_conn_update_id_addr() both acquire conn->lock, creating a potential AB-BA deadlock if the work is already executing when l2cap_conn_del() takes the lock. Move the work cancellations before acquiring conn->lock and use disable_delayed_work_sync() to additionally prevent the works from being rearmed after cancellation, consistent with the pattern used in hci_conn_del().
In the Linux kernel, the following vulnerability has been resolved: tracing: Fix potential deadlock in cpu hotplug with osnoise The following sequence may leads deadlock in cpu hotplug: task1 task2 task3 ----- ----- ----- mutex_lock(&interface_lock) [CPU GOING OFFLINE] cpus_write_lock(); osnoise_cpu_die(); kthread_stop(task3); wait_for_completion(); osnoise_sleep(); mutex_lock(&interface_lock); cpus_read_lock(); [DEAD LOCK] Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).
In the Linux kernel, the following vulnerability has been resolved: nfc: nci: fix circular locking dependency in nci_close_device nci_close_device() flushes rx_wq and tx_wq while holding req_lock. This causes a circular locking dependency because nci_rx_work() running on rx_wq can end up taking req_lock too: nci_rx_work -> nci_rx_data_packet -> nci_data_exchange_complete -> __sk_destruct -> rawsock_destruct -> nfc_deactivate_target -> nci_deactivate_target -> nci_request -> mutex_lock(&ndev->req_lock) Move the flush of rx_wq after req_lock has been released. This should safe (I think) because NCI_UP has already been cleared and the transport is closed, so the work will see it and return -ENETDOWN. NIPA has been hitting this running the nci selftest with a debug kernel on roughly 4% of the runs.
In the Linux kernel, the following vulnerability has been resolved: mac80211: fix deadlock in AP/VLAN handling Syzbot reports that when you have AP_VLAN interfaces that are up and close the AP interface they belong to, we get a deadlock. No surprise - since we dev_close() them with the wiphy mutex held, which goes back into the netdev notifier in cfg80211 and tries to acquire the wiphy mutex there. To fix this, we need to do two things: 1) prevent changing iftype while AP_VLANs are up, we can't easily fix this case since cfg80211 already calls us with the wiphy mutex held, but change_interface() is relatively rare in drivers anyway, so changing iftype isn't used much (and userspace has to fall back to down/change/up anyway) 2) pull the dev_close() loop over VLANs out of the wiphy mutex section in the normal stop case
In the Linux kernel, the following vulnerability has been resolved: usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler Patch fixes the following critical issue caused by deadlock which has been detected during testing NCM class: smp: csd: Detected non-responsive CSD lock (#1) on CPU#0 smp: csd: CSD lock (#1) unresponsive. .... RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0 RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002 RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658 RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658 R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0 Call Trace: <IRQ> do_raw_spin_lock+0xc0/0xd0 _raw_spin_lock_irqsave+0x95/0xa0 cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci] usb_ep_queue+0x35/0x110 eth_start_xmit+0x220/0x3d0 [u_ether] ncm_tx_timeout+0x34/0x40 [usb_f_ncm] ? ncm_free_inst+0x50/0x50 [usb_f_ncm] __hrtimer_run_queues+0xac/0x440 hrtimer_run_softirq+0x8c/0xb0 __do_softirq+0xcf/0x428 asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x61/0x70 irq_exit_rcu+0xc1/0xd0 sysvec_apic_timer_interrupt+0x52/0xb0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0010:do_raw_spin_trylock+0x18/0x40 RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658 RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658 R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000 _raw_spin_lock+0x3d/0x70 ? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci] cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci] ? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci] ? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci] ? irq_thread+0xa0/0x1c0 irq_thread_fn+0x28/0x60 irq_thread+0x105/0x1c0 ? __kthread_parkme+0x42/0x90 ? irq_forced_thread_fn+0x90/0x90 ? wake_threads_waitq+0x30/0x30 ? irq_thread_check_affinity+0xe0/0xe0 kthread+0x12a/0x160 ? kthread_park+0x90/0x90 ret_from_fork+0x22/0x30 The root cause of issue is spin_lock/spin_unlock instruction instead spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler function.
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen.
In the Linux kernel, the following vulnerability has been resolved: tipc: wait and exit until all work queues are done On some host, a crash could be triggered simply by repeating these commands several times: # modprobe tipc # tipc bearer enable media udp name UDP1 localip 127.0.0.1 # rmmod tipc [] BUG: unable to handle kernel paging request at ffffffffc096bb00 [] Workqueue: events 0xffffffffc096bb00 [] Call Trace: [] ? process_one_work+0x1a7/0x360 [] ? worker_thread+0x30/0x390 [] ? create_worker+0x1a0/0x1a0 [] ? kthread+0x116/0x130 [] ? kthread_flush_work_fn+0x10/0x10 [] ? ret_from_fork+0x35/0x40 When removing the TIPC module, the UDP tunnel sock will be delayed to release in a work queue as sock_release() can't be done in rtnl_lock(). If the work queue is schedule to run after the TIPC module is removed, kernel will crash as the work queue function cleanup_beareri() code no longer exists when trying to invoke it. To fix it, this patch introduce a member wq_count in tipc_net to track the numbers of work queues in schedule, and wait and exit until all work queues are done in tipc_exit_net().
In the Linux kernel, the following vulnerability has been resolved: cifs: Fix soft lockup during fsstress Below traces are observed during fsstress and system got hung. [ 130.698396] watchdog: BUG: soft lockup - CPU#6 stuck for 26s!
In the Linux kernel, the following vulnerability has been resolved: scsi: ufs: Fix a deadlock in the error handler The following deadlock has been observed on a test setup: - All tags allocated - The SCSI error handler calls ufshcd_eh_host_reset_handler() - ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler() - ufshcd_err_handler() locks up as follows: Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: __switch_to+0x298/0x5d8 __schedule+0x6cc/0xa94 schedule+0x12c/0x298 blk_mq_get_tag+0x210/0x480 __blk_mq_alloc_request+0x1c8/0x284 blk_get_request+0x74/0x134 ufshcd_exec_dev_cmd+0x68/0x640 ufshcd_verify_dev_init+0x68/0x35c ufshcd_probe_hba+0x12c/0x1cb8 ufshcd_host_reset_and_restore+0x88/0x254 ufshcd_reset_and_restore+0xd0/0x354 ufshcd_err_handler+0x408/0xc58 process_one_work+0x24c/0x66c worker_thread+0x3e8/0xa4c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request.
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: avoid deadlock between hci_dev->lock and socket lock Commit eab2404ba798 ("Bluetooth: Add BT_PHY socket option") added a dependency between socket lock and hci_dev->lock that could lead to deadlock. It turns out that hci_conn_get_phy() is not in any way relying on hdev being immutable during the runtime of this function, neither does it even look at any of the members of hdev, and as such there is no need to hold that lock. This fixes the lockdep splat below: ====================================================== WARNING: possible circular locking dependency detected 5.12.0-rc1-00026-g73d464503354 #10 Not tainted ------------------------------------------------------ bluetoothd/1118 is trying to acquire lock: ffff8f078383c078 (&hdev->lock){+.+.}-{3:3}, at: hci_conn_get_phy+0x1c/0x150 [bluetooth] but task is already holding lock: ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}: lock_sock_nested+0x72/0xa0 l2cap_sock_ready_cb+0x18/0x70 [bluetooth] l2cap_config_rsp+0x27a/0x520 [bluetooth] l2cap_sig_channel+0x658/0x1330 [bluetooth] l2cap_recv_frame+0x1ba/0x310 [bluetooth] hci_rx_work+0x1cc/0x640 [bluetooth] process_one_work+0x244/0x5f0 worker_thread+0x3c/0x380 kthread+0x13e/0x160 ret_from_fork+0x22/0x30 -> #2 (&chan->lock#2/1){+.+.}-{3:3}: __mutex_lock+0xa3/0xa10 l2cap_chan_connect+0x33a/0x940 [bluetooth] l2cap_sock_connect+0x141/0x2a0 [bluetooth] __sys_connect+0x9b/0xc0 __x64_sys_connect+0x16/0x20 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #1 (&conn->chan_lock){+.+.}-{3:3}: __mutex_lock+0xa3/0xa10 l2cap_chan_connect+0x322/0x940 [bluetooth] l2cap_sock_connect+0x141/0x2a0 [bluetooth] __sys_connect+0x9b/0xc0 __x64_sys_connect+0x16/0x20 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&hdev->lock){+.+.}-{3:3}: __lock_acquire+0x147a/0x1a50 lock_acquire+0x277/0x3d0 __mutex_lock+0xa3/0xa10 hci_conn_get_phy+0x1c/0x150 [bluetooth] l2cap_sock_getsockopt+0x5a9/0x610 [bluetooth] __sys_getsockopt+0xcc/0x200 __x64_sys_getsockopt+0x20/0x30 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &hdev->lock --> &chan->lock#2/1 --> sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP); lock(&chan->lock#2/1); lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP); lock(&hdev->lock); *** DEADLOCK *** 1 lock held by bluetoothd/1118: #0: ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610 [bluetooth] stack backtrace: CPU: 3 PID: 1118 Comm: bluetoothd Not tainted 5.12.0-rc1-00026-g73d464503354 #10 Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017 Call Trace: dump_stack+0x7f/0xa1 check_noncircular+0x105/0x120 ? __lock_acquire+0x147a/0x1a50 __lock_acquire+0x147a/0x1a50 lock_acquire+0x277/0x3d0 ? hci_conn_get_phy+0x1c/0x150 [bluetooth] ? __lock_acquire+0x2e1/0x1a50 ? lock_is_held_type+0xb4/0x120 ? hci_conn_get_phy+0x1c/0x150 [bluetooth] __mutex_lock+0xa3/0xa10 ? hci_conn_get_phy+0x1c/0x150 [bluetooth] ? lock_acquire+0x277/0x3d0 ? mark_held_locks+0x49/0x70 ? mark_held_locks+0x49/0x70 ? hci_conn_get_phy+0x1c/0x150 [bluetooth] hci_conn_get_phy+0x ---truncated---
In the Linux kernel, the following vulnerability has been resolved: iio: adis16475: fix deadlock on frequency set With commit 39c024b51b560 ("iio: adis16475: improve sync scale mode handling"), two deadlocks were introduced: 1) The call to 'adis_write_reg_16()' was not changed to it's unlocked version. 2) The lock was not being released on the success path of the function. This change fixes both these issues.
In the Linux kernel, the following vulnerability has been resolved: mac80211: fix locking in ieee80211_start_ap error path We need to hold the local->mtx to release the channel context, as even encoded by the lockdep_assert_held() there. Fix it.
In the Linux kernel, the following vulnerability has been resolved: powerpc/set_memory: Avoid spinlock recursion in change_page_attr() Commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") included a spin_lock() to change_page_attr() in order to safely perform the three step operations. But then commit 9f7853d7609d ("powerpc/mm: Fix set_memory_*() against concurrent accesses") modify it to use pte_update() and do the operation safely against concurrent access. In the meantime, Maxime reported some spinlock recursion. [ 15.351649] BUG: spinlock recursion on CPU#0, kworker/0:2/217 [ 15.357540] lock: init_mm+0x3c/0x420, .magic: dead4ead, .owner: kworker/0:2/217, .owner_cpu: 0 [ 15.366563] CPU: 0 PID: 217 Comm: kworker/0:2 Not tainted 5.15.0+ #523 [ 15.373350] Workqueue: events do_free_init [ 15.377615] Call Trace: [ 15.380232] [e4105ac0] [800946a4] do_raw_spin_lock+0xf8/0x120 (unreliable) [ 15.387340] [e4105ae0] [8001f4ec] change_page_attr+0x40/0x1d4 [ 15.393413] [e4105b10] [801424e0] __apply_to_page_range+0x164/0x310 [ 15.400009] [e4105b60] [80169620] free_pcp_prepare+0x1e4/0x4a0 [ 15.406045] [e4105ba0] [8016c5a0] free_unref_page+0x40/0x2b8 [ 15.411979] [e4105be0] [8018724c] kasan_depopulate_vmalloc_pte+0x6c/0x94 [ 15.418989] [e4105c00] [801424e0] __apply_to_page_range+0x164/0x310 [ 15.425451] [e4105c50] [80187834] kasan_release_vmalloc+0xbc/0x134 [ 15.431898] [e4105c70] [8015f7a8] __purge_vmap_area_lazy+0x4e4/0xdd8 [ 15.438560] [e4105d30] [80160d10] _vm_unmap_aliases.part.0+0x17c/0x24c [ 15.445283] [e4105d60] [801642d0] __vunmap+0x2f0/0x5c8 [ 15.450684] [e4105db0] [800e32d0] do_free_init+0x68/0x94 [ 15.456181] [e4105dd0] [8005d094] process_one_work+0x4bc/0x7b8 [ 15.462283] [e4105e90] [8005d614] worker_thread+0x284/0x6e8 [ 15.468227] [e4105f00] [8006aaec] kthread+0x1f0/0x210 [ 15.473489] [e4105f40] [80017148] ret_from_kernel_thread+0x14/0x1c Remove the read / modify / write sequence to make the operation atomic and remove the spin_lock() in change_page_attr(). To do the operation atomically, we can't use pte modification helpers anymore. Because all platforms have different combination of bits, it is not easy to use those bits directly. But all have the _PAGE_KERNEL_{RO/ROX/RW/RWX} set of flags. All we need it to compare two sets to know which bits are set or cleared. For instance, by comparing _PAGE_KERNEL_ROX and _PAGE_KERNEL_RO you know which bit gets cleared and which bit get set when changing exec permission.
In the Linux kernel, the following vulnerability has been resolved: hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() The acpi_power_meter driver's .notify() callback function, acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock that is also acquired by callbacks in sysfs attributes of the device being unregistered which is prone to deadlocks between sysfs access and device removal. Address this by moving the hwmon device removal in acpi_power_meter_notify() outside the lock in question, but notice that doing it alone is not sufficient because two concurrent METER_NOTIFY_CONFIG notifications may be attempting to remove the same device at the same time. To prevent that from happening, add a new lock serializing the execution of the switch () statement in acpi_power_meter_notify(). For simplicity, it is a static mutex which should not be a problem from the performance perspective. The new lock also allows the hwmon_device_register_with_info() in acpi_power_meter_notify() to be called outside the inner lock because it prevents the other notifications handled by that function from manipulating the "resource" object while the hwmon device based on it is being registered. The sending of ACPI netlink messages from acpi_power_meter_notify() is serialized by the new lock too which generally helps to ensure that the order of handling firmware notifications is the same as the order of sending netlink messages related to them. In addition, notice that hwmon_device_register_with_info() may fail in which case resource->hwmon_dev will become an error pointer, so add checks to avoid attempting to unregister the hwmon device pointer to by it in that case to acpi_power_meter_notify() and acpi_power_meter_remove().
In the Linux kernel, the following vulnerability has been resolved: net: usb: r8152: fix resume reset deadlock rtl8152 can trigger device reset during reset which potentially can result in a deadlock: **** DPM device timeout after 10 seconds; 15 seconds until panic **** Call Trace: <TASK> schedule+0x483/0x1370 schedule_preempt_disabled+0x15/0x30 __mutex_lock_common+0x1fd/0x470 __rtl8152_set_mac_address+0x80/0x1f0 dev_set_mac_address+0x7f/0x150 rtl8152_post_reset+0x72/0x150 usb_reset_device+0x1d0/0x220 rtl8152_resume+0x99/0xc0 usb_resume_interface+0x3e/0xc0 usb_resume_both+0x104/0x150 usb_resume+0x22/0x110 The problem is that rtl8152 resume calls reset under tp->control mutex while reset basically re-enters rtl8152 and attempts to acquire the same tp->control lock once again. Reset INACCESSIBLE device outside of tp->control mutex scope to avoid recursive mutex_lock() deadlock.
In the Linux kernel, the following vulnerability has been resolved: sfc: fix deadlock in RSS config read Since cited commit, core locks the net_device's rss_lock when handling ethtool -x command, so driver's implementation should not lock it again. Remove the latter.
In the Linux kernel, the following vulnerability has been resolved: can: mcp251x: fix deadlock in error path of mcp251x_open The mcp251x_open() function call free_irq() in its error path with the mpc_lock mutex held. But if an interrupt already occurred the interrupt handler will be waiting for the mpc_lock and free_irq() will deadlock waiting for the handler to finish. This issue is similar to the one fixed in commit 7dd9c26bd6cf ("can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open") but for the error path. To solve this issue move the call to free_irq() after the lock is released. Setting `priv->force_quit = 1` beforehand ensure that the IRQ handler will exit right away once it acquired the lock.
In the Linux kernel, the following vulnerability has been resolved: wifi: ath12k: fix dead lock while flushing management frames Commit [1] converted the management transmission work item into a wiphy work. Since a wiphy work can only run under wiphy lock protection, a race condition happens in below scenario: 1. a management frame is queued for transmission. 2. ath12k_mac_op_flush() gets called to flush pending frames associated with the hardware (i.e, vif being NULL). Then in ath12k_mac_flush() the process waits for the transmission done. 3. Since wiphy lock has been taken by the flush process, the transmission work item has no chance to run, hence the dead lock. >From user view, this dead lock results in below issue: wlp8s0: authenticate with xxxxxx (local address=xxxxxx) wlp8s0: send auth to xxxxxx (try 1/3) wlp8s0: authenticate with xxxxxx (local address=xxxxxx) wlp8s0: send auth to xxxxxx (try 1/3) wlp8s0: authenticated wlp8s0: associate with xxxxxx (try 1/3) wlp8s0: aborting association with xxxxxx by local choice (Reason: 3=DEAUTH_LEAVING) ath12k_pci 0000:08:00.0: failed to flush mgmt transmit queue, mgmt pkts pending 1 The dead lock can be avoided by invoking wiphy_work_flush() to proactively run the queued work item. Note actually it is already present in ath12k_mac_op_flush(), however it does not protect the case where vif being NULL. Hence move it ahead to cover this case as well. Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
In the Linux kernel, the following vulnerability has been resolved: net/rds: Fix circular locking dependency in rds_tcp_tune syzbot reported a circular locking dependency in rds_tcp_tune() where sk_net_refcnt_upgrade() is called while holding the socket lock: ====================================================== WARNING: possible circular locking dependency detected ====================================================== kworker/u10:8/15040 is trying to acquire lock: ffffffff8e9aaf80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x4b/0x6f0 but task is already holding lock: ffff88805a3c1ce0 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: rds_tcp_tune+0xd7/0x930 The issue occurs because sk_net_refcnt_upgrade() performs memory allocation (via get_net_track() -> ref_tracker_alloc()) while the socket lock is held, creating a circular dependency with fs_reclaim. Fix this by moving sk_net_refcnt_upgrade() outside the socket lock critical section. This is safe because the fields modified by the sk_net_refcnt_upgrade() call (sk_net_refcnt, ns_tracker) are not accessed by any concurrent code path at this point. v2: - Corrected fixes tag - check patch line wrap nits - ai commentary nits
In the Linux kernel, the following vulnerability has been resolved: riscv: trace: fix snapshot deadlock with sbi ecall If sbi_ecall.c's functions are traceable, echo "__sbi_ecall:snapshot" > /sys/kernel/tracing/set_ftrace_filter may get the kernel into a deadlock. (Functions in sbi_ecall.c are excluded from tracing if CONFIG_RISCV_ALTERNATIVE_EARLY is set.) __sbi_ecall triggers a snapshot of the ringbuffer. The snapshot code raises an IPI interrupt, which results in another call to __sbi_ecall and another snapshot... All it takes to get into this endless loop is one initial __sbi_ecall. On RISC-V systems without SSTC extension, the clock events in timer-riscv.c issue periodic sbi ecalls, making the problem easy to trigger. Always exclude the sbi_ecall.c functions from tracing to fix the potential deadlock. sbi ecalls can easiliy be logged via trace events, excluding ecall functions from function tracing is not a big limitation.
In the Linux kernel, the following vulnerability has been resolved: rust_binder: call set_notification_done() without proc lock Consider the following sequence of events on a death listener: 1. The remote process dies and sends a BR_DEAD_BINDER message. 2. The local process invokes the BC_CLEAR_DEATH_NOTIFICATION command. 3. The local process then invokes the BC_DEAD_BINDER_DONE. Then, the kernel will reply to the BC_DEAD_BINDER_DONE command with a BR_CLEAR_DEATH_NOTIFICATION_DONE reply using push_work_if_looper(). However, this can result in a deadlock if the current thread is not a looper. This is because dead_binder_done() still holds the proc lock during set_notification_done(), which called push_work_if_looper(). Normally, push_work_if_looper() takes the thread lock, which is fine to take under the proc lock. But if the current thread is not a looper, then it falls back to delivering the reply to the process work queue, which involves taking the proc lock. Since the proc lock is already held, this is a deadlock. Fix this by releasing the proc lock during set_notification_done(). It was not intentional that it was held during that function to begin with. I don't think this ever happens in Android because BC_DEAD_BINDER_DONE is only invoked in response to BR_DEAD_BINDER messages, and the kernel always delivers BR_DEAD_BINDER to a looper. So there's no scenario where Android userspace will call BC_DEAD_BINDER_DONE on a non-looper thread.
In the Linux kernel, the following vulnerability has been resolved: net: phy: register phy led_triggers during probe to avoid AB-BA deadlock There is an AB-BA deadlock when both LEDS_TRIGGER_NETDEV and LED_TRIGGER_PHY are enabled: [ 1362.049207] [<8054e4b8>] led_trigger_register+0x5c/0x1fc <-- Trying to get lock "triggers_list_lock" via down_write(&triggers_list_lock); [ 1362.054536] [<80662830>] phy_led_triggers_register+0xd0/0x234 [ 1362.060329] [<8065e200>] phy_attach_direct+0x33c/0x40c [ 1362.065489] [<80651fc4>] phylink_fwnode_phy_connect+0x15c/0x23c [ 1362.071480] [<8066ee18>] mtk_open+0x7c/0xba0 [ 1362.075849] [<806d714c>] __dev_open+0x280/0x2b0 [ 1362.080384] [<806d7668>] __dev_change_flags+0x244/0x24c [ 1362.085598] [<806d7698>] dev_change_flags+0x28/0x78 [ 1362.090528] [<807150e4>] dev_ioctl+0x4c0/0x654 <-- Hold lock "rtnl_mutex" by calling rtnl_lock(); [ 1362.094985] [<80694360>] sock_ioctl+0x2f4/0x4e0 [ 1362.099567] [<802e9c4c>] sys_ioctl+0x32c/0xd8c [ 1362.104022] [<80014504>] syscall_common+0x34/0x58 Here LED_TRIGGER_PHY is registering LED triggers during phy_attach while holding RTNL and then taking triggers_list_lock. [ 1362.191101] [<806c2640>] register_netdevice_notifier+0x60/0x168 <-- Trying to get lock "rtnl_mutex" via rtnl_lock(); [ 1362.197073] [<805504ac>] netdev_trig_activate+0x194/0x1e4 [ 1362.202490] [<8054e28c>] led_trigger_set+0x1d4/0x360 <-- Hold lock "triggers_list_lock" by down_read(&triggers_list_lock); [ 1362.207511] [<8054eb38>] led_trigger_write+0xd8/0x14c [ 1362.212566] [<80381d98>] sysfs_kf_bin_write+0x80/0xbc [ 1362.217688] [<8037fcd8>] kernfs_fop_write_iter+0x17c/0x28c [ 1362.223174] [<802cbd70>] vfs_write+0x21c/0x3c4 [ 1362.227712] [<802cc0c4>] ksys_write+0x78/0x12c [ 1362.232164] [<80014504>] syscall_common+0x34/0x58 Here LEDS_TRIGGER_NETDEV is being enabled on an LED. It first takes triggers_list_lock and then RTNL. A classical AB-BA deadlock. phy_led_triggers_registers() does not require the RTNL, it does not make any calls into the network stack which require protection. There is also no requirement the PHY has been attached to a MAC, the triggers only make use of phydev state. This allows the call to phy_led_triggers_registers() to be placed elsewhere. PHY probe() and release() don't hold RTNL, so solving the AB-BA deadlock.
In the Linux kernel, the following vulnerability has been resolved: can: bcm: fix locking for bcm_op runtime updates Commit c2aba69d0c36 ("can: bcm: add locking for bcm_op runtime updates") added a locking for some variables that can be modified at runtime when updating the sending bcm_op with a new TX_SETUP command in bcm_tx_setup(). Usually the RX_SETUP only handles and filters incoming traffic with one exception: When the RX_RTR_FRAME flag is set a predefined CAN frame is sent when a specific RTR frame is received. Therefore the rx bcm_op uses bcm_can_tx() which uses the bcm_tx_lock that was only initialized in bcm_tx_setup(). Add the missing spin_lock_init() when allocating the bcm_op in bcm_rx_setup() to handle the RTR case properly.