From 3a9b8741925700c7cab81d06b3522e8ab9f740e1 Mon Sep 17 00:00:00 2001
From: Brendan Cunningham
Date: Thu, 7 Sep 2023 15:30:52 -0400
Subject: [PATCH] nvidia_p2p_get_pages(): Fix double-free in register-callback
 error path

Double-free in rm_p2p_register_callback() error-path in
nv_p2p_get_pages() causes memory corruption that leads to a kernel
panic.

Fix this by adding a separate goto for this error path that skips
freeing the already-freed memory.

Double-free can be produced by calling nvidia_p2p_get_pages() on one
CPU while simultaneously freeing the GPU virtual address range passed
into nvidia_p2p_get_pages() on another CPU.

Producing the double-free is timing dependent and may require multiple
tries.

'slub_debug=FZ' kernel boot parameter shows the double-free:

[ 239.115091] =============================================================================
[ 239.124659] BUG kmalloc-16 (Tainted: G OE ): Object already free
[ 239.133011] -----------------------------------------------------------------------------
[ 239.144491] Slab 0xfffffa8bc4434140 objects=85 used=82 fp=0xffff9a3dd0d05910 flags=0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
[ 239.158997] Object 0xffff9a3dd0d05670 @offset=1648 fp=0x0000000000000000
[ 239.168766] Redzone ffff9a3dd0d05660: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 239.179633] Object ffff9a3dd0d05670: 10 00 00 00 00 00 00 00 e5 04 3f 13 96 18 8e 47 ..........?....G
[ 239.190641] Redzone ffff9a3dd0d05680: bb bb bb bb bb bb bb bb ........
[ 239.200739] Padding ffff9a3dd0d05688: 84 80 0e 00 00 00 00 00 ........
[ 239.210938] CPU: 0 PID: 3150 Comm: hfi-sdma-test Kdump: loaded Tainted: G OE 6.5.0-rc1+ #1
[ 239.221911] Hardware name: Intel Corporation S2600CWR/S2600CWR, BIOS SE5C610.86B.01.01.1029.090220201031 09/02/2020
[ 239.233948] Call Trace:
[ 239.236992]
[ 239.239608] dump_stack_lvl+0x33/0x50
[ 239.244010] object_err+0x3a/0x80
[ 239.248014] free_debug_processing+0x265/0x360
[ 239.253392] ? nv_p2p_get_pages+0x163/0x590 [nvidia]
[ 239.259399] free_to_partial_list+0x80/0x280
[ 239.264478] ? nv_p2p_get_pages+0x163/0x590 [nvidia]
[ 239.270426] nv_p2p_get_pages+0x163/0x590 [nvidia]
[ 239.276303] ? __pfx_remove_nvidia_pages+0x10/0x10 [hfi1]
[ 239.282692] nvidia_p2p_get_pages+0x25/0x40 [nvidia]
[ 239.288601] ? __pfx_remove_nvidia_pages+0x10/0x10 [hfi1]
...
[ 239.498990]
[ 239.501662] Disabling lock debugging due to kernel taint
[ 239.507828] FIX kmalloc-16: Object at 0xffff9a3dd0d05670 not freed

Signed-off-by: Brendan Cunningham
---
 kernel-open/nvidia/nv-p2p.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel-open/nvidia/nv-p2p.c b/kernel-open/nvidia/nv-p2p.c
index e6fa5c5e8..2060efb2b 100644
--- a/kernel-open/nvidia/nv-p2p.c
+++ b/kernel-open/nvidia/nv-p2p.c
@@ -518,7 +518,7 @@ static int nv_p2p_get_pages(
                 *page_table, nv_p2p_mem_info_free_callback, mem_info);
         if (status != NV_OK)
         {
-            goto failed;
+            goto failed_nofree;
         }
     }
 
@@ -542,6 +542,7 @@ failed:
         os_free_mem(rreqmb_h);
     }
 
+failed_nofree:
     if (bGetPages)
     {
         (void)nv_p2p_put_pages(pt_type, sp, p2p_token, va_space,