Kernel panic on 9.2.11 #103

@maxpain

I get a kernel panic almost every day.
In rare cases, it happens on multiple nodes at the same time.
I use the DRBD 9.2.11 extension on Talos Linux v1.8.2 (Linux 6.6.58) on bare-metal nodes (Dell R6615) with NVMe SSDs.
I use Piraeus Operator on my Kubernetes cluster.

For the network, I use 2x25G Broadcom NICs (50G in an LACP bond) with MTU 9000 (jumbo frames).
Linstor satellite pods are deployed with hostNetwork: true.

[40145.614353] general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
[40145.624361] CPU: 18 PID: 234918 Comm: conn48291 Tainted: G           O       6.6.58-talos #1
[40145.632800] Hardware name: Dell Inc. PowerEdge R6615/067N9T, BIOS 1.9.5 09/12/2024
[40145.640376] RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
[40145.645609] Code: 90 90 0f 1f 44 00 00 65 48 8b 04 25 80 e3 02 00 48 83 b8 30 0b 00 00 00 74 60 48 8b 80 30 0b 00 00 48 8b 50 30 48 85 d2 74 50 <80> 3a 55 b8 01 00 00 00 74 1b 48 8b 8f 88 00 00 00 48 83 f9 33 74
[40145.664366] RSP: 0018:ffffc900007c8bc8 EFLAGS: 00010082
[40145.669599] RAX: ffff88813eafb120 RBX: ffffc900007c8c20 RCX: 00007f116e206296
[40145.676740] RDX: 9e759c37ee555c76 RSI: 0000000000000001 RDI: ffffc90111fa3f58
[40145.683880] RBP: ffffc90111fa3f58 R08: 000000000002aee0 R09: 0000000000000008
[40145.691021] R10: ffffc90111fa0000 R11: ffffc900007c8ff8 R12: 0000000000000000
[40145.698162] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[40145.705303] FS:  00007f113e959700(0000) GS:ffff88defb500000(0000) knlGS:0000000000000000
[40145.713398] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40145.719155] CR2: 000015b40194c804 CR3: 0000000363b74003 CR4: 0000000000f70ee0
[40145.726294] PKRU: 55555554
[40145.729014] Call Trace:
[40145.731468]  <IRQ>
[40145.733502]  ? die_addr+0x36/0x90
[40145.736836]  ? exc_general_protection+0x217/0x420
[40145.741553]  ? asm_exc_general_protection+0x26/0x30
[40145.746450]  ? is_uprobe_at_func_entry+0x28/0x80
[40145.751083]  perf_callchain_user+0x20a/0x360
[40145.755365]  get_perf_callchain+0x147/0x1d0
[40145.759559]  bpf_get_stackid+0x60/0x90
[40145.763319]  bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
[40145.769333]  ? __smp_call_single_queue+0xad/0x120
[40145.774049]  bpf_overflow_handler+0x75/0x110
[40145.778330]  __perf_event_overflow+0x114/0x360
[40145.782787]  perf_swevent_hrtimer+0x134/0x150
[40145.787155]  ? __wake_up_common+0x73/0x180
[40145.791258]  ? timerqueue_del+0x2e/0x50
[40145.795107]  ? __pfx_perf_swevent_hrtimer+0x10/0x10
[40145.799996]  __hrtimer_run_queues+0x118/0x240
[40145.804365]  ? ktime_get_update_offsets_now+0x49/0x110
[40145.809511]  hrtimer_interrupt+0xf8/0x240
[40145.813531]  __sysvec_apic_timer_interrupt+0x4a/0xe0
[40145.818508]  sysvec_apic_timer_interrupt+0x6d/0x90
[40145.823310]  </IRQ>
[40145.825426]  <TASK>
[40145.827537]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[40145.832687] RIP: 0010:__kmem_cache_free+0x1cb/0x350
[40145.837576] Code: 48 85 db 0f 84 00 01 00 00 48 89 c2 48 0f ca 49 33 94 24 b8 00 00 00 48 89 10 49 8b 04 24 65 48 03 05 99 bd 37 61 48 8b 70 08 <4c> 39 68 10 0f 85 0b 01 00 00 48 8b 10 41 8b 44 24 28 48 01 d8 48
[40145.856331] RSP: 0018:ffffc90111fa3b70 EFLAGS: 00000282
[40145.861561] RAX: ffff88defb533910 RBX: ffff88813eafb120 RCX: ffffea0000000000
[40145.868698] RDX: 9e759c37ee555c76 RSI: 0000000000119862 RDI: ffff88810004e200
[40145.875836] RBP: ffffc90111fa3bc0 R08: 0000000000000086 R09: 00007f1153f9f9c0
[40145.882980] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810004e200
[40145.890120] R13: ffffea0004fabec0 R14: 0000000000000000 R15: 0000000000000000
[40145.897266]  ? uprobe_free_utask+0x62/0x80
[40145.901378]  ? acct_collect+0x4c/0x220
[40145.905141]  uprobe_free_utask+0x62/0x80
[40145.909075]  mm_release+0x12/0xb0
[40145.912401]  do_exit+0x26b/0xaa0
[40145.915643]  __x64_sys_exit+0x1b/0x20
[40145.919317]  do_syscall_64+0x5a/0x80
[40145.922911]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[40145.927976] RIP: 0033:0x7f116e206296
[40145.931565] Code: 28 06 00 00 0f 84 ec 01 00 00 48 8b 44 24 08 f6 80 08 03 00 00 40 0f 85 7a 01 00 00 ba 3c 00 00 00 0f 1f 00 31 ff 89 d0 0f 05 <eb> f8 48 89 c8 48 c7 00 00 00 00 00 48 8d 48 f8 48 39 d0 75 ed 48
[40145.950321] RSP: 002b:00007f113e958a40 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
[40145.957891] RAX: ffffffffffffffda RBX: 00007f113e859000 RCX: 00007f116e206296
[40145.965033] RDX: 000000000000003c RSI: 00007f1153f9f9c0 RDI: 0000000000000000
[40145.972177] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000056b90006
[40145.979317] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1149f8925e
[40145.986456] R13: 00007f1149f8925f R14: 00007f113e959700 R15: 00007f113e958b00
[40145.993606]  </TASK>
[40145.995808] Modules linked in: drbd_transport_tcp(O) drbd(O) ahci i40e sp5100_tco bnxt_en amd64_edac megaraid_sas libahci nvme k10temp watchdog
[40146.008673] ---[ end trace 0000000000000000 ]---
[40146.013298] RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
[40146.018531] Code: 90 90 0f 1f 44 00 00 65 48 8b 04 25 80 e3 02 00 48 83 b8 30 0b 00 00 00 74 60 48 8b 80 30 0b 00 00 48 8b 50 30 48 85 d2 74 50 <80> 3a 55 b8 01 00 00 00 74 1b 48 8b 8f 88 00 00 00 48 83 f9 33 74
[40146.037290] RSP: 0018:ffffc900007c8bc8 EFLAGS: 00010082
[40146.042521] RAX: ffff88813eafb120 RBX: ffffc900007c8c20 RCX: 00007f116e206296
[40146.049662] RDX: 9e759c37ee555c76 RSI: 0000000000000001 RDI: ffffc90111fa3f58
[40146.056805] RBP: ffffc90111fa3f58 R08: 000000000002aee0 R09: 0000000000000008
[40146.063946] R10: ffffc90111fa0000 R11: ffffc900007c8ff8 R12: 0000000000000000
[40146.071088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[40146.078227] FS:  00007f113e959700(0000) GS:ffff88defb500000(0000) knlGS:0000000000000000
[40146.086321] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40146.092077] CR2: 000015b40194c804 CR3: 0000000363b74003 CR4: 0000000000f70ee0
[40146.099222] PKRU: 55555554
[40146.101943] Kernel panic - not syncing: Fatal exception in interrupt
[40146.108739] Kernel Offset: disabled
[40146.112246] Rebooting in 10 seconds..
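
For anyone triaging the trace above, here is a minimal Python sketch that does two things: it checks whether the faulting address reported in the first line (the value in RDX) is a canonical x86-64 address, and it lists the call-trace frames that are not prefixed with `?`. The helper names `is_canonical` and `reliable_frames` are made up for illustration, and 48-bit virtual addresses (4-level paging) are assumed. It only reads text from the log and is not a diagnosis.

```python
import re

def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumes va_bits-bit virtual addressing (48 for 4-level paging):
    bits 63 down to va_bits-1 must be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

# Matches lines such as "[40145.751083]  perf_callchain_user+0x20a/0x360".
# Group 1 captures the optional "? " marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)

def reliable_frames(log: str) -> list[str]:
    """Return symbols of stack frames that are not marked with '?'."""
    frames = []
    for line in log.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames

# A few lines copied from the log above.
sample = """\
[40145.746450]  ? is_uprobe_at_func_entry+0x28/0x80
[40145.751083]  perf_callchain_user+0x20a/0x360
[40145.755365]  get_perf_callchain+0x147/0x1d0
[40145.759559]  bpf_get_stackid+0x60/0x90
"""

print(is_canonical(0x9E759C37EE555C76))  # False: the address from the GPF line
print(reliable_frames(sample))
```

Run over the full log, `reliable_frames` returns the frames the unwinder did not flag as uncertain, in the order they appear in the trace.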
