[BUG] Uploading a system image triggers a kernel panic and reboots the host #21366

Open
huhaiqwer opened this issue Oct 9, 2024 · 2 comments
Labels
bug Something isn't working state/awaiting processing

Comments

@huhaiqwer

What happened:

After uploading a system image and launching a virtual machine from it, the host reboots. The kernel panic log is as follows:

[ 760.046684] BUG: kernel NULL pointer dereference, address: 0000000000000178
[ 778.433195] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
[ 778.433196] Modules linked in: act_police cls_basic sch_ingress vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_multiport ipt_rpfilter iptable_raw ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel openvswitch nf_conncount vhost_net vhost vhost_iotlb tap tun xt_addrtype xt_set ip_set_hash_ipportnet ip_set_bitmap_port ip_set_hash_ipportip ip_set_hash_ipport dummy nf_tables ip_set ip6table_mangle iptable_mangle ip6table_filter ip6table_nat ip6_tables xt_MASQUERADE xt_conntrack xt_comment xt_mark xt_nat iptable_filter iptable_nat ip_tables veth nf_conntrack_netlink nfnetlink rfkill overlay ip_vs_ftp nf_nat ip_vs_sed ip_vs_nq ip_vs_fo ip_vs_sh ip_vs_dh ip_vs_lblcr ip_vs_lblc ip_vs_wrr ip_vs_rr ip_vs_wlc ip_vs_lc ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c sunrpc dm_snapshot dm_bufio intel_powerclamp mgag200 coretemp joydev kvm_intel i2c_algo_bit drm_kms_helper kvm syscopyarea sysfillrect sysimgblt irqbypass ses fb_sys_fops cec enclosure ipmi_ssif dcdbas
[ 778.433237] scsi_transport_sas pcspkr sg iTCO_wdt iTCO_vendor_support acpi_power_meter ipmi_si intel_cstate gpio_ich intel_uncore ipmi_devintf i7core_edac ipmi_msghandler lpc_ich acpi_cpufreq br_netfilter bridge stp llc drm fuse ext4 mbcache jbd2 sr_mod cdrom sd_mod t10_pi ata_generic crct10dif_pclmul crc32_pclmul ata_piix crc32c_intel libata megaraid_sas ghash_clmulni_intel serio_raw bnx2 wmi dm_mirror dm_region_hash dm_log dm_mod
[ 778.433257] CPU: 3 PID: 32884 Comm: etcd Kdump: loaded Tainted: G S I 5.10.0-182.0.0.95.oe2203sp3.x86_64 #1
[ 778.433258] Hardware name: Dell Inc. PowerEdge R610/08GXHX, BIOS 6.3.0 07/24/2012
[ 778.433259] RIP: 0010:native_queued_spin_lock_slowpath+0x179/0x1c0
[ 778.433261] Code: eb eb c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 6c 03 00 48 03 04 f5 20 5b a1 94 48 89 10 8b 42 08 85 c0 75 09 90 8b 42 08 85 c0 74 f7 48 8b 32 48 85 f6 74 97 0f 18 0e eb 92
[ 778.433261] RSP: 0018:ffffa51ae1d13a10 EFLAGS: 00000046
[ 778.433262] RAX: 0000000000000000 RBX: ffff979e0fab5d40 RCX: 0000000000100000
[ 778.433263] RDX: ffff979e0f8b6c00 RSI: 0000000000000015 RDI: ffff979e0fab5d40
[ 778.433264] RBP: ffff979e0fab5d40 R08: 0000000000000003 R09: 000000000000000b
[ 778.433265] R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000046
[ 778.433265] R13: ffffa51ae1d13c48 R14: 000000000000000b R15: 0000000000000003
[ 778.433266] FS: 000000c00008c090(0000) GS:ffff979e0f880000(0000) knlGS:0000000000000000
[ 778.433267] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 778.433268] CR2: 000000c00190b120 CR3: 0000000d92ab0004 CR4: 00000000000226e0
[ 778.433268] Call Trace:
[ 778.433269]
[ 778.433269] ? watchdog_hardlockup_check.part.0.cold+0x21/0x73
[ 778.433270] ? __perf_event_overflow+0x52/0x100
[ 778.433270] ? handle_pmi_common+0x218/0x2d0
[ 778.433271] ? set_pte_vaddr_p4d+0x3f/0x50
[ 778.433272] ? flush_tlb_one_kernel+0xa/0x20
[ 778.433272] ? native_set_fixmap+0x4f/0x70
[ 778.433273] ? ghes_copy_tofrom_phys+0x74/0x120
[ 778.433274] ? __ghes_peek_estatus.isra.0+0x49/0xb0
[ 778.433274] ? intel_pmu_handle_irq+0xcb/0x1c0
[ 778.433275] ? perf_event_nmi_handler+0x28/0x50
[ 778.433275] ? nmi_handle+0x58/0x100
[ 778.433276] ? default_do_nmi+0x42/0x140
[ 778.433277] ? exc_nmi+0x122/0x160
[ 778.433277] ? end_repeat_nmi+0x16/0x67
[ 778.433278] ? native_queued_spin_lock_slowpath+0x179/0x1c0
[ 778.433279] ? native_queued_spin_lock_slowpath+0x179/0x1c0
[ 778.433280] ? native_queued_spin_lock_slowpath+0x179/0x1c0
[ 778.433281]
[ 778.433281] _raw_spin_lock+0x1e/0x30
[ 778.433282] raw_spin_rq_lock_nested+0xa/0x10
[ 778.433283] update_blocked_averages+0x44/0x120
[ 778.433283] update_nohz_stats+0x40/0x60
[ 778.433284] find_busiest_group+0x287/0xa70
[ 778.433285] load_balance+0x15b/0x6f0
[ 778.433285] newidle_balance+0x154/0x2f0
[ 778.433286] pick_next_task_fair+0x351/0xb10
[ 778.433286] pick_next_task+0x34/0x120
[ 778.433287] __schedule+0x1a1/0x670
[ 778.433287] schedule+0x46/0xb0
[ 778.433288] do_nanosleep+0x71/0x190
[ 778.433288] hrtimer_nanosleep+0x9b/0x140
[ 778.433289] ? hrtimer_init_sleeper+0x80/0x80
[ 778.433290] __se_sys_nanosleep+0xab/0xe0
[ 778.433290] do_syscall_64+0x40/0x80
[ 778.433291] entry_SYSCALL_64_after_hwframe+0x62/0xc7
[ 778.433291] RIP: 0033:0x45e42d
[ 778.433293] Code: 8b 44 24 20 b9 40 42 0f 00 f7 f1 48 89 04 24 b8 e8 03 00 00 f7 e2 48 89 44 24 08 48 89 e7 be 00 00 00 00 b8 23 00 00 00 0f 05 <48> 8b 6c 24 10 48 83 c4 18 c3 cc cc cc cc cc cc cc cc cc b8 ba 00
[ 778.433293] RSP: 002b:000000c00009bf00 EFLAGS: 00000206 ORIG_RAX: 0000000000000023
[ 778.433295] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045e42d
[ 778.433296] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000c00009bf00
[ 778.433296] RBP: 000000c00009bf10 R08: 0000000000000000 R09: 0000000000000000
[ 778.433297] R10: 00007fff129bd080 R11: 0000000000000206 R12: 0000000000431e10
[ 778.433298] R13: 0000000000000011 R14: 000000000123bfc8 R15: 0000000000000000
[ 778.433299] Kernel panic - not syncing: Hard LOCKUP

How can this problem be resolved?

Environment:

  • OS : openEuler 22.03 SP3

  • Kernel : 5.10.0-230.0.0.132.oe2203sp3.x86_64

  • Host:
    image

  • Version : v3.10.8
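When triaging a trace like the one above, it can help to pull out the few lines that matter before searching for a matching fix. The following is a minimal sketch, not part of the original report: the function name `summarize_oops` and the regular expression are illustrative assumptions about the usual `CPU: ... PID: ... Comm: ...` header format, and other kernels may print that header differently.

```python
import re

# Header line printed with each oops, e.g.
# "CPU: 3 PID: 32884 Comm: etcd ... 5.10.0-182.0.0.95.oe2203sp3.x86_64 #1"
CPU_LINE = re.compile(
    r"CPU: \d+ PID: \d+ Comm: (?P<comm>\S+).*?"
    r"(?P<release>\d+\.\d+\.\d+-\S+) #\d+"
)
PANIC_MARK = "Kernel panic - not syncing:"


def summarize_oops(log_text):
    """Collect the first BUG line, the running task, the kernel release and the panic reason."""
    summary = {"first_bug": None, "comm": None, "release": None, "panic": None}
    for line in log_text.splitlines():
        # Keep only the first BUG line; later ones are usually follow-on damage.
        if summary["first_bug"] is None and "BUG:" in line:
            summary["first_bug"] = line.split("BUG:", 1)[1].strip()
        match = CPU_LINE.search(line)
        if match and summary["release"] is None:
            summary["comm"] = match.group("comm")
            summary["release"] = match.group("release")
        if PANIC_MARK in line:
            summary["panic"] = line.split(PANIC_MARK, 1)[1].strip()
    return summary
```

Run against the log in this report, the `release` field comes from the `CPU: 3 PID: 32884` line, which shows 5.10.0-182.0.0.95.oe2203sp3.x86_64, while the Environment section lists 5.10.0-230.0.0.132.oe2203sp3.x86_64. Comparing the two makes it clear which kernel actually produced the trace.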

@wanyaoqi
Member

[ 760.046684] BUG: kernel NULL pointer dereference, address: 0000000000000178

#21306 (comment)
image
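For a kernel issue like this, you can enable kdump to debug it and see whether you can find a matching bugfix; otherwise, try a different kernel version.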
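Before relying on kdump, it is worth confirming that a crash kernel is actually loaded on the host. The sketch below is an illustration rather than a project-provided tool: it reads the standard sysfs files `/sys/kernel/kexec_crash_loaded` and `/sys/kernel/kexec_crash_size`, and the function name `kdump_status` and its `sys_kernel` parameter are hypothetical names chosen for this example.

```python
from pathlib import Path


def kdump_status(sys_kernel="/sys/kernel"):
    """Report whether a crash kernel is loaded and how much memory is reserved for it."""
    base = Path(sys_kernel)

    def read_int(name):
        # Return None when the file is missing or unreadable (e.g. kexec not built in).
        try:
            return int((base / name).read_text().strip())
        except (OSError, ValueError):
            return None

    return {
        # 1 means a crash kernel has been loaded with kexec and kdump can capture a vmcore.
        "crash_kernel_loaded": read_int("kexec_crash_loaded") == 1,
        # Bytes reserved through the crashkernel= boot parameter; 0 means nothing reserved.
        "reserved_bytes": read_int("kexec_crash_size"),
    }
```

Calling `kdump_status()` on the affected host returns a small dictionary. Note that the trace in this issue already prints `Kdump: loaded` on its `CPU:` line, so a vmcore may already have been written; where it lands depends on the host's kdump configuration.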

@huhaiqwer
Author

@wanyaoqi Which kernel version do you officially use on openEuler 22.03 SP3?
