Bug #111349
Test-IT #110867: V1.0 test feature summary
Test-IT #110868: BSP-BVT test, power on/off test
[BSP][EVT][power][Low probability] Device occasionally fails to boot after adb reboot
Status: | CLOSED | Start date: | 2022-09-15 |
---|---|---|---|
Priority: | High | Due date: | |
Assignee: | SZTS_TEST 邹涛 | % Done: | 100% |
Category: | SYSTEM | | |
Target version: | VX1_MCE_FSE_V3.0_20221030 | | |
Need_Info: | -- | Found Version: | 0.0.0.20220818_alpha_004 |
Resolution: | -- | Degrated: | -- |
Severity: | Critical | Verified Version: | cdiot /Pre_figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220907/202209072155-1168 |
Reproducibility: | Rarely | Fixed Version: | 2022-10-19 |
Test Type: | Bring Up Test | Root cause: | https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/147583 drm_atomic_state_alloc does not check for a NULL pointer |
Description
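Preconditions:
1. All DUT modules function normally and the DUT is powered on
Steps:
1. Run the adb reboot command from cmd and watch the device state during the reboot
Actual result:
1. The board occasionally (1 in 12) fails to boot
Expected result: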
Subtasks
Related issues
History
#1 Updated by CD SYSTEM-赵正军 almost 3 years ago
- Assignee changed from SZ 系统-张丽果 to CD SYSTEM-赵正军
- % Done changed from 0 to 50
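The existing logs show a NULL pointer dereference in the drm display path. Local stress testing also reproduced it (once in 50 reboot cycles). A ramdump has been captured and is pending further analysis.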
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[14:42:38][ 32.178041] OF: graph: no port node found in /soc/qcom,dsi-display-primary
[14:42:38][ 32.185230] [drm] [msm-dsi-warn]: [hx82101+hx8692 video mode dsi truly panel] fallback to default te-pin-select
[14:42:38][ 32.195660] NPU_INFO: npu_reboot_handler: 760 Device is rebooting with code 1
[14:42:38][ 32.197156] msm_drm ae00000.qcom,mdss_mdp: Linked as a consumer to regulator.31
[14:42:38][ 32.210942] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[14:42:38][ 32.211016] register_client_adhoc:find path.src 1 dest 590
[14:42:38][ 32.212332] register_client_adhoc:Client handle 16 mdss_reg
[14:42:38][ 32.220007] Mem abort info:
[14:42:38][ 32.220014] ESR = 0x96000006
[14:42:38][ 32.220018] Exception class = DABT (current EL), IL = 32 bits
[14:42:38][ 32.220021] SET = 0, FnV = 0
[14:42:38][ 32.225816] register_client_adhoc:find path.src 22 dest 512
[14:42:38][ 32.227070] register_client_adhoc:find path.src 23 dest 512
[14:42:38][ 32.231438] EA = 0, S1PTW = 0
[14:42:38][ 32.235499] register_client_adhoc:Client handle 66 mdss_sde
[14:42:38][ 32.235925] [drm:sde_dbg_init:4676] evtlog_status: enable:11, panic:1, dump:2
[14:42:38][ 32.237484] Data abort info:
[14:42:38][ 32.237487] ISV = 0, ISS = 0x00000006
[14:42:38][ 32.237490] CM = 0, WnR = 0
[14:42:38][ 32.237495] user pgtable: 4k pages, 39-bit VAs, pgdp = ffffffdebef1d000
[14:42:38][ 32.237498] [0000000000000030] pgd=00000000ffc2e003, pud=00000000ffc2e003, pmd=0000000000000000
[14:42:38][ 32.243668] msm_drm ae00000.qcom,mdss_mdp: bound soc:qcom,wb-display@0 (ops sde_wb_comp_ops)
[14:42:38][ 32.246933] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[14:42:42][ 32.253011] msm-dsi-display soc:qcom,dsi-display-primary: Linked as a consumer to regulator.25
[14:42:42][ 32.258227] Modules linked in:
[14:42:42][ 32.258229] Process init (pid: 1, stack limit = 0xffffff8008058000)
[14:42:42][ 32.258231] CPU: 4 PID: 1 Comm: init Tainted: G S W 4.19.157+ #1
[14:42:42][ 32.258232] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b (DT)
[14:42:42][ 32.258233] pstate: 60400005 (nZCv daif +PAN -UAO)
[14:42:42][ 32.258239] pc : drm_atomic_state_alloc+0x18/0x80
[14:42:42][ 32.258245] lr : msm_lastclose+0x28c/0x458
[14:42:42][ 32.262005] [drm:dsi_display_bind] [msm-dsi-info]: Successfully bind display panel 'qcom,mdss_dsi_panel_hx82101_hx8692_truly_v2_video'
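The Mem abort fields in the oops above can be cross-checked by decoding the ESR value by hand. The following is a minimal sketch, assuming the standard ARMv8 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, with WnR at ISS bit 6 and the fault status code in ISS bits 5:0). The name decode_esr and the small lookup tables are illustrative and cover only the values relevant here.

```python
# Minimal ESR_ELx decoder for data aborts (illustrative, not exhaustive).
EC_NAMES = {
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}

# Upper four bits of the 6-bit data fault status code; the low two bits
# carry the page table level for these fault types.
FAULT_TYPES = {
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr: int) -> dict:
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # instruction length bit
    iss = esr & 0x1FFFFFF        # instruction specific syndrome
    dfsc = iss & 0x3F            # data fault status code
    return {
        "ec": EC_NAMES.get(ec, hex(ec)),
        "il_bits": 32 if il else 16,
        "iss": iss,
        "write": bool((iss >> 6) & 0x1),   # WnR: True for a write access
        "fault": FAULT_TYPES.get(dfsc >> 2, "other"),
        "level": dfsc & 0x3,               # only meaningful for the types above
    }


if __name__ == "__main__":
    print(decode_esr(0x96000006))
```

For 0x96000006 this reports a read access (WnR = 0) that hit a level 2 translation fault at the current EL, which agrees with the pmd=0000000000000000 entry in the page table walk printed by the kernel.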
#2 Updated by CD SYSTEM-赵正军 almost 3 years ago
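1. After adding debug code and flashing only the boot/dtbo partitions, stress testing occasionally hits an image verification error and the system cannot boot.
After discussing with the security team, the advice is not to flash individual images when testing stability issues, and to trigger a Jenkins VB build for verification instead.
2. With a full local build and the xbl file replaced by the one from the DB build, the reboot failure reproduces reliably, and the logs match.
3. In parallel, a case has been raised with Qualcomm to parse the dump file. An earlier local attempt to parse it failed. The build itself is fine, so we need to confirm whether the symbol file is packaged correctly.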
#3 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File Serial-COM9_20220901230244.rar added
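Verified on build 0.0.0.20220901_alpha_private_userdebug: after 23 adb reboot cycles the board failed to boot, and it recovered after unplugging and replugging DC power. The serial log has been uploaded.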
#4 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File Port_COM10.rar added
#5 Updated by CD SYSTEM-赵正军 almost 3 years ago
- File 2022-09-05-22-23-29.png added
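Qualcomm currently attributes the problem to PCIE and has provided a proposed change (attached).
Verification shows the problem still occurs.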
#6 Updated by CD SYSTEM-赵正军 almost 3 years ago
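Disabled continuous display with "fastboot oem select-display-panel none".
Investigated the drm-related code changes and removed commits #14061#140603; the build still shows the problem.
The current plan is to analyze the error logs that precede the failure. The existing logs also show a problem with partition selection, as follows: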
[ 35.197102] init: [libfs_mgr]Error updating for slotselect
[ 35.202770] init: [libfs_mgr]ReadFstabFromFile(): failed to load fstab from : '/fstab.qcom'
#7 Updated by CD SYSTEM-赵正军 almost 3 years ago
What is certain so far is that in drm_atomic_state_alloc in drivers/gpu/drm/drm_atomic.c,
the struct pointer config and its member funcs can be NULL. The following change currently handles the NULL case:
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index cb3cc5a2d2ef..07332b7fda41 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -112,6 +112,10 @@ struct drm_atomic_state *
 drm_atomic_state_alloc(struct drm_device *dev)
 {
 	struct drm_mode_config *config = &dev->mode_config;
+	if (!config || !config->funcs) {
+		printk("drmtest %d %s\n",__LINE__, __func__);
+		return NULL;
+	}

 	if (!config->funcs->atomic_state_alloc) {
 		struct drm_atomic_state *state;
A VB is currently being built for verification.
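The fault address 0x30 in the original oops is consistent with config->funcs being NULL: reading the atomic_state_alloc member through a NULL funcs pointer touches address 0 plus that member's offset. The sketch below models this with ctypes. It assumes the upstream Linux 4.19 member order of struct drm_mode_config_funcs and 8-byte function pointers on arm64; the msm-4.19 tree may differ, so the layout should be checked against the actual header. The class and helper names are illustrative.

```python
import ctypes


class DrmModeConfigFuncs(ctypes.Structure):
    # Assumed member order (upstream Linux 4.19, include/drm/drm_mode_config.h).
    # Each member is a function pointer, modelled here as a 64-bit integer.
    _fields_ = [
        ("fb_create", ctypes.c_uint64),
        ("get_format_info", ctypes.c_uint64),
        ("output_poll_changed", ctypes.c_uint64),
        ("mode_valid", ctypes.c_uint64),
        ("atomic_check", ctypes.c_uint64),
        ("atomic_commit", ctypes.c_uint64),
        ("atomic_state_alloc", ctypes.c_uint64),
        ("atomic_state_clear", ctypes.c_uint64),
        ("atomic_state_free", ctypes.c_uint64),
    ]


def fault_address(base: int, member: str) -> int:
    """Address touched when `member` is read through a funcs pointer equal to `base`."""
    return base + getattr(DrmModeConfigFuncs, member).offset


if __name__ == "__main__":
    print(hex(fault_address(0, "atomic_state_alloc")))
```

Under this assumed layout the !config->funcs test is the part of the guard that matters: config is the address of a member embedded in dev, so it is not NULL whenever dev itself is valid.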
#8 Updated by CD SYSTEM-赵正军 almost 3 years ago
- Status changed from New to RESOLVED
- % Done changed from 50 to 100
- Verified Version set to /Pre_figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220907/202209072155-1168
Tested the VB build for 4 hours, more than 200 reboot cycles, with no NULL pointer errors. The change was submitted as:
https://dev.thundercomm.com/gerrit/c/147583
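How convincing a run of clean reboots is depends on the failure rate before the fix. A rough sketch, assuming each reboot fails independently with a fixed probability (this ticket reports roughly 1 in 12 on the test side and 1 in 50 in local stress testing); the function names are illustrative.

```python
import math


def prob_no_failure(p_fail: float, reboots: int) -> float:
    """Chance of seeing zero failures in `reboots` tries if the bug were still present."""
    return (1.0 - p_fail) ** reboots


def reboots_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of reboots in which an unfixed bug would show up
    at least once with probability `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for rate in (1 / 12, 1 / 50):
        print(f"rate={rate:.4f} "
              f"P(200 clean)={prob_no_failure(rate, 200):.4g} "
              f"reboots for 95%={reboots_needed(rate, 0.95)}")
```

Under these assumptions, an unfixed 1-in-50 failure would pass 200 reboots unnoticed less than 2% of the time. The estimate says nothing about other, rarer failure modes with the same symptom.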
#9 Updated by CD SYSTEM-赵正军 almost 3 years ago
- Assignee changed from CD SYSTEM-赵正军 to SZTS_TEST 邹涛
- Verified Version changed from /Pre_figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220907/202209072155-1168 to cdiot /Pre_figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220907/202209072155-1168
Test team, please retest and confirm. Thanks.
#10 Updated by CD SYSTEM-赵正军 almost 3 years ago
- File 572119275.jpg added
Tested 345 reboots with the following build without any failure. Further confirmation on other boards is needed.
ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220909/202209092232-1206
See the attached minicom.zip for the test log.
#11 Updated by CD SYSTEM-赵正军 almost 3 years ago
- File minicom.zip added
#12 Updated by CD TPM-申艳艳 almost 3 years ago
- Category set to SYSTEM
#13 Updated by SZTS_TEST 邹涛 almost 3 years ago
ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220909/202209092232-1206
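Using the hardware with the modified gpio plus the software build above, the device entered 900e after running the reboot script for 2 hours.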
#14 Updated by CD SYSTEM-赵正军 almost 3 years ago
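Judging from the log Serial-COM119_20220914175848.log posted in the test Feishu group, the drm NULL pointer and slotselect errors seen at the start of this issue did not occur. The following error appeared instead: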
[19:27:44][ 80.889803] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[19:27:36][ 72.681214] cnss: fatal: Timeout waiting for FW ready indication
[19:27:44][ 80.876331] cnss: Timeout (40000ms) waiting for calibration to complete
[19:27:44][ 80.885980] cnss: ASSERT at line 2433
[19:27:44][ 80.889803] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[19:27:44][ 80.896066] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[19:27:44][ 80.901703] Modules linked in: wlan(O+) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
[19:27:44][ 80.946055] Process modprobe (pid: 1229, stack limit = 0xffffff8017768000)
[19:27:44][ 80.953111] CPU: 6 PID: 1229 Comm: modprobe Tainted: G S W O L 4.19.157+ #1
[19:27:44][ 80.961059] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b (DT)
[19:27:44][ 80.968469] pstate: 60400005 (nZCv daif +PAN -UAO)
[19:27:44][ 80.973394] pc : cnss_wlan_register_driver+0x2f0/0x2f8
[19:27:44][ 80.978670] lr : cnss_wlan_register_driver+0x2f0/0x2f8
[19:27:44][ 80.983945] sp : ffffff801776b9e0
Our change does not touch any wireless-related code. Test team, please file a separate bug for the LC team.
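Because several unrelated crashes end in the same symptom (the board does not come up after reboot), it helps to bucket serial logs by crash signature before attributing a failure to this ticket. A minimal sketch, assuming the serial log format shown in this ticket; the function name and the returned fields are illustrative.

```python
import re

NULL_DEREF = re.compile(
    r"Unable to handle kernel NULL pointer dereference "
    r"at virtual address ([0-9a-fA-F]+)")
KERNEL_BUG = re.compile(r"kernel BUG at (\S+):(\d+)!")
PC_SYMBOL = re.compile(r"\bpc : (\w+)\+0x")


def crash_signature(lines):
    """Return (kind, detail, pc_symbol) for the first crash found, or None."""
    kind = detail = None
    for line in lines:
        if kind is None:
            # Look for the first line that announces a crash.
            m = NULL_DEREF.search(line)
            if m:
                kind, detail = "null-deref", hex(int(m.group(1), 16))
                continue
            m = KERNEL_BUG.search(line)
            if m:
                kind, detail = "bug", f"{m.group(1)}:{m.group(2)}"
        else:
            # After the crash header, the "pc :" line names the faulting function.
            m = PC_SYMBOL.search(line)
            if m:
                return kind, detail, m.group(1)
    return (kind, detail, None) if kind else None
```

Applied to the two logs in this ticket, the first yields a null-deref at 0x30 in drm_atomic_state_alloc and the second a BUG at drivers/net/wireless/cnss2/pci.c:2433 in cnss_wlan_register_driver, which makes it clear that they are different failures.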
#15 Updated by CD SYSTEM-胡兵 almost 3 years ago
- Assignee changed from SZTS_TEST 邹涛 to CD SYSTEM-赵正军
- Target version set to 619
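The issue was triggered during the latest 200-cycle VB retest. The workaround took effect and the device rebooted. A ticket already exists for the related issue: https://share.thundersoft.com/redmine/issues/111881
The root cause will be addressed in V2.0.
Commit: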
https://dev.thundercomm.com/gerrit/c/general/platform/system/core/+/150066
Problem log:
[14:51:00][ 23.029657] Run /init as init process
[14:51:00][ 23.038763] init: init first stage started!
[14:51:00][ 23.043355] init: Unable to open /lib/modules, skipping module loading.
[14:51:00][ 23.051362] init: [libfs_mgr]dt_fstab: Skip disabled entry for partition vendor
[14:51:00][ 23.058968] init: [libfs_mgr]ReadFstabFromDt(): failed to read fstab from dt
[14:51:00][ 23.066637] init: [libfs_mgr]dt_fstab: Skip disabled entry for partition vendor
[14:51:00][ 23.074445] init: [libfs_mgr]GetFstabPath()fstab_path:/odm/etc/fstab.default
[14:51:00][ 23.081808] init: [libfs_mgr]GetFstabPath()fstab_path:/vendor/etc/fstab.default
[14:51:00][ 23.089378] init: [libfs_mgr]GetFstabPath()fstab_path:/fstab.default
[14:51:00][ 23.096201] init: [libfs_mgr]GetFstabPath()fstab_path:/odm/etc/fstab.qcom
[14:51:00][ 23.103771] init: [libfs_mgr]GetFstabPath()fstab_path:/vendor/etc/fstab.qcom
[14:51:00][ 23.123476] OF: graph: no port node found in /soc/qcom,dsi-display-primary
[14:51:00][ 23.130639] [drm] [msm-dsi-warn]: [hx82101+hx8692 video mode dsi truly panel] fallback to default te-pin-select
[14:51:00][ 23.141065] NPU_INFO: npu_reboot_handler: 760 Device is rebooting with code 1
[14:51:00][ 23.142548] msm_drm ae00000.qcom,mdss_mdp: Linked as a consumer to regulator.31
[14:51:00][ 23.156369] drmtest 116 drm_atomic_state_alloc
[14:51:00][ 23.156492] register_client_adhoc:find path.src 1 dest 590
[14:51:00][ 23.157769] register_client_adhoc:Client handle 16 mdss_reg
[14:51:00][ 23.160997] last close failed: -12
[14:51:00][ 23.177464] register_client_adhoc:find path.src 22 dest 512
[14:51:00][ 23.178714] register_client_adhoc:find path.src 23 dest 512
[14:51:00][ 23.185645] register_client_adhoc:Client handle 66 mdss_sde
[14:51:00][ 23.192179] [drm:sde_dbg_init:4676] evtlog_status: enable:11, panic:1, dump:2
[14:51:00][ 23.200303] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x41b08
[14:51:00][ 23.205788] msm_drm ae00000.qcom,mdss_mdp: bound soc:qcom,wb-display@0 (ops sde_wb_comp_ops)
[14:51:00][ 23.217353] reboot: Restarting system with command ''
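In the log above, the debug print "drmtest 116 drm_atomic_state_alloc" shows the new guard firing, and the "last close failed: -12" line that follows is a negated errno value reported by the caller. A small sketch for decoding such return codes, assuming Linux errno numbering; describe_kernel_error is an illustrative helper.

```python
import errno


def describe_kernel_error(ret: int) -> str:
    """Map a negative kernel-style return code to its errno name."""
    return errno.errorcode.get(-ret, f"unknown({ret})")


if __name__ == "__main__":
    print(describe_kernel_error(-12))
```

Here -12 decodes to ENOMEM, which is what a caller would typically report when an allocation function hands back NULL.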
#16 Updated by 物联网项目组-RD3_CDTS 周飞 almost 3 years ago
- Target version changed from 619 to VC1_FSE_B sample_20221015
#17 Updated by SZTS_TEST 邹涛 almost 3 years ago
ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220924/202209242316-1438
With the VB build above, all 487 reboots succeeded. The boot failure was not reproduced.
#18 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File Serial-COM119_20220927111935.rar added
#19 Updated by CD SYSTEM-赵正军 almost 3 years ago
- Assignee changed from CD SYSTEM-赵正军 to SZTS_TEST 邹涛
Test team, please spot-check a few DB builds to see whether the problem occurs.
#20 Updated by CDTS_TEST 王成 over 2 years ago
- Target version changed from VC1_FSE_B sample_20221015 to VX1_MCE_FSE_V2.0_update_20221012
The build delivered on 10/12 is the same as the 10/15 build.
#21 Updated by CD SYSTEM-赵正军 over 2 years ago
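Hi 邹涛,
Please run one more stress test with the latest DB and save the serial log. We need the log to confirm whether the originally reported problem still occurs.
Thanks!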
#22 Updated by CDTS_TEST 刘勇 over 2 years ago
[Serial log, not reproduced] Serial log from 1000 stress-test cycles on the 1011 UD build: https://thundersoft.feishu.cn/file/boxcnj5Oko6DqubzgWqFmNFQnve
#23 Updated by SZTS_TEST 邹涛 over 2 years ago
- File 重启285次未见异常_串口日志.rar added
Verified 285 reboots on the 1011 userdebug build from the release branch. The board showed no anomaly.
#24 Updated by CDTS_TEST 刘勇 over 2 years ago
[Not reproduced] 655 stress-test cycles on the 20221012 daily-build UD version did not reproduce the issue. Serial log: https://thundersoft.feishu.cn/file/boxcnGCvCR7kUs20EIg4Po97qhd
#25 Updated by CDTS_TEST 王成 over 2 years ago
- Target version changed from VX1_MCE_FSE_V2.0_update_20221012 to VX1_MCE_FSE_V3.0_20221030
Continuing to track. Based on the results of more than a thousand test cycles so far, the probability is extremely low.
#26 Updated by SZTS_TEST 邹涛 over 2 years ago
- File Serial-COM78_20221015000000.rar added
On an EVT2 board with the 1013 userdebug build, the board got stuck in 900e after more than 2400 reboots.
Dump file path: https://thundersoft.feishu.cn/file/boxcnKx4yzOYPN8ioaAjRY5FYVb?from=from_copylink
#27 Updated by CD SYSTEM-赵正军 over 2 years ago
[22:04:08][ 78.828557] cnss: ASSERT at line 2433
[22:04:08][ 78.832364] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[22:04:08][ 78.838610] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[22:04:08][ 78.844241] Modules linked in: wlan(O+) tsnv(O) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
[22:04:08][ 78.889300] Process modprobe (pid: 1206, stack limit = 0xffffff8016f50000)
[22:04:08][ 78.896360] CPU: 7 PID: 1206 Comm: modprobe Tainted: G S W O L 4.19.157+ #1
[22:04:08][ 78.904309] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b. VX1(MCE) EVT2 (DT)
[22:04:08][ 78.913052] pstate: 60400005 (nZCv daif +PAN -UAO)
[22:04:08][ 78.917984] pc : cnss_wlan_register_driver+0x2f0/0x2f8
[22:04:08][ 78.923255] lr : cnss_wlan_register_driver+0x2f0/0x2f8
Same bug as #111971
#28 Updated by CD SYSTEM-赵正军 over 2 years ago
- Fixed Version set to 2022-10-19
- Root cause set to https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/147583 drm_atomic_state_alloc does not check for a NULL pointer
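Hi 邹涛,
A failure to boot after reboot can have several causes. Across the builds tracked so far, and judging from the logs provided, the original cause of entering 900E in this ticket has not recurred.
Requesting to close this bug. For other cases of entering 900E, please file separate bugs; otherwise the bugs overlap and one bug ends up tracked in two tickets.
Thanks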
#30 Updated by SZTS_TEST 邹涛 over 2 years ago
- Status changed from VERIFIED to CLOSED
#31 Updated by CD FW-王伟 over 2 years ago
ID | Project | Branch | Uploader |
146958 | general/turbox/tools/build_script/platform | Pre_figure_turbox-c2130c-la1.1-vendor-dev | zhaozj0108@thundersoft.com |
SYSTEM:Add ELF file to meta package for ramdump analysis 1. Add ELF file to meta package TC-RID: 1200-0800301 IssueID: TS-R-BUG-111349 Change-Id: Icb1cf21989f56098cdff417200a284de7aab4f3b |
#32 Updated by CD FW-王伟 over 2 years ago
ID | Project | Branch | Uploader |
147583 | general/kernel/msm-4.19 | Pre_figure_turbox-c2130c-la1.1-vendor-dev | zhaozj0108@thundersoft.com |
SYSTEM:Prevent function parameters from passing in null pointers 1. if (!config || !config->funcs) return NULL; TC-RID: 1200-0800301 IssueID: TS-R-BUG-111349 Change-Id: Ib2b53b7538dc0ae1b38f51a93300ae966acbbde9 |