Bug #112149

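Test Test-IT #110867: V1.0 test function summary

Test Test-IT #110868: BSP-BVT test - power on/off test

[BSP][EVT][power][Low probability] After 228 MCU upgrade cycles, the board failed to boot normally and entered dump mode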

Added by SZTS_TEST 邹涛 almost 3 years ago. Updated over 2 years ago.

Status: CLOSED
Start date: 2022-09-26
Priority: High
Due date: 2023-01-18
Assignee: 移动测试一组_CDTS 刘强
% Done: 60%
Category: LC
Target version: VX1_MCE_FSE_V5.0_20221230
Need_Info: --
Found Version: userdebug build of the release branch dated 0924
Resolution: WONTFIX
Degrated: --
Severity: Critical
Verified Version:
Reproducibility: Occasionally
Fixed Version:
Test Type: Bring Up Test
Root cause:

Description

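Preconditions:
1. All DUT modules function normally and the DUT is powered on.

Steps:
1. Run the MCU cyclic upgrade script; after each upgrade completes, check the state of the board and the screen.

Actual result:
1. After 228 MCU upgrade cycles, the board failed to boot normally; the port shows 90E5 and the current draw is 348 mA.
2. adb is unusable and the board is in dump mode.

Expected result:
1. The board and the screen are in a normal state after every MCU upgrade.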
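
The reproduction procedure can be sketched as a loop that upgrades, checks the board, and records the first failing cycle. This is only a minimal illustration, not the actual test script: `run_upgrade_cycles`, `upgrade`, and `board_is_healthy` are hypothetical names, and the health check (adb reachable, screen on, and so on) is left to the caller.

```python
from typing import Callable, Optional


def run_upgrade_cycles(upgrade: Callable[[], None],
                       board_is_healthy: Callable[[], bool],
                       max_cycles: int) -> Optional[int]:
    """Run up to max_cycles upgrades.

    Returns the 1-based cycle at which the health check first failed,
    or None if every cycle passed. Both callables are hypothetical hooks
    supplied by the caller; nothing here talks to real hardware.
    """
    for cycle in range(1, max_cycles + 1):
        upgrade()
        if not board_is_healthy():
            # Stop immediately so the board stays in the failed state
            # and logs or a dump can be collected.
            return cycle
    return None


if __name__ == "__main__":
    # Simulated DUT that fails right after its 228th upgrade.
    state = {"count": 0}

    def fake_upgrade() -> None:
        state["count"] += 1

    def fake_health() -> bool:
        return state["count"] != 228

    print(run_upgrade_cycles(fake_upgrade, fake_health, 1000))
```

Stopping at the first failure, rather than continuing the loop, keeps the board in the dump state for log collection.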

Serial-COM119_20220924195645.rar (4.86 MB) SZTS_TEST 邹涛, 2022-09-26 17:29

Serial-COM78_20221017133324.rar (779 KB) SZTS_TEST 邹涛, 2022-10-18 10:45

Serial-COM78_20221022000000.rar (18.3 MB) SZTS_TEST 邹涛, 2022-10-24 09:59


Related issues

Related to Figure - Bug #111971: [VC1][BSP][EVT][power][Low probability] Reboot script run in a loop; after 2 h the board enters 900e and cannot ... CLOSED 2022-09-15 2023-03-10

History

#1 Updated by CD SYSTEM-赵正军 almost 3 years ago

  • Status changed from New to ASSIGNED
  • Assignee changed from CD SYSTEM-赵正军 to CD LC 陶洪普
386745 [04:39:40][  124.922091] [crtc_c][0x97b91ac5][00:14:09.220365] wlan: [782:E:QDF] BEFORE_UNLOCK: lock held too long (151720us)
386746 [04:39:40][  124.932526] QDF BUG in qdf_spin_unlock_irqrestore Line 438: Failed assertion '0'
386747 [04:39:40][  124.940144] ------------[ cut here ]------------
386748 [04:39:40][  124.944888] kernel BUG at ../../vendor/qcom/opensource/wlan/qcacld-3.0/.qca6390/./qca-wifi-host-cmn/qdf/linux/src/qdf_trace.c:3987!
386749 [04:39:40][  124.957022] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
386750 [04:39:40][  124.962661] Modules linked in: wlan(O) tsnv(O) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
386751 [04:39:40][  125.007644] Process crtc_commit:125 (pid: 782, stack limit = 0xffffff800df98000)
386752 [04:39:40][  125.015242] CPU: 5 PID: 782 Comm: crtc_commit:125 Tainted: G S      W  O      4.19.157+ #1
386753 [04:39:40][  125.023724] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b. VX1(MCE) EVT1 (DT)
386754 [04:39:40][  125.032468] pstate: 60c00085 (nZCv daIf +PAN +UAO)
386755 [04:39:40][  125.037499] pc : __qdf_bug+0xc/0x10 [wlan]
386756 [04:39:40][  125.041793] lr : wlan_hdd_pm_qos_notify+0x300/0x440 [wlan]

The log above shows a problem in wifi; LC, please analyze further.
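
When triaging long serial captures like the one above, a small filter can pull out the lines that matter. The sketch below assumes the two-bracket prefix seen in this excerpt (wall clock, then kernel uptime in seconds); `find_fatal_lines` and the `MARKERS` list are illustrative choices, not part of any existing tool.

```python
import re
from typing import Iterable, List, Tuple

# Matches "[HH:MM:SS][  uptime]" followed by the message, as in the excerpt.
LINE_RE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]\[\s*(\d+\.\d+)\]\s*(.*)")

# Substrings treated as interesting; this list is an assumption based on
# the messages quoted in this ticket.
MARKERS = ("kernel BUG at", "Internal error: Oops", "lock held too long")


def find_fatal_lines(lines: Iterable[str]) -> List[Tuple[float, str]]:
    """Return (kernel uptime in seconds, message) for lines containing a marker."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a timestamped kernel log line
        uptime, message = float(match.group(2)), match.group(3)
        if any(marker in message for marker in MARKERS):
            hits.append((uptime, message))
    return hits
```

The leading editor line numbers (for example `386745`) are ignored because the pattern is searched for anywhere in the line.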

#2 Updated by 物联网项目组-RD3_CDTS 周飞 almost 3 years ago

  • Target version changed from VX1_MCE_FSE_V2.0_20220930 to VX1_MCE_FSE_V2.0_update_20221012

#3 Updated by CD LC 陶洪普 over 2 years ago

#4 Updated by CDTS_TEST 王成 over 2 years ago

  • Target version changed from VX1_MCE_FSE_V2.0_update_20221012 to VX1_MCE_FSE_V3.0_20221030

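A case has been filed with Qualcomm and the fix depends on Qualcomm. This is a low-probability bug, so it is postponed to 10/30 for now.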

#5 Updated by CD LC 陶洪普 over 2 years ago

Qualcomm has requested the symbol table and a dump from a reproduction. Still trying to reproduce.

#6 Updated by SZTS_TEST 邹涛 over 2 years ago

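Verified with the 0930 userdebug build of the release branch: not reproduced after 178 MCU upgrade cycles. Next, an EVT2 board flashed with the latest build will be used to verify again.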

#7 Updated by SZTS_TEST 邹涛 over 2 years ago

Using an EVT2 board, after the reboot script ran 436 times the board entered dump mode and the screen backlight stayed off.
Dump file path: https://thundersoft.feishu.cn/file/boxcn7K6kReQaSRwiTFvhvxs1De

#9 Updated by SZTS_TEST 邹涛 over 2 years ago

With the 1014 userdebug build of the release branch, after 936 MCU upgrades the board failed to boot normally and entered dump mode.

#10 Updated by CDTS-TEST 周婷 over 2 years ago

  • Due date set to 2022-10-26

A conclusion on this bug is needed by 10/26.

#11 Updated by CDTS-TEST 周婷 over 2 years ago

  • Target version changed from VX1_MCE_FSE_V3.0_20221030 to VX1_MCE_FSE_V3.0_update_20221115

#12 Updated by CD LC 陶洪普 over 2 years ago

Update:
Not reproduced over multiple test runs (400 reboots each). Testing continues.

#13 Updated by CDTS-TEST 周婷 over 2 years ago

  • Due date changed from 2022-10-26 to 2022-11-18
  • Target version changed from VX1_MCE_FSE_V3.0_update_20221115 to VX1_MCE_FSE_V4.0_20221128

#14 Updated by CD LC 陶洪普 over 2 years ago

  • Due date changed from 2022-11-18 to 2022-11-22

#16 Updated by CD LC 陶洪普 over 2 years ago

  • Target version changed from VX1_MCE_FSE_V4.0_20221128 to MCE_V4.0update_20221215(原VC1_FSE_Vdec1219)

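The board enters 900E because transferring the FW and calibration files to the wifi DSP over PCIe timed out and failed, which triggers the assert flow.
We are currently disabling the assert trigger in order to reproduce the issue and determine exactly what is wrong with the PCIe link when it occurs.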

#17 Updated by CD LC 陶洪普 over 2 years ago

  • Due date changed from 2022-11-22 to 2022-11-30

Update:
[22:04:02][ 72.676116] cnss: fatal: Timeout waiting for FW ready indication
[22:04:08][ 78.820258] cnss: Timeout (40000ms) waiting for calibration to complete
[22:04:08][ 78.828557] cnss: ASSERT at line 2433
[22:04:08][ 78.832364] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
The direct cause is that the QMI_WLFW_FW_MEM_READY_IND_V01 ready indication was never received, so the wait timed out and the assert fired.
{
    .type = QMI_INDICATION,
    .msg_id = QMI_WLFW_FW_MEM_READY_IND_V01,
    .ei = wlfw_fw_mem_ready_ind_msg_v01_ei,
    .decoded_size = sizeof(struct wlfw_fw_mem_ready_ind_msg_v01),
    .fn = cnss_wlfw_fw_mem_ready_ind_cb
},
A debug patch has been submitted:
https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/165655
It removes the dump and exception that this condition triggers, so that when the problem occurs the PCI state can be compared against the WLAN FW-side logs.
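
The behaviour described in this update can be modelled in user space as a wait on an event with a timeout. The sketch below is not the cnss2 driver code: `wait_for_fw_ready`, `FirmwareReadyTimeout`, and the `assert_on_timeout` switch are hypothetical names. With the switch on, a missing indication raises (standing in for the kernel assert and dump); with it off, the caller gets `False` and can go on to inspect state, which is the intent attributed to the debug patch.

```python
import threading


class FirmwareReadyTimeout(Exception):
    """Stand-in for the kernel assert raised when the indication never arrives."""


def wait_for_fw_ready(ready: threading.Event,
                      timeout_s: float,
                      assert_on_timeout: bool = True) -> bool:
    """Wait for the firmware-ready event.

    Returns True if the event was set within timeout_s. On timeout,
    either raises (original behaviour in this model) or returns False
    so the caller can collect diagnostics (debug behaviour in this model).
    """
    if ready.wait(timeout_s):
        return True
    if assert_on_timeout:
        raise FirmwareReadyTimeout(
            f"no ready indication within {timeout_s} s")
    return False
```

In this model, a handler such as the `.fn` callback in the entry above would be the code that calls `ready.set()` when the indication arrives.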

#18 Updated by CD LC 陶洪普 over 2 years ago

Update:
As in the update posted on #112149, reproduction is in progress with the debug patch applied. Not reproduced after 432 runs; testing continues.
-----
[22:04:02][ 72.676116] cnss: fatal: Timeout waiting for FW ready indication
[22:04:08][ 78.820258] cnss: Timeout (40000ms) waiting for calibration to complete
[22:04:08][ 78.828557] cnss: ASSERT at line 2433
[22:04:08][ 78.832364] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
The direct cause is that the QMI_WLFW_FW_MEM_READY_IND_V01 ready indication was never received, so the wait timed out and the assert fired.
{
    .type = QMI_INDICATION,
    .msg_id = QMI_WLFW_FW_MEM_READY_IND_V01,
    .ei = wlfw_fw_mem_ready_ind_msg_v01_ei,
    .decoded_size = sizeof(struct wlfw_fw_mem_ready_ind_msg_v01),
    .fn = cnss_wlfw_fw_mem_ready_ind_cb
},
A debug patch has been submitted:
https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/165655
It removes the dump and exception that this condition triggers, so that when the problem occurs the PCI state can be compared against the WLAN FW-side logs.

#19 Updated by CD LC 陶洪普 over 2 years ago

Still trying to reproduce; 757 runs completed so far.

#20 Updated by CD LC 陶洪普 over 2 years ago

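Update:
Ran 1233 reboot cycles; retesting continues. (Difference from before: each reboot is issued only after confirming that wlan has loaded normally.)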

#21 Updated by CD LC 陶洪普 over 2 years ago

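Update:
Not reproduced in another 515 runs (after each boot, the device is rebooted once the wifi driver is confirmed loaded; each cycle takes about 1 minute).
If it cannot be reproduced today, we will revert to the original test method (rebooting while the device is still booting).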
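
The modified procedure (reboot only once the wifi driver is confirmed loaded) amounts to polling for a condition before each reboot. A minimal sketch follows, assuming hypothetical hooks `is_wlan_loaded` and `reboot` that the caller implements for the real device; the 60-second default timeout is an assumption loosely based on the "about 1 minute" figure above.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout_s: float,
               poll_s: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll condition until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


def reboot_after_wlan_loaded(is_wlan_loaded: Callable[[], bool],
                             reboot: Callable[[], None],
                             cycles: int,
                             timeout_s: float = 60.0,
                             poll_s: float = 1.0,
                             sleep: Callable[[float], None] = time.sleep) -> int:
    """Reboot up to `cycles` times, each time only after wlan is loaded.

    Returns the number of reboots issued. Stops early if wlan does not
    load within timeout_s, leaving the device untouched for inspection.
    """
    issued = 0
    for _ in range(cycles):
        if not wait_until(is_wlan_loaded, timeout_s, poll_s, sleep):
            break
        reboot()
        issued += 1
    return issued
```

The original method (rebooting during boot) would correspond to skipping the `wait_until` gate entirely.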

#22 Updated by CD LC 陶洪普 over 2 years ago

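Update:
With the updated test procedure (after each boot, reboot once the WiFi driver is confirmed loaded, roughly one minute after boot), the failure in this ticket was not reproduced.
However, a different PCI read/write exception was reproduced; its logs are being analyzed.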

#23 Updated by CD LC 陶洪普 over 2 years ago

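When rebooting only after confirming that the wlan driver has loaded normally, the symptom in this ticket has not been reproduced so far.
(One exception with a different cause occurred.)
Now reverting to the previous test method.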

#24 Updated by CD LC 陶洪普 over 2 years ago

  • Assignee changed from CD LC 陶洪普 to CDTS_TEST 刘勇

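Hi 刘勇,

The calibration file load times out because the FW did not send its ready message. Qualcomm has been asked to check the FW logs.
Also, the issue occurred once after more than 1600 repeated boot tests, so it is hard to reproduce.
Test team, please help reproduce it and capture wlan_logs, the serial log, and logcat.

Please help with the reproduction test and capture the logs above. Thanks!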

#25 Updated by CDTS-TEST 周婷 over 2 years ago

  • Assignee changed from CDTS_TEST 刘勇 to CD LC 陶洪普

If reproduction is needed, please go through the NEED_INFO status. Thanks!

#26 Updated by CD LC 陶洪普 over 2 years ago

  • Status changed from ASSIGNED to NEED_INFO
  • Assignee changed from CD LC 陶洪普 to CDTS_TEST 刘勇

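Hi 刘勇,

The calibration file load times out because the FW did not send its ready message. Qualcomm has been asked to check the FW logs.
Also, the issue occurred once after more than 1600 repeated boot tests, so it is hard to reproduce.
Test team, please help reproduce it and capture wlan_logs, the serial log, and logcat.

Please help with the reproduction test and capture the logs above. Thanks!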

#27 Updated by CDTS-TEST 周婷 over 2 years ago

  • Category changed from SYSTEM to LC

#28 Updated by CDTS-TEST 周婷 over 2 years ago

  • Target version changed from MCE_V4.0update_20221215(原VC1_FSE_Vdec1219) to VX1_MCE_FSE_V5.0_20221230

#29 Updated by CD TPM-王祥林 over 2 years ago

  • Due date changed from 2022-11-30 to 2023-01-18

#30 Updated by CDTS_TEST 王成 over 2 years ago

  • Assignee changed from CDTS_TEST 刘勇 to 移动测试一组_CDTS 刘强

Check with 陶洪普 and the BSP team whether this bug needs another round of stress testing.

#31 Updated by 移动测试一组_CDTS 刘强 over 2 years ago

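Stress test so far: not reproduced in 102 runs.

Steps:
Build 0068 (MCU 0.0.7) -> build 0070 (MCU 0.0.9): normal upgrade
Build 0070 (MCU 0.0.9) -> build 0068 (MCU 0.0.7): downgrade; data is erased once the downgrade completes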
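
The alternating upgrade and downgrade steps above can be written as a repeating plan. This is only an illustrative sketch: the build labels and MCU versions come from the steps, while the tuple layout and the `plan` helper are assumptions.

```python
from itertools import cycle, islice
from typing import List, Tuple

# (from build, to build, kind); labels taken from the steps above.
STEPS: List[Tuple[str, str, str]] = [
    ("0068", "0070", "upgrade"),    # MCU 0.0.7 -> MCU 0.0.9
    ("0070", "0068", "downgrade"),  # MCU 0.0.9 -> MCU 0.0.7, data erased afterwards
]


def plan(n: int) -> List[Tuple[str, str, str]]:
    """Return the first n steps of the alternating upgrade/downgrade sequence."""
    return list(islice(cycle(STEPS), n))
```

Because every downgrade erases data, any per-cycle state the test relies on would need to live off the device.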

#32 Updated by CD LC 陶洪普 over 2 years ago

  • Status changed from NEED_INFO to RESOLVED
  • % Done changed from 0 to 60

Hi 刘强,

Please refer to the analysis in #111971 and run this test on a user DB build, non-root.
https://share.thundersoft.com/redmine/issues/111971
Thanks!

#33 Updated by CD LC 陶洪普 over 2 years ago

  • Resolution changed from -- to WONTFIX

#34 Updated by CD LC 陶洪普 over 2 years ago

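Update:
The issue was reported on a userdebug build. Analysis pointed to interference from the Ethernet driver lan743x, and delaying the wifi driver load resolves the issue there.
On the user build, however, the issue has not been reproduced. Merging the delayed wifi driver load would cause tc-service to fail to turn on wifi (with the delay, the wifi driver has not finished loading when tc-service tries to turn wifi on).
In summary, the "delayed wifi driver load" change will not be merged.
Please test on a user, non-root build.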

#35 Updated by 移动测试一组_CDTS 刘强 over 2 years ago

  • Status changed from RESOLVED to VERIFIED

Verification result: https://thundersoft.feishu.cn/sheets/shtcn5ImqRt1IhgQPZiHG2kCNxe?sheet=NxPTfb
The MCU abnormal-scenario stress test showed no abnormalities, so this issue is being closed for now.

#36 Updated by 移动测试一组_CDTS 刘强 over 2 years ago

  • Status changed from VERIFIED to CLOSED
