Bug #111971
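Test Test-IT #110867: V1.0 test feature summary
Test Test-IT #110868: BSP-BVT test, power on/off test
[VC1][BSP][EVT][power][Low probability] Looping reboot script: after 2 h the board enters 900e, cannot boot normally, and the screen stays dark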
Status: | CLOSED | Start date: | 2022-09-15 |
---|---|---|---|
Priority: | High | Due date: | 2023-03-10 |
Assignee: | 移动测试一组_CDTS 刘强 | % Done: | 80% |
Category: | BSP | | |
Target version: | VC1_FSE_0082_20230314 | | |
Need_Info: | -- | Found Version: | VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220909/202209092232-1206 |
Resolution: | FIXED | Degraded: | -- |
Severity: | Critical | Verified Version: | |
Reproducibility: | Occasionally | Fixed Version: | |
Test Type: | Bring Up Test | Root cause: | Abnormal VC1 TP (touch panel) interrupt: the interrupt fires continuously during boot, so the system cannot respond in time, which causes a crash |
Description
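Preconditions:
1. All DUT modules function normally and the DUT is powered on
Steps:
1. Run a script that reboots in a loop and observe the board and screen state
Actual result:
1. After the script has run for 2 h, the board enters 900e and cannot boot normally; the screen is black
Expected result:
1. While the reboot script keeps running, the board does not enter 900e or any other abnormal state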
Related issues
History
#1 Updated by SZTS_TEST 邹涛 almost 3 years ago
#2 Updated by CD SYSTEM-赵正军 almost 3 years ago
- Assignee changed from CD SYSTEM-赵正军 to CD LC 陶洪普
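This issue was found during stress testing on the build that contains the fix for bug #111349.
In Serial-COM119_20220914175848.log, the DRM null-pointer and slot-select errors from the original issue no longer appear; the following errors appear instead: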
[19:27:44][ 80.889803] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[19:27:36][ 72.681214] cnss: fatal: Timeout waiting for FW ready indication
[19:27:44][ 80.876331] cnss: Timeout (40000ms) waiting for calibration to complete
[19:27:44][ 80.885980] cnss: ASSERT at line 2433
[19:27:44][ 80.889803] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[19:27:44][ 80.896066] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[19:27:44][ 80.901703] Modules linked in: wlan(O+) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
[19:27:44][ 80.946055] Process modprobe (pid: 1229, stack limit = 0xffffff8017768000)
[19:27:44][ 80.953111] CPU: 6 PID: 1229 Comm: modprobe Tainted: G S W O L 4.19.157+ #1
[19:27:44][ 80.961059] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b (DT)
[19:27:44][ 80.968469] pstate: 60400005 (nZCv daif +PAN -UAO)
[19:27:44][ 80.973394] pc : cnss_wlan_register_driver+0x2f0/0x2f8
[19:27:44][ 80.978670] lr : cnss_wlan_register_driver+0x2f0/0x2f8
[19:27:44][ 80.983945] sp : ffffff801776b9e0
However, our changes did not touch any wireless-related code. LC team, please analyze and follow up.
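The failure signature quoted above can be picked out of a serial capture automatically. The following is a minimal sketch, assuming the `[hh:mm:ss][ seconds.micros]` line prefix seen in this ticket's serial logs; the function name `find_cnss_failures` and the tag names are illustrative and not part of any existing tool.

```python
import re

# Substrings copied from the serial log quoted in this ticket.
SIGNATURES = {
    "fw_ready_timeout": "cnss: fatal: Timeout waiting for FW ready indication",
    "calibration_timeout": "waiting for calibration to complete",
    "cnss_assert": "cnss: ASSERT at line",
    "kernel_bug": "kernel BUG at drivers/net/wireless/cnss2/pci.c",
}

# Matches the "[hh:mm:ss][ seconds.micros] message" prefix of the serial capture.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_cnss_failures(lines):
    """Return (kernel_time, tag, message) tuples sorted by kernel time."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a timestamped kernel line
        _, ktime, msg = match.groups()
        for tag, needle in SIGNATURES.items():
            if needle in msg:
                hits.append((float(ktime), tag, msg))
    return sorted(hits)
```

Sorting by kernel time puts the FW-ready timeout (about 72.68 s) ahead of the calibration timeout and the assert (about 80.88 s), which is the order in which the failure actually unfolds.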
#3 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File Serial-COM119_20220915202308.log added
Entered 900e a second time. Dump file: https://thundersoft.feishu.cn/file/boxcnZltO33U5m75HmJAlyOi58b
Reproduction rate: 1/22
#4 Updated by CD LC 陶洪普 almost 3 years ago
- Status changed from New to NEED_INFO
- Assignee changed from CD LC 陶洪普 to SZTS_TEST 邹涛
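Hi,
Please provide the script used for the repeated reboots. Thanks!
We need to confirm whether each reboot is issued only once the device is ready.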
#5 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File mem.bat added
#6 Updated by SZTS_TEST 邹涛 almost 3 years ago
- Assignee changed from SZTS_TEST 邹涛 to CD LC 陶洪普
#7 Updated by CDTS_TEST 王成 almost 3 years ago
- Status changed from NEED_INFO to ASSIGNED
#8 Updated by SZTS_TEST 邹涛 almost 3 years ago
- Target version set to VX1_MCE_FSE_V2.0_20220930
#9 Updated by SZTS_TEST 邹涛 almost 3 years ago
- Assignee changed from CD LC 陶洪普 to CD LC-徐玺栋
#10 Updated by SZTS_TEST 邹涛 almost 3 years ago
- File Serial-COM119_20220926195358.rar added
Build: ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20220924/202209242316-1438
With the build above, after 368 MCU upgrade cycles the board entered 900e: black screen, current draw 354 mA, adb unavailable
Dump file: https://thundersoft.feishu.cn/file/boxcnMcQ7VmCkhF2jn3qj9sUQlg
#11 Updated by 物联网项目组-RD3_CDTS 周飞 almost 3 years ago
- Target version changed from VX1_MCE_FSE_V2.0_20220930 to VX1_MCE_FSE_V2.0_update_20221012
#12 Updated by CD LC 陶洪普 almost 3 years ago
- Assignee changed from CD LC-徐玺栋 to CD LC 陶洪普
#13 Updated by CD LC 陶洪普 over 2 years ago
- Status changed from ASSIGNED to NEED_INFO
- Assignee changed from CD LC 陶洪普 to CD SYSTEM-赵正军
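Hi 正军,
"This issue was found during stress testing on the build that contains the fix for bug #111349."
What changes were made for #111349 at that time?
111349
With the latest #111349 changes, retesting shows no abnormal behavior. The symptom described in this case occurred on the VB (VerifyBuild) used while debugging #111349 and has not appeared on an actual DB (DailyBuild). Is that understanding correct?
The direct cause is a timeout waiting for calibration to complete while the calibration file is transferred to the DSP over PCIe.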
[19:27:36][ 72.681214] cnss: fatal: Timeout waiting for FW ready indication
[19:27:44][ 80.876331] cnss: Timeout (40000ms) waiting for calibration to complete
[19:27:44][ 80.885980] cnss: ASSERT at line 2433
After the #111349 fix, this symptom was not seen in retesting. Is that understanding correct?
BR!
#14 Updated by CD SYSTEM-赵正军 over 2 years ago
- Assignee changed from CD SYSTEM-赵正军 to CD LC 陶洪普
Hi 洪普,
The fix for bug #111349 consists of two changes:
https://dev.thundercomm.com/gerrit/c/general/platform/system/core/+/150066
https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/147583
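Please confirm with the test team whether the issue also occurs on DB; the report I received was against the VB.
Bug #111971 may have been introduced by those changes, but LC itself may also have a problem.
Judging from the two commits, it is unlikely that the changes introduced it, which is why this separate bug was opened.
If the problem lies in LC itself, #111971 could not have been detected before #111349 was fixed, because #111349 would be hit first during reboot.
The #111349 changes have been merged; please check with the test team whether subsequent DBs have been stress-tested for #111971.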
BR!
#15 Updated by CD LC 陶洪普 over 2 years ago
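The direct cause is currently a timeout while waiting for the calibration file to be transferred to the DSP. A Qualcomm case has been opened to track it; the FW side is closed source.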
------------------
[19:27:36][ 72.681214] cnss: fatal: Timeout waiting for FW ready indication
[19:27:44][ 80.876331] cnss: Timeout (40000ms) waiting for calibration to complete
[19:27:44][ 80.885980] cnss: ASSERT at line 2433
[19:27:44][ 80.889803] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
[19:27:44][ 80.896066] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
------------------
Qualcomm has asked for dump logs and the symbol table; we are reproducing the issue to collect them.
#16 Updated by CDTS_TEST 王成 over 2 years ago
- Target version changed from VX1_MCE_FSE_V2.0_update_20221012 to VX1_MCE_FSE_V3.0_20221030
A Qualcomm case has been filed; resolution depends on Qualcomm.
#17 Updated by SZTS_TEST 邹涛 over 2 years ago
Using the 10/11 userdebug build from the release branch, the board entered 900e after multiple reboots.
Log file: https://thundersoft.feishu.cn/file/boxcnVxqnnkw6M3lGdHisXYSwXn
#18 Updated by SZTS_TEST 邹涛 over 2 years ago
- File Serial-COM78_20221015000000.rar added
On an EVT2 board with the 10/13 userdebug build, the board got stuck in 900e after more than 2,400 reboots
Dump file: https://thundersoft.feishu.cn/file/boxcnKx4yzOYPN8ioaAjRY5FYVb?from=from_copylink
#19 Updated by SZTS_TEST 邹涛 over 2 years ago
- Status changed from NEED_INFO to ASSIGNED
#20 Updated by CD LC 陶洪普 over 2 years ago
Reviewing the updated logs.
#21 Updated by CD LC 陶洪普 over 2 years ago
"Using the 10/11 userdebug build from the release branch, the board entered 900e after multiple reboots."
Log file: https://thundersoft.feishu.cn/file/boxcnVxqnnkw6M3lGdHisXYSwXn
Parsing the dump with the symbol table.
#22 Updated by CD LC 陶洪普 over 2 years ago
The META files are missing, so the dump cannot be parsed. Rebuilt a version with META.
#23 Updated by CDTS_TEST 王成 over 2 years ago
- Target version changed from VX1_MCE_FSE_V3.0_20221030 to VX1_MCE_FSE_V3.0_update_20221115
Re-verify after switching the baseline
#24 Updated by CD LC 陶洪普 over 2 years ago
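2022-11-01 test status: 830 reboots, not reproduced
Build: ftp://dvbuild@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20221030/202210300041-2035
(A VB with no patches; only META and Enter BP-CODE were selected for the build)
Test conditions: bare board, fan cooling
No other test conditions changed.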
#25 Updated by CD LC 陶洪普 over 2 years ago
- Due date set to 2022-11-22
Built a version with META; stress testing to reproduce. Internal due date set to 11/22 for priority handling.
#26 Updated by CDTS-TEST 周婷 over 2 years ago
- Target version changed from VX1_MCE_FSE_V3.0_update_20221115 to VX1_MCE_FSE_V4.0_20221128
#27 Updated by CD LC 陶洪普 over 2 years ago
Update:
ftp://dvbuild@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20221117/202211171905-2465
Using the VB built above (the only addition is the META build),
the repeated-boot test hit
Slot _a is unbootable, trying alternate slot
which put the device into fastboot, so the stress test could not continue.
Attachment (test result): VB-带META吧编译-没有额外patch.zip
Switched to ftp://cdiot@192.168.87.46/Pre_figure/ReleaseBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/NULL/20221117/0110 for testing
No abnormality after 200 reboots so far. Testing continues.
#28 Updated by CD LC 陶洪普 over 2 years ago
- File VB-带META吧编译-没有额外patch.zip added
#29 Updated by CD LC 陶洪普 over 2 years ago
- Target version changed from VX1_MCE_FSE_V4.0_20221128 to MCE_V4.0update_20221215(原VC1_FSE_Vdec1219)
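The board enters 900E because transferring the FW and calibration files to the Wi-Fi DSP over PCIe times out and fails,
which triggers the assert path.
We are now disabling the assert so that we can reproduce the issue and determine exactly what is wrong with the PCIe link when it occurs.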
#30 Updated by CD LC 陶洪普 over 2 years ago
[22:04:02][ 72.676116] cnss: fatal: Timeout waiting for FW ready indication
[22:04:08][ 78.820258] cnss: Timeout (40000ms) waiting for calibration to complete
[22:04:08][ 78.828557] cnss: ASSERT at line 2433
[22:04:08][ 78.832364] kernel BUG at drivers/net/wireless/cnss2/pci.c:2433!
The direct cause is that the QMI_WLFW_FW_MEM_READY_IND_V01 ready indication was never received, so its handler below was never triggered.
{
.type = QMI_INDICATION,
.msg_id = QMI_WLFW_FW_MEM_READY_IND_V01,
.ei = wlfw_fw_mem_ready_ind_msg_v01_ei,
.decoded_size = sizeof(struct wlfw_fw_mem_ready_ind_msg_v01),
.fn = cnss_wlfw_fw_mem_ready_ind_cb
},
Submitted a debug patch:
https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/165655
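It removes the dump/exception path taken when the issue occurs, so that the PCI state and the WLAN FW logs can be compared while the device is in the failed state.
So far: 663 reboot cycles, not reproduced.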
#31 Updated by CD LC 陶洪普 over 2 years ago
Still trying to reproduce. 757 cycles run so far. (The USB cable was accidentally disconnected once.)
#32 Updated by CD LC 陶洪普 over 2 years ago
Update:
1,233 reboot cycles run; retesting continues.
#33 Updated by CD LC 陶洪普 over 2 years ago
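Update:
Another 515 cycles without reproducing (after each boot, reboot once the Wi-Fi driver is confirmed loaded; each cycle takes about 1 minute)
If it cannot be reproduced today, revert to the original test method (reboot while the device is still booting)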
#34 Updated by CD LC 陶洪普 over 2 years ago
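Update:
With the updated test flow (reboot only after the Wi-Fi driver is confirmed loaded, about one minute after each boot), the fault in this ticket did not reproduce.
However, a different PCI read/write exception was reproduced; its logs are being analyzed.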
#35 Updated by CD LC 陶洪普 over 2 years ago
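When rebooting only after the wlan driver is confirmed loaded, the symptom in this ticket has not reproduced so far.
(One exception with a different cause occurred.)
Now reverting to the previous test method.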
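The two test methods compared above (reboot only after the wlan driver has loaded, versus reboot as soon as the device answers) can be sketched as one loop body with a switch. This is not the attached mem.bat, whose contents are not shown in this ticket; it is a hypothetical host-side sketch. The helper names `adb`, `wlan_loaded`, and `reboot_cycle` are made up for illustration, and the assumption that the driver shows up as module `wlan` in `lsmod` is taken from the "Modules linked in: wlan(O+)" line in the crash log.

```python
import subprocess
import time


def adb(*args, timeout=60):
    """Run an adb command and return its stdout (empty string on failure)."""
    try:
        out = subprocess.run(["adb", *args], capture_output=True,
                             text=True, timeout=timeout)
        return out.stdout
    except (subprocess.TimeoutExpired, OSError):
        return ""


def wlan_loaded(run=adb):
    """Assumption: the WLAN driver is listed as module 'wlan' by lsmod."""
    return any(line.split()[:1] == ["wlan"]
               for line in run("shell", "lsmod").splitlines())


def reboot_cycle(run=adb, wait_for_wlan=True, boot_timeout_s=120,
                 poll_s=2, sleep=time.sleep, clock=time.monotonic):
    """Do one reboot; return True if the device came back (and WLAN loaded)."""
    run("reboot")
    deadline = clock() + boot_timeout_s
    # Phase 1: wait for the device to drop off the bus, so the pre-reboot
    # state is not mistaken for a successful boot.
    while clock() < deadline and run("get-state").strip() == "device":
        sleep(poll_s)
    # Phase 2: wait for the device to come back, optionally with WLAN loaded.
    while clock() < deadline:
        if run("get-state").strip() == "device":
            if not wait_for_wlan or wlan_loaded(run):
                return True
        sleep(poll_s)
    return False  # possible hang, for example a board stuck in 900e
```

Calling `reboot_cycle(wait_for_wlan=False)` in a loop approximates the original method, which can reboot while drivers are still loading; `wait_for_wlan=True` approximates the modified method that did not reproduce the fault.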
#36 Updated by CD LC 陶洪普 over 2 years ago
During the reboot stress test (repeatedly restarting the device), the following exceptions occur intermittently.
------------------------------------------------------
[ 71.148931] [MAX975X:max96755g_write_bytes] ERROR i2c master send 0x8E err
[ 72.672569] cnss: fatal: Timeout waiting for FW ready indication
[ 72.960938] i2c_geni 980000.i2c: i2c error :-107
[ 72.965750] [MAX975X:max96755g_write_bytes] ERROR i2c master send 0x8E err
[ 74.784942] i2c_geni 980000.i2c: i2c error :-107
[ 74.797016] [MAX975X:max96755g_write_bytes] ERROR i2c master send 0x8E err
[ 76.608920] i2c_geni 980000.i2c: i2c error :-107
[ 76.621023] [MAX975X:max96755g_write_bytes] ERROR i2c master send 0x8E err
[ 78.432936] i2c_geni 980000.i2c: i2c error :-107
[ 78.445054] [MAX975X:max96755g_write_bytes] ERROR i2c master send 0x8E err
[ 78.816122] cnss: Timeout (40000ms) waiting for calibration to complete
If "Timeout (40000ms) waiting for calibration to complete" occurs, the device deliberately raises an exception (the assert).
Before this, "cnss: fatal: Timeout waiting for FW ready indication" also occurs.
Because the FW is closed source, Qualcomm has been asked to help check the FW ready path.
#37 Updated by CD LC 陶洪普 over 2 years ago
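The calibration file download timed out because the FW never sent its ready message. Qualcomm has been asked to check the FW logs.
Also, the issue occurred only once in more than 1,600 repeated boots, so it is hard to reproduce.
Test team, please help reproduce it and capture wlan_logs, serial logs, and logcat.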
#38 Updated by CDTS-TEST 周婷 over 2 years ago
If you depend on the test team, please transition the bug status correctly. Thanks.
#39 Updated by CD LC 陶洪普 over 2 years ago
- File 20221213-logs.zip added
Update:
Reproduced once; sent 20221213-logs.zip to Qualcomm for parallel analysis.
#40 Updated by CDTS-TEST 周婷 over 2 years ago
- Target version changed from MCE_V4.0update_20221215(原VC1_FSE_Vdec1219) to VX1_MCE_FSE_V5.0_20221230
#41 Updated by CD LC 陶洪普 over 2 years ago
Reproducing with a locally built version that includes the ELF files below. Once reproduced, the dump will be parsed using these ELF files.
LINUX/android/out/target/product/figure/obj/kernel/msm-4.19/vmlinux
LINUX/android/out/target/product/figure/obj/vendor/qcom/opensource/wlan/qcacld-3.0/.qca6490/qca_cld3_qca6490.ko
LINUX/android/out/target/product/figure/obj/vendor/qcom/opensource/wlan/qcacld-3.0/.qca6390/qca_cld3_qca6390.ko
wlan_proc/core/rombin/v1/core_rom_v1_link_patched/build/6490.wlanfw.eval/CORE_ROM_V1_LINK_PATCHED_6490.wlanfw.evalQ_link.elf
wlan_proc/config/bsp/cnss_ram_v1_TO_link_patched/build/6490.wlanfw.eval_v1_TO_xr/CNSS_RAM_V1_TO_LINK_PATCHED_6490.wlanfw.eval_v1_TO_xrQ_link.elf
wlan_proc/config/bsp/cnss_ram_v1_TO_link_patched/build/6490.wlanfw.eval_v1_TO_xr/Data.msc
wlan_proc/scripts/myps/qshrink/msg_hash.txt
wlan_proc/wlan/subsys/phyucode_binary/hsp/pcss_m3_tx_rx.axf
#42 Updated by CD LC 陶洪普 over 2 years ago
The FW transfer did not complete correctly. Opened another case, 06411661, asking Qualcomm BSP to analyze it jointly.
FW download is stuck
0x0000000045E42FE8 | 61.071999: mhi_sm_set_trig_pending: 8
0x0000000045E43030 | 61.072003: mhi_core_isr: MHI_SYS_ERR triggered by Host
On the host side, a tasklet is pending on CPU0; a case will be cloned to the BSP team to check further.
-----------irq_stat -------------
core 0: __softirq_pending = 0x50 ['BLOCK_SOFTIRQ', 'TASKLET_SOFTIRQ']
core 1: __softirq_pending = 0x0 []
core 2: __softirq_pending = 0x0 []
core 3: __softirq_pending = 0x0 []
core 4: __softirq_pending = 0x0 []
core 5: __softirq_pending = 0x0 []
core 6: __softirq_pending = 0x0 []
core 7: __softirq_pending = 0x0 []
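The `__softirq_pending` value in the dump above is a bitmask. As a small worked example, the sketch below decodes it using the softirq ordering from the Linux 4.19 enum in `include/linux/interrupt.h`; the function name `decode_softirq_pending` is illustrative.

```python
# Bit positions follow the softirq enum order in Linux 4.19
# (include/linux/interrupt.h): bit 0 is HI_SOFTIRQ, bit 9 is RCU_SOFTIRQ.
SOFTIRQ_NAMES = [
    "HI_SOFTIRQ", "TIMER_SOFTIRQ", "NET_TX_SOFTIRQ", "NET_RX_SOFTIRQ",
    "BLOCK_SOFTIRQ", "IRQ_POLL_SOFTIRQ", "TASKLET_SOFTIRQ", "SCHED_SOFTIRQ",
    "HRTIMER_SOFTIRQ", "RCU_SOFTIRQ",
]


def decode_softirq_pending(mask):
    """Return the names of the softirqs whose bits are set in `mask`."""
    return [name for bit, name in enumerate(SOFTIRQ_NAMES) if mask & (1 << bit)]


print(decode_softirq_pending(0x50))  # ['BLOCK_SOFTIRQ', 'TASKLET_SOFTIRQ']
```

0x50 is binary 101 0000, so bits 4 and 6 are set, matching the `BLOCK_SOFTIRQ` and `TASKLET_SOFTIRQ` entries the dump tool printed for core 0.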
#43 Updated by CD LC 陶洪普 over 2 years ago
Analysis:
1. The process/thread running on core 0
Task name PID Exec_Started_at Last_Queued_at Total_wait_time No_of_times_exec Prio State Last_enqueued_ts Last_sleep_ts Last runtime(msec)
ifconfig 1363 39.468297463 0.000000000 0.074291511 28 120 R 39.448432848 39.413550035 0.000000000
2.
3f95308 [79.280795] [1650923141] : LOGK_READL from address ffffff800c3f2790(40300790) called from addr ffffff95360d7ab8 lan743x_intr_entry_isr[jt] Line 711 of \"/home/xuxidong/code/Figure/turbox_c2130c_vendor/LINUX/android/kernel/msm-4.19/include/asm-generic/io.h\"
3f95310 [79.280797] [1650923192] : LOGK_WRITEL from address ffffff800c3f2798(40300798) called from addr ffffff95360d7b78 lan743x_intr_entry_isr[jt] Line 745 of \"/home/xuxidong/code/Figure/turbox_c2130c_vendor/LINUX/android/kernel/msm-4.19/include/asm-generic/io.h\"
3f95318 [79.280798] [1650923205] : LOGK_READL from address ffffff800c3f2780(40300780) called from addr ffffff95360d7594 lan743x_intr_shared_isr[jt] Line 711 of \"/home/xuxidong/code/Figure/turbox_c2130c_vendor/LINUX/android/kernel/msm-4.19/include/asm-generic/io.h\"
3f95320 [79.280800] [1650923255] : LOGK_WRITEL from address ffffff800c3f2780(40300780) called from addr ffffff95360d75e0 lan743x_intr_shared_isr[jt] Line 745 of \"/home/xuxidong/code/Figure/turbox_c2130c_vendor/LINUX/android/kernel/msm-4.19/include/asm-generic/io.h\"
3f95328 [79.280801] [1650923268] : LOGK_WRITEL from address ffffff800c3f2794(40300794) called from addr ffffff95360d7d18 lan743x_intr_entry_isr[jt] Line 745 of \"/home/xuxidong/code/Figure/turbox_c2130c_vendor/LINUX/android/kernel/msm-4.19/include/asm-generic/io.h\"
-----------
#44 Updated by CD LC 陶洪普 over 2 years ago
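Update:
"Same as before, core0 is busy processing lan743x interrupts. Can you disable this module and then test again?"
lan743x is the Ethernet driver. The direct cause appears to be that it occupies core0 and blocks the Wi-Fi driver process.
- Trying to temporarily remove lan743x to confirm whether it is the factor
- Also asking the test team to retest on a device with Ethernet attached.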
#45 Updated by 物联网项目组-RD3_CDTS 周飞 over 2 years ago
- Target version changed from VX1_MCE_FSE_V5.0_20221230 to VX1_MCE_FSE_C sample_20230113
Code to be merged on January 10 @陶洪普
#46 Updated by CD LC 陶洪普 over 2 years ago
Update:
On top of patch https://dev.thundercomm.com/gerrit/c/general/clo/la/kernel/msm-4.19/+/171865,
the change comments out the LAN743X Ethernet driver (with these lines commented out, the Eth driver is not loaded):
#CONFIG_MARVELL_88Q2112=y
#CONFIG_GENERIC_PHY=y
#CONFIG_PHYLIB=y
#CONFIG_LAN743X=y
-----------------------
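Tested two boards for 54 hours; neither entered 900e.
----
Board 4160580E: test started 2022-12-23 13:00 and ran until 12-27 22:00, 106 h in total
Board 20AA3BFE: test started 2022-12-23 13:00 and ran until 12-25 20:00, 55 h in total (power was unplugged partway through)
Therefore, the Wi-Fi driver failure is related to some extent to Ethernet blocking core0.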
#47 Updated by CDTS_TEST 刘勇 over 2 years ago
Using a complete EVT2-VX1 unit connected to an Ethernet box and a router, with build FlatBuild_HH_VX1_MCE_FSE.M.D.userdebug.01.00.X101.202212231033, tested for 11 hours without entering 900e mode
#48 Updated by CD LC 陶洪普 over 2 years ago
Update:
Tried extending the timer that waits for the QMI_WLFW_FW_READY_IND_V01 message (with the Ethernet driver restored). Retesting on a board with no Ethernet adapter connected.
http://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/173203
#49 Updated by CD LC 陶洪普 over 2 years ago
Delayed the start of wlan driver loading by 5 seconds relative to its original point in the boot stage. Patches submitted:
https://dev.thundercomm.com/gerrit/c/general/platform/vendor/qcom/kona/+/175730
https://dev.thundercomm.com/gerrit/c/general/device/qcom/sepolicy_vndr/+/175812
Stress testing in progress
#50 Updated by CDTS-TEST 周婷 over 2 years ago
- Target version deleted (VX1_MCE_FSE_C sample_20230113)
Stress testing is still running and cannot make today's build merge; a conclusion is needed by 11/22 at the latest
#51 Updated by CD TPM-王祥林 over 2 years ago
- Due date changed from 2022-11-22 to 2023-01-18
#52 Updated by 物联网测试组_CDTS 段小刚 over 2 years ago
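Verification build: FlatBuild_HH_VX1_MCE_FSE.M.D.user.01.00.X101.202301112125
Device: VX1
With Ethernet connected to a router, ran the reboot script 1,961 times without reproducing
Log: the corresponding tlog is too large to upload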
#53 Updated by CD LC 陶洪普 over 2 years ago
--Previous comment------------------------------------------
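The change comments out the LAN743X Ethernet driver (with these lines commented out, the Eth driver is not loaded):
#CONFIG_MARVELL_88Q2112=y
#CONFIG_GENERIC_PHY=y
#CONFIG_PHYLIB=y
#CONFIG_LAN743X=y
-----------------------
Tested two boards for 54 hours; neither entered 900e.
----
Board 4160580E: test started 2022-12-23 13:00 and ran until 12-27 22:00, 106 h in total
------------------------------------------------------------
At the time of the failure, lan743x was occupying core0: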
[22:04:10][ 80.323638] CPU: 0 PID: 1326 Comm: ifconfig Tainted: G S D W O L 4.19.157+ #1
[22:04:10][ 80.331586] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b. VX1 EVT2 (DT)
[22:04:10][ 80.340341] Call trace:
[22:04:10][ 80.342863] dump_backtrace+0x0/0x258
[22:04:10][ 80.346627] show_stack+0x20/0x30
[22:04:10][ 80.350035] dump_stack+0xd8/0x12c
[22:04:10][ 80.353532] ipi_cpu_stop+0xac/0xc0
[22:04:10][ 80.357119] ipi_cpu_stop+0x0/0xc0
[22:04:10][ 80.360615] gic_handle_irq+0x124/0x1d8
[22:04:10][ 80.364559] el1_irq+0xec/0x198
[22:04:10][ 80.367797] __do_softirq+0x118/0x554
[22:04:10][ 80.371566] irq_exit+0xfc/0x120
[22:04:10][ 80.374888] __handle_domain_irq+0xa8/0xf8
[22:04:10][ 80.379101] gic_handle_irq+0x160/0x1d8
[22:04:10][ 80.383045] el1_irq+0xec/0x198
[22:04:10][ 80.386286] ktime_get+0x84/0x110
[22:04:10][ 80.389695] usleep_range+0x30/0x90
[22:04:10][ 80.393285] lan743x_netdev_open+0x714/0x2040
[22:04:10][ 80.397768] __dev_open+0xcc/0x190
[22:04:10][ 80.401267] __dev_change_flags+0x9c/0x1c0
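On the user build, delaying Wi-Fi driver loading by 5 seconds causes tcservice to fail to enable Wi-Fi (tcservice starts in the service stage, before the Wi-Fi driver has finished loading)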
01-11 19:44:46.719 1520 2118 I WifiService: setWifiEnabled package=com.thundercomm.tcservice uid=1000 enable=true
01-11 19:44:46.720 1520 1845 E WifiSelfRecovery: Triggering recovery for reason: WifiNative Failure
01-11 19:44:46.720 1520 1845 E WifiSelfRecovery: Restarting wifi for reason: WifiNative Failure
01-11 19:44:46.721 1520 1845 I WifiNative: Vendor HAL died. Cleaning up internal state.
01-11 19:44:47.113 1520 1591 I uc_send_user_broadcast: [0,android.intent.action.LOCKED_BOOT_COMPLETED]
01-11 19:44:47.216 2682 2682 I AlarmClock: AlarmInitReceiver android.intent.action.LOCKED_BOOT_COMPLETED
01-11 19:44:47.284 1520 1609 I ActivityManager: Posting BOOT_COMPLETED user #0
01-11 19:44:47.293 1520 1587 I uc_send_user_broadcast: [0,android.intent.action.BOOT_COMPLETED]
01-11 19:44:47.563 2584 2584 I QCCAuthMgr: android.intent.action.LOCKED_BOOT_COMPLETED intent received
01-11 19:44:47.727 1520 1845 E WifiNative: Failed to start vendor HAL
01-11 19:44:47.728 1520 1845 E WifiNative: Failed to start Hal
01-11 19:44:47.735 1520 1845 E WifiSelfRecovery: Triggering recovery for reason: WifiNative Failure
01-11 19:44:47.735 1520 1845 E WifiSelfRecovery: Restarting wifi for reason: WifiNative Failure
01-11 19:44:47.782 1520 1845 I WifiNative: Vendor HAL died. Cleaning up internal state.
01-11 19:44:47.790 1520 1845 E WifiSelfRecovery: Triggering recovery for reason: WifiNative Failure
On the DB, lan743x reaches its ready state ("Link is Down") on time and does not block Wi-Fi from receiving QMI messages.
01-01 17:53:22.493 0 0 I libphy : lan743x-mdiobus-ext: probed
01-01 17:53:28.534 0 0 D cnss : Calibration done is 0
01-01 17:53:29.107 0 0 E wlan : Loading driver v5.2.022.10G; 2023-02-05T02:53:06Z; cld:; cmn:;
01-01 17:53:29.120 0 0 D cnss : Start to wait for calibration to complete
01-01 17:53:29.271 0 0 I lan743x 0001: 01:00.0 eth0: using MSIX interrupts, number of vectors = 6
01-01 17:53:29.291 0 0 I lan743x 0001: 01:00.0 eth0: Link is Down
01-01 17:53:28.462 0 0 I lan743x 0001: 01:00.0 eth0: successfully registered ptp clock
01-01 17:53:32.148 0 0 D cnss : Received QMI WLFW calibration done indication
01-01 17:53:32.148 0 0 D cnss : Calibration completed successfully
----
The reboot test has run 1,806 times on build ftp://cdiot@192.168.87.46/Pre_figure/DailyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/user/20230205/0014/FlatBuild_HH_VX1_MCE_FSE.M.D.user.01.00.X101.202302050304.zip
Therefore the delay will not be merged. Please run this test on a user DB build, non-root.
#54 Updated by CD LC 陶洪普 over 2 years ago
- Status changed from ASSIGNED to RESOLVED
- Assignee changed from CD LC 陶洪普 to 移动测试一组_CDTS 刘强
- % Done changed from 0 to 60
Hi 刘强,
Based on the analysis above, please run this test on a user DB build, non-root.
Thanks!
#55 Updated by CD LC 陶洪普 over 2 years ago
- Resolution changed from -- to WONTFIX
#56 Updated by CD LC 陶洪普 over 2 years ago
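Update:
The issue was reported on a userdebug build. Analysis pointed to the Ethernet driver lan743x as the cause, and delaying Wi-Fi driver loading resolves it there.
On the user build, however, the issue has not been reproduced. Merging the delayed Wi-Fi driver load would cause tc-service to fail to enable Wi-Fi (with the delay, the Wi-Fi driver has not finished loading when tc-service tries to enable Wi-Fi).
In summary, the "delayed Wi-Fi driver load" change will not be merged.
Please test on a user, non-root build.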
#57 Updated by 移动测试一组_CDTS 刘强 over 2 years ago
- Status changed from RESOLVED to VERIFY_FAILED
- Assignee changed from 移动测试一组_CDTS 刘强 to CD SYSTEM-胡兵
Verification build: ftp://cdiot@192.168.87.46/Pre_figure/ReleaseBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/user/20230223/0945/FlatBuild_HH_VX1_MCE_FSE.M.R.user.01.00.0077.X101.zip
Result: Fail
Hardware: VC V02
After the 991st reboot cycle, the device entered 900E
Dump and tlog are saved on FTP: ftp://cdiot@192.168.87.46/Figure/bug/111971
#58 Updated by CDTS-TEST 周婷 over 2 years ago
蒋富雄 to handle
#59 Updated by CDTS-TEST 周婷 over 2 years ago
- Due date changed from 2023-01-18 to 2023-03-10
- Assignee changed from CD SYSTEM-胡兵 to System组-RD3_CDTS 蒋富雄
#60 Updated by CDTS-TEST 周婷 over 2 years ago
- Category changed from LC to SYSTEM
#61 Updated by CD TPM-王祥林 over 2 years ago
- Status changed from VERIFY_FAILED to ASSIGNED
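- Resolution deleted (WONTFIX)
富雄,
洪普 reports that the newly captured logs differ from the Wi-Fi driver load failure analyzed earlier; this is a different problem. Please follow up and sync with 洪普 on the details.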
#62 Updated by CD TPM-王祥林 over 2 years ago
- Target version set to VC1_FSE_0082_20230314
#63 Updated by System组-RD3_CDTS 蒋富雄 over 2 years ago
The log shows a NULL pointer dereference; this needs to be handled by the TP (touch panel) owner.
[ 1.623522] Info Block: 1f 00 00 68 b3 22 0a 01 00 6c c9 22 4a 00 01 6d
[ 1.623523] Info Block: 14 23 10 00 01 6f 25 23 1f 02 00 70 85 23 04 01
[ 1.623525] Info Block: 01 71 8f 23 06 01 00 81 9d 23 05 00 01 85 a3 23
[ 1.623527] Info Block: 29 01 00 b2 e0 fb
[ 1.623539] atmel_mxt_ts 0-004a: Read Info Block success
[ 1.623546] atmel_mxt_ts 0-004a: Registering input device
[ 1.623822] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 1.623829] Mem abort info:
[ 1.623833] ESR = 0x96000045
[ 1.623841] Exception class = DABT (current EL), IL = 32 bits
[ 1.623846] SET = 0, FnV = 0
[ 1.623850] EA = 0, S1PTW = 0
[ 1.623855] Data abort info:
[ 1.623859] ISV = 0, ISS = 0x00000045
[ 1.623864] CM = 0, WnR = 1
[ 1.623870] [0000000000000000] user address but active_mm is swapper
[ 1.623879] Internal error: Oops: 96000045 [#1] PREEMPT SMP
[ 1.623887] Modules linked in:
[ 1.623896] Process swapper/0 (pid: 0, stack limit = 0x00000000d2850ef5)
[ 1.623906] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G S 4.19.157-perf+ #1
[ 1.623911] Hardware name: Qualcomm Technologies, Inc. kona MTP-RmPM8150b. VC1 EVT2 (DT)
[ 1.623918] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[ 1.623930] pc : geni_i2c_irq+0x3dc/0x528
[ 1.623941] lr : __handle_irq_event_percpu+0x100/0x2b8
[ 1.623945] sp : ffffff8008003e30
[ 1.623949] x29: ffffff8008003e30 x28: ffffffaa3d7b9b88
[ 1.623956] x27: ffffffaa3d7fa000 x26: ffffffaa3d683000
[ 1.623963] x25: ffffffaa3d60db38 x24: 0000000008000081
[ 1.623968] x23: 0000000000000000 x22: 0000000008000081
[ 1.623974] x21: 0000000000000000 x20: 0000000000000000
[ 1.623980] x19: ffffffe7167a0080 x18: 0000000000000000
[ 1.623987] x17: 0000000000000000 x16: 0000000000000001
[ 1.623993] x15: 0000000000000000 x14: 0000000000000000
[ 1.624000] x13: 0000000000000000 x12: 0000000000000000
[ 1.624005] x11: 0000000000000000 x10: 0000000000000000
[ 1.624011] x9 : 0000000000000001 x8 : ffffff8015493d58
[ 1.624017] x7 : 000000000465f058 x6 : ffffffe7167a66a2
[ 1.624022] x5 : 0000000000000040 x4 : ffffffe7158f3b00
[ 1.624028] x3 : ffffffaa3d603d60 x2 : 0000000000000001
[ 1.624034] x1 : ffffffe7167a0080 x0 : 0000000000000113
[ 1.624042] Call trace:
[ 1.624049] geni_i2c_irq+0x3dc/0x528
[ 1.624055] __handle_irq_event_percpu+0x100/0x2b8
[ 1.624060] handle_irq_event+0x5c/0xd0
[ 1.624067] handle_fasteoi_irq+0x108/0x200
[ 1.624073] __handle_domain_irq+0x7c/0xc0
[ 1.624083] gic_handle_irq+0x14c/0x1c4
[ 1.624090] el1_irq+0xe8/0x190
[ 1.624095] arch_cpu_idle+0xec/0x1e0
[ 1.624105] do_idle+0x104/0x290
[ 1.624110] cpu_startup_entry+0x20/0x28
[ 1.624120] rest_init+0xd0/0xe0
[ 1.624133] start_kernel+0x4ec/0x574
#64 Updated by System组-RD3_CDTS 蒋富雄 over 2 years ago
- File dmesg_TZ.txt added
#65 Updated by CD BSP-杜磊 over 2 years ago
Additional info: a normal Atmel TP log looks like this:
[ 6.518383] atmel_mxt_ts 0-004a: ATMEL MaXTouch Driver version 4.12c
[ 6.525019] atmel_mxt_ts 0-004a: Using gpiod to irq mapping(383) to overrite the client's irq(0)
[ 6.534053] atmel_mxt_ts 0-004a: Mxt Reset (Flag: 0C)
[ 6.539248] atmel_mxt_ts 0-004a: Resetting chip(H)
[ 6.544192] atmel_mxt_ts 0-004a: <Seqnum>: 0 -> 0
[ 6.556694] atmel_mxt_ts 0-004a: Family: 164 Variant: 56 Firmware V2.0.AA Objects: 42
[ 6.575443] Info Block: a4 38 20 aa 29 47 2a 75 06 01 f2 17 00 25 ce 17
[ 6.575444] Info Block: 81 00 00 2c 50 18 00 00 00 05 51 18 0a 00 00 06
[ 6.575444] Info Block: 5c 18 06 00 01 44 63 18 48 00 01 26 ac 18 3f 00
[ 6.575445] Info Block: 00 0e ec 18 20 01 00 47 2e 19 f9 00 00 6e 28 1a
[ 6.575445] Info Block: 8d 05 00 76 7c 1d 02 00 01 8d 7f 1d 08 28 00 91
[ 6.575446] Info Block: f0 1e 00 00 00 07 f1 1e 06 00 00 08 f8 1e 0e 00
[ 6.575446] Info Block: 00 0f 07 1f 0a 01 01 12 1d 1f 01 00 00 13 1f 1f
[ 6.575446] Info Block: 0f 00 01 18 2f 1f 12 01 04 19 55 1f 26 00 01 1b
[ 6.575447] Info Block: 7c 1f 06 01 01 28 8a 1f 06 01 00 2a 98 1f 0d 01
[ 6.575447] Info Block: 00 2e b4 1f 17 00 01 38 cc 1f 2c 00 01 3d f9 1f
[ 6.575448] Info Block: 04 05 01 41 17 20 16 02 01 46 5c 20 09 18 01 48
[ 6.575448] Info Block: 56 21 58 00 01 4e af 21 0b 01 00 4f c7 21 03 05
[ 6.575449] Info Block: 00 50 df 21 0d 01 01 64 fb 21 4b 01 12 65 93 22
[ 6.575449] Info Block: 1f 00 00 68 b3 22 0a 01 00 6c c9 22 4a 00 01 6d
[ 6.575450] Info Block: 14 23 10 00 01 6f 25 23 1f 02 00 70 85 23 04 01
[ 6.575450] Info Block: 01 71 8f 23 06 01 00 81 9d 23 05 00 01 85 a3 23
[ 6.575451] Info Block: 29 01 00 b2 e0 fb
[ 6.575456] atmel_mxt_ts 0-004a: Read Info Block success
[ 6.576394] atmel_mxt_ts 0-004a: T6 Config Checksum: 0x1B6EAC
[ 6.580925] atmel_mxt_ts 0-004a: Registering input device
[ 6.597164] atmel_mxt_ts 0-004a: peak enalbe, check curve compensation
[ 6.604335] atmel_mxt_ts 0-004a: curve data: magic: 17238; flag: 1;levels: 7
[ 6.611589] atmel_mxt_ts 0-004a: curve coordinate compensation at X
[ 6.618034] atmel_mxt_ts 0-004a: curve coordinate compensation levels is: 7
[ 6.627690] atmel_mxt_ts 0-004a: curve compensation enabled
[ 6.633423] atmel_mxt_ts 0-004a: Touchscreen size {x,y} = {2559,1079}
[ 6.653515] atmel_mxt_ts 0-004a: peak enalbe, check curve compensation
[ 6.660897] atmel_mxt_ts 0-004a: curve data: magic: 17238; flag: 1;levels: 7
[ 6.668138] atmel_mxt_ts 0-004a: curve coordinate compensation at X
[ 6.674582] atmel_mxt_ts 0-004a: curve coordinate compensation levels is: 7
[ 6.683856] atmel_mxt_ts 0-004a: curve compensation enabled
[ 6.689588] atmel_mxt_ts 0-004a: Touchscreen size {x,y} = {2559,1079}
[ 6.706645] atmel_mxt_ts 0-004a: Falling back to syfs fallback for: maxtouch.cfg
[ 6.714279] atmel_mxt_ts 0-004a: mxt_probe: probe successfully!
[ 11.456703] atmel_mxt_ts 0-004a: T25 Status 0x 0 Info: 00 00 00 00 00
[ 11.481455] atmel_mxt_ts 0-004a: Config file CRC 0x1B6EAC same as device CRC: No update required.
[ 11.490582] atmel_mxt_ts 0-004a: Skip update config file
[ 11.496669] atmel_mxt_ts 0-004a: Initialized power cfg: ACTV 10, IDLE 20
[ 11.503561] atmel_mxt_ts 0-004a: Get Irqflags 0x0, will check Retrigen mode
[ 11.511607] atmel_mxt_ts 0-004a: RETRIGEN Enabled feature
[ 21.248569] atmel_mxt_ts 0-004a: FB event:2,blank:0
[ 21.248572] atmel_mxt_ts 0-004a: suspend: event = 2, blank = 0, not care
[ 23.356007] atmel_mxt_ts 0-004a: FB event:1,blank:0
[ 23.356008] atmel_mxt_ts 0-004a: drm_notifier_callback: UNBLANK! Resume by drm notifier.
#66 Updated by CD BSP-杜磊 over 2 years ago
- File kernel_log.txt added
#67 Updated by CD BSP-杜磊 over 2 years ago
- Category changed from SYSTEM to BSP
- Assignee changed from System组-RD3_CDTS 蒋富雄 to BSP组-RD3_TSCD 贾顺旺
#68 Updated by CD BSP-杜磊 over 2 years ago
- File 副本VC1_Display_Info _V1.0 - FSE.xlsx added
#69 Updated by CD BSP-杜磊 over 2 years ago
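- Subject changed from [VX1][BSP][EVT][power][Low probability] Looping reboot script: after 2 h the board enters 900e, cannot boot normally, and the screen stays dark to [VC1][BSP][EVT][power][Low probability] Looping reboot script: after 2 h the board enters 900e, cannot boot normally, and the screen stays dark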
- % Done changed from 60 to 80
- Resolution set to FIXED
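- Root cause set to Abnormal VC1 TP (touch panel) interrupt: the interrupt fires continuously during boot, so the system cannot respond in time, which causes a crash
Fix:
https://dev.thundercomm.com/gerrit/c/general/kernel/msm-4.19/+/186306
VB:
http://jenkins.thundercomm.com/job/VerifyBuild_for_IOT_6490/2863/
Verification method:
1. Repeated reboots: a. confirm the system does not crash; b. confirm touch works normally after each reboot.
2. Cold and hot plugging: confirm the TP responds to touch normally.
3. The interrupt count increases while the panel is touched and does not increase while it is not touched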
adb shell cat /proc/interrupts | grep atm
383: 426 0 0 0 0 0 0 0 msmgpio 7 Edge atmel_mxt_ts
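Verification step 3 can be checked by taking two snapshots of `/proc/interrupts` and comparing the totals for the touch IRQ. The following is a minimal sketch that parses text in the format of the sample line above; the function names `irq_total` and `irq_delta` are illustrative, and capturing the snapshots (for example with the `adb shell cat /proc/interrupts` command shown above) is left to the caller.

```python
def irq_total(proc_interrupts_text, name):
    """Sum the per-CPU counts of the first IRQ line whose last field is `name`.

    Returns None if no such line exists.
    """
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":") or fields[-1] != name:
            continue
        total = 0
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the controller name, counts are finished
            total += int(field)
        return total
    return None


def irq_delta(before_text, after_text, name="atmel_mxt_ts"):
    """Interrupts seen between two snapshots of /proc/interrupts."""
    return irq_total(after_text, name) - irq_total(before_text, name)
```

With the fix in place, the expectation stated in step 3 is a delta of 0 across an idle interval and a positive delta across an interval in which the panel is touched. A delta that keeps growing with no touch would match the root cause recorded for this bug.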
#70 Updated by CD BSP-杜磊 over 2 years ago
- Status changed from ASSIGNED to RESOLVED
- Assignee changed from BSP组-RD3_TSCD 贾顺旺 to 移动测试一组_CDTS 刘强
#71 Updated by IoT scm over 2 years ago
ID | Project | Branch | Uploader |
186793 | general/kernel/msm-4.19 | Pre_figure_turbox-c2130c-la1.1-vendor-ftm_v0081 | dulei0503@thundersoft.com |
BSP:TP: Fix VC1 touch intrrupt abnormal issue Fix VC1 touch intrrupt abnormal issue TC-RID: 1201-0100401 IssueID: TS-R-DF-111971 Change-Id: I2b3a43c04c4bfca8f0adf75e7b7d11f6d90f1fdb |
#72 Updated by IoT scm over 2 years ago
ID | Project | Branch | Uploader |
186306 | general/kernel/msm-4.19 | Pre_figure_turbox-c2130c-la1.1-vendor-dev | dulei0503@thundersoft.com |
BSP:TP: Fix VC1 touch intrrupt abnormal issue Fix VC1 touch intrrupt abnormal issue TC-RID: 1201-0100401 IssueID: TS-R-DF-111971 Change-Id: I2b3a43c04c4bfca8f0adf75e7b7d11f6d90f1fdb |
#73 Updated by 移动测试一组_CDTS 刘强 over 2 years ago
Stress test: 1,000 cycles without reproducing; stress testing continues
#74 Updated by 移动测试一组_CDTS 刘强 over 2 years ago
- Status changed from RESOLVED to VERIFIED
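2,574 stress cycles so far without reproducing the issue. Closing for now; stress testing will continue to track it during routine testing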
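How much confidence a clean run gives depends on the failure rate before the fix. As a rough illustration only, the sketch below assumes independent reboots and takes 1/991 as the pre-fix rate, from the single failure at the 991st cycle in comment #57. That one data point is a weak estimate, and the function names are illustrative.

```python
import math


def p_zero_failures(failure_rate, reboots):
    """Probability of seeing no failure in `reboots` independent trials."""
    return (1.0 - failure_rate) ** reboots


def reboots_for_confidence(failure_rate, confidence=0.95):
    """Clean reboots needed so that an unfixed bug would have shown up
    with probability `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


print(round(p_zero_failures(1 / 991, 2574), 3))
print(reboots_for_confidence(1 / 991))
```

Under these assumptions, an unfixed bug would survive 2,574 clean cycles only about 7% of the time, and roughly 3,000 clean cycles would be needed to reach the 95% level, which supports continuing the stress runs after closing.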
#75 Updated by 移动测试一组_CDTS 刘强 over 2 years ago
- Status changed from VERIFIED to CLOSED