Bug #117300

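[BSP][EVT3][ST][Stress test][Customer verification] After an OTA upgrade, a KL15 abnormal power-cut stress test leaves the device unable to boot on KL15 power-on after 120 cycles; it boots normally only after KL30 is fully power-cycled once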

Added by 移动测试一组_CDTS 刘强 over 2 years ago. Updated over 1 year ago.

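Status: CLOSED
Start date: 2023-04-03
Priority: Urgent
Due date: 2023-04-14
Assignee: 移动测试一组_CDTS 刘强
% Done: 0%
Category: BSP
Target version: VC1_FSE_0094_20230425
Need_Info: --
Found Version: 0085-0086
Resolution: FIXED
Degrated: --
Severity: Critical
Verified Version:
Reproducibility: Occasionally
Fixed Version:
Test Type: ST
Root cause: 1) Sticky packets on serial transmit caused signalling frames to fail to send;
2) The power-on/off sequence had no timeout handling, so when the MPU failed to power on or off for some other reason, the MCU stayed blocked in one state and could not leave it.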

Description

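[Preconditions]
Mot V0.1.2

[Test steps]
1. Flash build 0085 with Qfile, then power on the device (V0.1.1)
2. During boot, cut and restore KL15 alone 5 times (KL30 stays powered)
3. Power KL15 on again and wait for the device to boot normally
4. Push the OTA package and upgrade to build 0086 via OTA (V0.1.2)
5. When the device reboots automatically after a successful OTA upgrade, start the KL15-only power-off/power-on stress test (10 cycles, KL30 stays powered)
6. Power KL15 on again and let the device boot normally
7. Repeat step1-step6

[Expected result]
The device boots normally every time

[Actual result]
step6: after KL15 power-on the device does not boot; current draw is about 20 mA
No amount of KL15 power cycling boots the device. After KL30 is cut once and restored, the device boots normally and then performs the MCU upgrade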

figure-mcu-日志接口飞线.jpeg (221 KB) BSP组-RD3_TSCD 贾顺旺, 2023-04-13 21:27

v0.1.3.mot (311 KB) BSP组-RD3_TSCD 贾顺旺, 2023-04-13 21:27

正常关机指令帧数据.png (27.2 KB) BSP组-RD3_TSCD 贾顺旺, 2023-04-15 15:15

关机失效指令帧数据.png (24.4 KB) BSP组-RD3_TSCD 贾顺旺, 2023-04-15 15:15

teraterm-20230418-1730.log (3.07 MB) BSP组-RD3_TSCD 贾顺旺, 2023-04-18 17:33


History

#1 Updated by CD BSP-杜磊 over 2 years ago

  • Status changed from New to ASSIGNED
  • Assignee changed from CD BSP-杜磊 to BSP组-RD3_TSCD 贾顺旺

#2 Updated by CD BSP-杜磊 over 2 years ago

  • Due date set to 2023-04-14

#3 Updated by 移动测试一组_CDTS 刘强 over 2 years ago

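Additional symptom:
20230402: reproduced under stress test; after KL15 power-off the device does not shut down. Suspected to be the same issue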

#4 Updated by 移动测试一组_CDTS 刘强 over 2 years ago

  • Severity changed from Normal to Major

#5 Updated by CD BSP-杜磊 over 2 years ago

  • Target version set to VC1_FSE_0094_20230425

#6 Updated by CDTS_TEST 王成 about 2 years ago

  • Severity changed from Major to Critical

#7 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

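Retested with the E1 connected for monitoring; not reproduced in 10 runs (OTA upgrade is not possible with the E1 connected, so the OTA step was dropped from the test steps);
Jumper-wired the mcu-uart-tx pin; no serial traffic has been captured yet, still investigating.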

#8 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

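Checked the schematic, design documents and source code and confirmed the jumper-wire location is correct; after jumper-wiring a different board the serial output is normal;
Stress-tested the stability of the serial output and found that data is overwritten when sends are frequent; the code was optimized for this and transmitted data is now mostly complete;
Now adding log code at key timing and event points in the code.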

#9 Updated by CD BSP-杜磊 about 2 years ago

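1. The MCU serial port has a sticky-packet (coalesced frame) problem that can make serial communication fail; the code has been changed and is under review.
2. For KL15 wake-up failing to power the MPU on or off, logging has been added; building a VB for stress testing.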

#10 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

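1. Optimized the sticky-packet problem in serial data transmission;
2. Added serial log printing to the boot and app programs;
3. Following the figure-mcu-日志接口飞线 image (MCU log interface jumper wire), jumper-wire out the mcu-uart-tx pin and connect that pin and ground to the rx pin and ground pin of the serial cable;
4. Serial logic level is 3.3 V, baud rate 115200;
5. Flash boot with the attached file v0.1.3
6. For the app upgrade bin file, use the following two VB builds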
ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-vendor-dev/MCU/20230413/143/
ftp://dvbuild:thundercomm@10.0.76.28/home/scm/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-vendor-dev/MCU/20230413/144/
7. Please retest with the builds above and use a serial terminal tool to receive and save the log output.

#12 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

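Comparing the normal shutdown command frame data (正常关机指令帧数据.png) with the failed shutdown command frame data (关机失效指令帧数据.png) shows the cause is still sticky packets in the serial data;
On a normal shutdown, the command frame sent over serial is: 3C 07 00 02 14 82 AF ;
On a failed shutdown, the command frame sent over serial is: 3C 07 00 02 0D 02 01 ;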
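
The two captures above share their first four bytes and diverge from the fifth byte on. A minimal sketch of that comparison follows; the helper name frame_first_diff is hypothetical, and because the frame layout (header, length, checksum) is not documented in this ticket, no field meaning is assumed:

```c
#include <stddef.h>
#include <stdint.h>

/* Frames quoted in this comment. */
const uint8_t FRAME_SHUTDOWN_OK[7]  = {0x3C, 0x07, 0x00, 0x02, 0x14, 0x82, 0xAF};
const uint8_t FRAME_SHUTDOWN_BAD[7] = {0x3C, 0x07, 0x00, 0x02, 0x0D, 0x02, 0x01};

/* Hypothetical helper: index of the first byte at which two equal-length
 * frames differ, or -1 if they are identical. */
int frame_first_diff(const uint8_t *a, const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

Calling frame_first_diff(FRAME_SHUTDOWN_OK, FRAME_SHUTDOWN_BAD, 7) returns 4, i.e. only the last three bytes of the failed frame are wrong.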

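1) Added a check on the serial transmit interrupt count to ensure that the transmit buffer and the uart control registers are not written before the data has been fully sent;
2) Moved the serial send out of the timer interrupt and into the main/while(1) loop;
The detailed change is here: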
https://dev.thundercomm.com/gerrit/c/Pre_figure/mcu/+/193642
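
The two changes listed above can be sketched roughly as follows. This is not the code from the Gerrit change: all names (uart_tx_t, uart_tx_start, uart_tx_isr, timer_isr, main_loop_poll) and the buffer size are hypothetical, and the UART hardware is replaced by a plain struct so the logic can be run on a host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_BUF_SIZE 16u  /* assumed size, not taken from the firmware */

typedef struct {
    uint8_t buf[TX_BUF_SIZE];    /* frame currently being transmitted */
    size_t len;                  /* number of valid bytes in buf */
    volatile size_t sent;        /* bytes already shifted out, advanced by the TX ISR */
    volatile bool send_request;  /* raised by the timer ISR, consumed by the main loop */
} uart_tx_t;

/* True while a frame is still being sent. */
bool uart_tx_busy(const uart_tx_t *u)
{
    return u->sent < u->len;
}

/* Start sending a frame. Refuses while the previous frame is incomplete,
 * so the buffer is never overwritten mid-transmission. */
bool uart_tx_start(uart_tx_t *u, const uint8_t *frame, size_t len)
{
    if (len == 0 || len > TX_BUF_SIZE || uart_tx_busy(u)) {
        return false;
    }
    memcpy(u->buf, frame, len);
    u->len = len;
    u->sent = 0;
    return true;
}

/* Stand-in for the per-byte transmit interrupt: hands out the next byte and
 * counts it. On real hardware the byte would go to the UART data register. */
bool uart_tx_isr(uart_tx_t *u, uint8_t *out)
{
    if (!uart_tx_busy(u)) {
        return false;
    }
    *out = u->buf[u->sent];
    u->sent++;
    return true;
}

/* The timer interrupt no longer sends; it only raises a flag. */
void timer_isr(uart_tx_t *u)
{
    u->send_request = true;
}

/* One pass of the main while(1) loop: performs the deferred send once the
 * line is idle. Returns true if a frame was started on this pass. */
bool main_loop_poll(uart_tx_t *u, const uint8_t *frame, size_t len)
{
    if (u->send_request && uart_tx_start(u, frame, len)) {
        u->send_request = false;
        return true;
    }
    return false;
}
```

With this shape, a send requested while a frame is in flight stays pending and is retried on a later loop pass instead of overwriting the bytes still waiting to go out.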

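Test and verify with the new VB build on Monday:
1) During verification, please keep monitoring the data output on the MCU-uart-tx signal (this build outputs no text log, only communication data);
2) If the device fails to shut down, please capture the MPU log: adb logcat | grep uart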

#13 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

20230415: still reproduced on the latest VB; R&D continues to investigate

#14 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

  • Priority changed from High to Urgent

Priority raised following discussion

#16 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

VB build failed verification; investigation continues

#17 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

Added a timeout to the check of the kernel-boot gpio; please retest with VB builds 390 and 391

#18 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

1)
Tested VB builds 390 and 391; the KL15 cannot-boot symptom reproduced:
ftp://cdiot@192.168.87.46/Figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20230417/202304171344-390/FlatBuild_HH_MCE_FSE.M.D.userdebug.01.00.C103%28X103%29.202304171344.zip
ftp://cdiot@192.168.87.46/Figure/VerifyBuild/Pre_figure_turbox-c2130c-la1.1-qssi12-dev/20230417/202304171345-391/FlatBuild_HH_MCE_FSE.M.D.userdebug.01.00.C103%28X103%29.202304171345.zip
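2)
The MCU log shows that the MPU was powered, and the kernel-gpio level check also indicates that the kernel started. Re-applying power does not bring the device up either; it boots again after reflashing with qfile.
The captured MPU serial log (teraterm-20230418-1730.log) shows repeated reboots at about 22 s into kernel boot

3) Add timeout handling in the MCU for the case where the MPU system does not finish booting.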

#19 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

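1) Added timeout handling in the MCU for the case where the MPU system does not finish booting.
2) Use the VB build server below, with the build parameter changed from ALL to AP and the "enter BP directory" option checked.
The following two VB builds containing the change were tested for a cumulative 8 hours, ran normally, and did not reproduce the issue: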
http://jenkins.thundercomm.com/job/VerifyBuild_for_IOT_6490/3327/
http://jenkins.thundercomm.com/job/VerifyBuild_for_IOT_6490/3326/

3) Next optimization: add a power-off action to the boot timeout handling;

#20 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

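A power-off action cannot be added to the boot timeout handling: the device may be in the 9008 flashing state at that moment, and cutting power then would make the flash fail.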
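
A rough sketch of boot-wait timeout handling that respects this constraint: on timeout the MCU leaves the waiting state but does not cut MPU power. The state names, function names and timeout values are all assumptions for illustration, not taken from the firmware.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_GPIO_TIMEOUT_MS   30000u   /* assumed value */
#define ANDROID_REPLY_TIMEOUT_MS 120000u  /* assumed value */

typedef enum {
    PWR_OFF,
    PWR_WAIT_KERNEL_GPIO,    /* MPU powered, waiting for the kernel-boot gpio */
    PWR_WAIT_ANDROID_REPLY,  /* kernel up, waiting for Android to report normal startup */
    PWR_RUNNING,
    PWR_BOOT_TIMEOUT         /* a wait expired; MPU power is deliberately left on */
} pwr_state_t;

typedef struct {
    pwr_state_t state;
    uint32_t entered_ms;     /* tick at which the current state was entered */
} pwr_fsm_t;

static void pwr_enter(pwr_fsm_t *f, pwr_state_t s, uint32_t now_ms)
{
    f->state = s;
    f->entered_ms = now_ms;
}

/* Called when MPU power has been switched on. */
void pwr_start(pwr_fsm_t *f, uint32_t now_ms)
{
    pwr_enter(f, PWR_WAIT_KERNEL_GPIO, now_ms);
}

/* Called periodically from the main loop. Never blocks: each waiting state
 * either advances on its event or gives up once its timeout has elapsed. */
pwr_state_t pwr_step(pwr_fsm_t *f, uint32_t now_ms,
                     bool kernel_gpio_high, bool android_replied)
{
    /* Unsigned subtraction stays correct across tick counter wrap-around. */
    uint32_t elapsed = now_ms - f->entered_ms;

    switch (f->state) {
    case PWR_WAIT_KERNEL_GPIO:
        if (kernel_gpio_high) {
            pwr_enter(f, PWR_WAIT_ANDROID_REPLY, now_ms);
        } else if (elapsed >= KERNEL_GPIO_TIMEOUT_MS) {
            pwr_enter(f, PWR_BOOT_TIMEOUT, now_ms);
        }
        break;
    case PWR_WAIT_ANDROID_REPLY:
        if (android_replied) {
            pwr_enter(f, PWR_RUNNING, now_ms);
        } else if (elapsed >= ANDROID_REPLY_TIMEOUT_MS) {
            pwr_enter(f, PWR_BOOT_TIMEOUT, now_ms);
        }
        break;
    default:
        break;  /* PWR_OFF, PWR_RUNNING, PWR_BOOT_TIMEOUT: nothing to wait for */
    }
    return f->state;
}
```

Because PWR_BOOT_TIMEOUT is an ordinary state rather than a blocking wait, the MCU can still react to later KL15 changes from it, while a flash in progress is not interrupted.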

#21 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

Rebuild a new version and verify it together with the boot-animation hang issue

#23 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

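Verification results on the latest VB build:
1. After flashing via PLC, ran the KL15 power-cut stress test for 36 h; issue not reproduced
2. Stress-tested with the signal tester for about 36 h; not reproduced

Next action:
Add a stress test of the MCU upgrade process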

#24 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

20230423: continued stress testing with the signal tester, now covering the MCU upgrade scenario; not reproduced in 7 runs so far

#25 Updated by IoT scm about 2 years ago

Gerrit Merge Information
ID Project Branch Uploader
193642 Pre_figure/mcu Pre_figure_turbox-c2130c-la1.1-vendor-dev
BSP:MCU: poweronoff failure with kl15
fix bug that poweronoff failure with kl15:
1. Record the byte count sent to determine if the
transmission is complete
2. correct serial send data buffer incomplete
3. R_Send_ProtectCmd move to main-while
4. add the function UART_SendLogMessageString()
5. add current state uart print
6. add timeout of wait kernel gpio on/off
7. Add timeout of Android reply normal startup
Bin version: V0.1.3
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ic2de62523433fb20acf50b912cafeae65e06ee48

#26 Updated by BSP组-RD3_TSCD 贾顺旺 about 2 years ago

  • Resolution changed from -- to FIXED
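  • Root cause set to 1) Sticky packets on serial transmit caused signalling frames to fail to send; 2) The power-on/off sequence had no timeout handling, so when the MPU failed to power on or off for some other reason, the MCU stayed blocked in one state and could not leave it.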

#27 Updated by CD BSP-杜磊 about 2 years ago

  • Status changed from ASSIGNED to RESOLVED
  • Assignee changed from BSP组-RD3_TSCD 贾顺旺 to 移动测试一组_CDTS 刘强

#29 Updated by 移动测试一组_CDTS 刘强 about 2 years ago

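  • Subject changed from [BSP][EVT3][ST][Stress test] After an OTA upgrade, a KL15 abnormal power-cut stress test leaves the device unable to boot on KL15 power-on after 120 cycles; it boots normally only after KL30 is fully power-cycled once to [BSP][EVT3][ST][Stress test][Customer verification] After an OTA upgrade, a KL15 abnormal power-cut stress test leaves the device unable to boot on KL15 power-on after 120 cycles; it boots normally only after KL30 is fully power-cycled once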

#30 Updated by 移动测试一组_CDTS 刘强 almost 2 years ago

  • Status changed from RESOLVED to VERIFIED

Per: https://thundersoft.feishu.cn/docx/DdV8dpUz1owXYDxMQrMcf4Lonyh
Verification passed; closing for now

#31 Updated by 移动测试一组_CDTS 刘强 almost 2 years ago

  • Status changed from VERIFIED to CLOSED

#32 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
341023 Pre_figure/mcu Pre_figure_turbox-c2130c-la1.1-vendor-dev
BSP:MCU: Power off SOC and power on soc if SOC bootup timeout
Bin version: V0.2.3
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ib1682aeda65945c383cce9b20753669666686a3f

#33 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
342065 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-dev
BSP:MCU: Power off SOC and power on soc if SOC bootup timeout
Bin version: V0.2.3
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: I3cea95cb74cf0cbe3a779283e17b601c4f5cb528

#34 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
342008 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-release_sorp
BSP:MCU: Power off SOC and power on soc if SOC bootup timeout
Bin version: V0.2.3
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: I3cea95cb74cf0cbe3a779283e17b601c4f5cb528
(cherry picked from commit 7cb9e38248efe85191f9193b21d2202324de0a4b)

#35 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
343235 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-dev_vx1
BSP:MCU: Disable CAN reboot for MCE/VX1
Bin version: V0.2.4
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ia344eb52d0ea38e33f3ddb5047daf53503179f79

#36 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
343323 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-release_sorp_vx1
BSP:MCU: Disable CAN reboot for MCE/VX1
Bin version: V0.2.4
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ia344eb52d0ea38e33f3ddb5047daf53503179f79
(cherry picked from commit 796ca36283a5d8d8f1d712895adefd4e6f2980c6)

#37 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
343324 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-dev_mce
BSP:MCU: Disable CAN reboot for MCE/VX1
Bin version: V0.2.4
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ia344eb52d0ea38e33f3ddb5047daf53503179f79
(cherry picked from commit 796ca36283a5d8d8f1d712895adefd4e6f2980c6)

#38 Updated by IoT scm over 1 year ago

Gerrit Merge Information
ID Project Branch Uploader
343325 general/vendor/tc/mcu Pre_figure_turbox-c2130c-la1.1-vendor-release_sorp_mce
BSP:MCU: Disable CAN reboot for MCE/VX1
Bin version: V0.2.4
TC-RID: 1201-0102704
IssueID: TS-R-BUG-117300
Change-Id: Ia344eb52d0ea38e33f3ddb5047daf53503179f79
(cherry picked from commit 796ca36283a5d8d8f1d712895adefd4e6f2980c6)
(cherry picked from commit e1c81355d5de9986152d7befb3c076672f4a7b6d)
