Bug #118117

【IOVDEV-29505】【内部】【VC1】【0095】【台架】【EE环境】【副屏】副屏桌面偶发一次冻屏,恢复之后提示“Quickstepmei没有响应”

Added by CDTS_Test 吴诗雨 about 2 years ago. Updated about 2 years ago.

Status:CLOSEDStart date:2023-05-19
Priority:HighDue date:2023-05-26
Assignee:CDTS_Test 吴诗雨% Done:

0%

Category:日志不全无法分析
Target version:-
Need_Info:-- Found Version:副屏:01.00.0095.c103
Resolution:DUPLICATE Degrated:No
Severity:Major Verified Version:
Reproducibility:Rarely Fixed Version:2023-06-27
Test Type:ST Root cause:

Description

【环境信息】EE环境
【问题出现时间】:20230519 13点55分( 第149次开机)
【前提条件】:车机上电(KL30ON ,KL15 ON,已连接中控和功放,可正常输出声音)
1.副屏点击进入爱奇艺,没有正常进入其主页,则多任务管理器中杀掉任务重新点击桌面的爱奇艺

2.重复步骤1多次

【实际结果】

副屏返回桌面时偶发冻屏,约2分钟后恢复,并弹窗提示“Quickstep没有响应”

【期望结果】
副屏正常运行,无冻屏现象。
【备注】:

此问题出现一次,约2分钟后自行恢复。

车架号VIN:HRYTTESTVINMGM400

log地址:https://ofs.human-horizons.com/#/download/index/EBFI2CCScTI%3D

视频地址:https://ofs.human-horizons.com/#/download/index/MmJMfD5KJ0c%3D

【影响版本】:副屏:01.00.0095.c103
【硬件版本号】:/
【是否能恢复】:可恢复

桌面偶发一次冻屏.mp4 (16.8 MB) CDTS_Test 吴诗雨, 2023-05-19 20:16

桌面偶发一次冻屏恢复后,提示进程quickstep没有响应.mp4 (5.83 MB) CDTS_Test 吴诗雨, 2023-05-19 20:16

1684753000512.zip (4.28 MB) 物联网项目组-RD3_CDTS 周飞, 2023-05-22 18:58

1684752984491.zip (3.14 MB) 物联网项目组-RD3_CDTS 周飞, 2023-05-22 18:58

1684824972431.zip (14.4 MB) CDTS-TEST 周婷, 2023-05-23 15:01

29505Log.zip (7.46 MB) 物联网项目组-RD3_CDTS 周飞, 2023-05-23 18:43

History

#2 Updated by CD APP-王营 about 2 years ago

  • Status changed from New to NEED_INFO
  • Assignee changed from CD APP-王营 to CDTS_Test 吴诗雨

【当前状态】
1、从视频现象来看Launcher发生ANR
2、根据上传的log,未能获取到anr相关log,Tlog中也没有trace文件(即anr文件)
【下一步】
1、需要测试复现问题,通过正确的方式抓取Tlog,当前log应该是抓了实时日志,应该将历史log抓全,具体操作方式请与Tlog研发确认。

#4 Updated by CDTS-TEST 周婷 about 2 years ago

  • Assignee changed from CDTS_Test 吴诗雨 to CD APP-王营

请从新添加日志继续分析

#5 Updated by CD APP-王营 about 2 years ago

  • Assignee changed from CD APP-王营 to CDTS_Test 吴诗雨

【当前状态】
Log没抓全,Launcher出现ANR,但是没有对应trace文件

【分析】
根据压缩包1684752984491.zip中的log可知
05-19 13:55:56.156 1544 10387 E ActivityManager: ANR in com.qiyi.video

05-19 13:57:08.012 1544 1768 I WindowManager: ANR in gesture monitor owned by pid:2512. Reason: swipe-up (server) is not responding. Waited 86002ms for MotionEvent

05-19 13:57:08.120 1544 1768 I WindowManager: ANR in com.android.launcher3/com.android.launcher3.uioverrides.QuickstepLauncher. Reason:2949388 com.android.launcher3/com.android.launcher3.uioverrides.QuickstepLauncher (server) is not responding. Waited 84857ms for MotionEvent

05-19 13:57:08.156 1544 10387 I ActivityManager: Killing 9853:com.qiyi.video/u0a97 (adj 0): bg anr

05-19 13:57:08.156 1544 10387 I am_kill : [0,9853,com.qiyi.video,0,bg anr]

05-19 13:57:08.159 1544 10387 D ActivityManager: Completed ANR of com.qiyi.video in 110753ms, latency 0ms

在Launcher发生ANR之前爱奇艺应用先发生了anr
此log压缩包下没有trace文件,需根据trace文件进一步分析anr产生原因

压缩包 1684753000512.zip 中存在trace文件,但是这些anr文件是18号的,跟当前问题不一致

【下一步】

请帮忙复现此问题,抓取完整的Tlog。

#6 Updated by CDTS-TEST 周婷 about 2 years ago

  • File 1684824972431.zip added
  • Status changed from NEED_INFO to ASSIGNED
  • Assignee changed from CDTS_Test 吴诗雨 to CD APP-王营

已更新最新云log 请重新分析,谢谢

#7 Updated by CD APP-王营 about 2 years ago

【状态】
根据 1684824972431.zip 中log分析,与之前的log相同Launcher出现anr的Logcat如下:
05-19 13:57:08.120 1544 1768 I WindowManager: ANR in com.android.launcher3/com.android.launcher3.uioverrides.QuickstepLauncher. Reason:2949388 com.android.launcher3/com.android.launcher3.uioverrides.QuickstepLauncher (server) is not responding. Waited 84857ms for MotionEvent

时间点一致,在对应目录下没有trace文件,anr文件夹下为空

Launcher的anr产生是在爱奇艺anr发生之后,目前分析来看有可能是爱奇艺的anr导致了后面的anr发生,具体爱奇艺发生anr的原因同样依赖trace文件

【下一步】

复现问题,抓取trace文件

#8 Updated by CD APP-王营 about 2 years ago

  • Status changed from ASSIGNED to NEED_INFO
  • Assignee changed from CD APP-王营 to CDTS_Test 吴诗雨

#9 Updated by 物联网项目组-RD3_CDTS 周飞 about 2 years ago

  • File 29505Log.zip added
  • Status changed from NEED_INFO to ASSIGNED
  • Assignee changed from CDTS_Test 吴诗雨 to CD APP-王营

#10 Updated by CD APP-王营 about 2 years ago

  • Status changed from ASSIGNED to NEED_INFO
  • Assignee changed from CD APP-王营 to 物联网项目组-RD3_CDTS 周飞

【当前状态】

最新上传的log中(29505Log.zip)依然没有Launcher anr的trace文件,分析了logcat有以下异常

1、Launcher发生ANR之前1分钟,爱奇艺发生anr,具体log如下:

05-19 13:55:56.156 1544 10387 E ActivityManager: ANR in com.qiyi.video
05-19 13:55:56.156 1544 10387 E ActivityManager: PID: 9853
05-19 13:55:56.156 1544 10387 E ActivityManager: Reason: executing service com.qiyi.video/org.qiyi.video.nativelib.pm.LibraryManagerService
05-19 13:55:56.156 1544 10387 E ActivityManager: Frozen: false

...

05-19 13:55:56.156 1544 10387 E ActivityManager: CPU usage from 0ms to 29825ms later (2023-05-19 13:55:26.277 to 2023-05-19 13:55:56.102):
05-19 13:55:56.156 1544 10387 E ActivityManager: 99% 2771/hht_ota_mgr: 41% user + 58% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 12% 1544/system_server: 8.3% user + 4.2% kernel / faults: 143832 minor
05-19 13:55:56.156 1544 10387 E ActivityManager: 7.8% 2512/com.android.launcher3: 6.4% user + 1.3% kernel / faults: 18157 minor

...

05-19 13:55:56.156 1544 10387 E ActivityManager: CPU usage from 37ms to 374ms later (2023-05-19 13:55:26.314 to 2023-05-19 13:55:26.651):
05-19 13:55:56.156 1544 10387 E ActivityManager: 98% 2771/hht_ota_mgr: 47% user + 51% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 98% 2965/event_loop: 47% user + 51% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 32% 1544/system_server: 14% user + 18% kernel / faults: 416 minor
05-19 13:55:56.156 1544 10387 E ActivityManager: 25% 10387/AnrConsumer: 10% user + 14% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 10% 852/vendor.qti.hardware.display.composer-service: 6.8% user + 3.4% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 17% 852/composer-servic: 10% user + 6.8% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 6.9% 926/surfaceflinger: 6.9% user + 0% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 6.9% 926/surfaceflinger: 6.9% user + 0% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 2.9% 32/rcuop/2: 0% user + 2.9% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 3% 48/rcuop/4: 0% user + 3% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 3.1% 388/crtc_commit:125: 0% user + 3.1% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 3.2% 514/lmkd: 0% user + 3.2% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 3.2% 514/lmkd: 0% user + 3.2% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 3.6% 1331/msm_irqbalance: 0% user + 3.6% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 4.1% 8364/com.tencent.android.qqdownloader:daemon: 4.1% user + 0% kernel / faults: 9 minor
05-19 13:55:56.156 1544 10387 E ActivityManager: 4.1% 8731/FloatingWindow-: 0% user + 4.1% kernel
05-19 13:55:56.156 1544 10387 E ActivityManager: 24% TOTAL: 10% user + 13% kernel + 0.3% irq
05-19 13:55:56.157 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.169 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.180 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.192 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.204 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.218 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.229 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.241 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.252 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.257 1193 1193 I CNSS : cnss_open_log_file: directory /data/vendor/wifi/wlan_logs/ created
05-19 13:55:56.265 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.276 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok
05-19 13:55:56.280 1544 8054 W ActivityManager: Missing app error report, app = com.qiyi.video crashing = false notResponding = true
05-19 13:55:56.280 9853 9894 E xcrash : processErrorList is null !!!! poll = 50, i = 48

...

在anr过程中 hht_ota_mgr这个应用占用率过高,请客户帮忙确认此应用异常,lowmemorykiller持续打印。

2、Launcher 发生anr的时间点如下

05-19 13:57:08.012 1544 1768 I WindowManager: ANR in gesture monitor owned by pid:2512. Reason: swipe-up (server) is not responding. Waited 86002ms for MotionEvent

【下一步】

1、客户确认hht_ota_mgr应用CPU占用率过高原因。
2、复现问题,或取trace文件

#11 Updated by CDTS_TEST 王成 about 2 years ago

  • Fixed Version set to 2023-05-26

#12 Updated by CDTS_TEST 王成 about 2 years ago

  • Due date set to 2023-05-26

#13 Updated by CDTS-TEST 周婷 about 2 years ago

  • Category changed from CD-APP to 日志不全无法分析

#14 Updated by CD APP-王营 about 2 years ago

  • Status changed from NEED_INFO to ASSIGNED
  • Assignee changed from 物联网项目组-RD3_CDTS 周飞 to CD APP-王营

#15 Updated by CD APP-王营 about 2 years ago

【状态】
分析中

【分析】

整体的问题流程中存在3个异常,三个异常发生的时间点整理如下:

1、hht_ota_mgr应用占用率过高
05-19 13:55:20.191 514 514 I lowmemorykiller: Reclaimed 0kB at oom_score_adj 606
05-19 13:55:21.158 0 0 W audit : audit_lost=76271452 audit_rate_limit=5 audit_backlog_limit=64
05-19 13:55:21.158 0 0 E audit : rate limit exceeded
05-19 13:55:20.371 2771 2771 I auditd : type=1400 audit(0.0:76273720): avc: denied { read } for comm="event_loop" lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 W event_loop: type=1400 audit(0.0:76273720): avc: denied { read } for lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 I auditd : type=1400 audit(0.0:76273721): avc: denied { read } for comm="event_loop" lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 W event_loop: type=1400 audit(0.0:76273721): avc: denied { read } for lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 I auditd : type=1400 audit(0.0:76273722): avc: denied { read } for comm="event_loop" lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 W event_loop: type=1400 audit(0.0:76273722): avc: denied { read } for lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 I auditd : type=1400 audit(0.0:76273723): avc: denied { read } for comm="event_loop" lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 W event_loop: type=1400 audit(0.0:76273723): avc: denied { read } for lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 I auditd : type=1400 audit(0.0:76273724): avc: denied { read } for comm="event_loop" lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0
05-19 13:55:20.371 2771 2771 W event_loop: type=1400 audit(0.0:76273724): avc: denied { read } for lport=13400 scontext=u:r:hht_ota_mgr:s0 tcontext=u:r:hht_ota_mgr:s0 tclass=udp_socket permissive=0

2、爱奇艺发生ANR,时间点如下
05-19 13:55:56.156 1544 10387 E ActivityManager: ANR in com.qiyi.video
此时hht_ota_mgr应用持续CPU占用率过高,占用率情况参考上一条更新

3、Launcher 发生ANR
05-19 13:57:08.120 1544 1768 I WindowManager: ANR in com.android.launcher3/com.android.launcher3.uioverrides.QuickstepLauncher

根据以上时间点,hht_ota_mgr应用最先出现了异常,需要优先处理问题,待hht_ota_mgr应用稳定后尝试复现Launcher anr问题并获取关键log。

【下一步】
1、需要客户排查hht_ota_mgr 异常问题
2、Launcher尝试复现anr 抓取trace

#16 Updated by CD APP-王营 about 2 years ago

【状态】
1、根据当时的log,目前无法确定OTA程序占用CPU核数,对于内存占用情况正在进一步分析。
2、Log中没有anr产生时的trace文件,目前继续根据现有Logcat梳理测试步骤和代码执行流程,Launcher发生ANR是因为手势等了86秒都无响应,而Launcher ANR之前1分钟爱奇艺发生ANR,Launcher的手势此时无法响应,需要进一步分析爱奇艺anr的产生原因和影响。
3、本地尝试了100次操作,目前未能复现此问题,需要确认当时的测试手法和环境来模拟出此问题,拿到trace文件进行分析。

【下一步】
1、继续尝试复现此问题,抓取关键log进行排查。

#17 Updated by CD APP-王营 about 2 years ago

【分析】
1、Launcher发生Anr的reason:

05-19 13:57:08.012 1544 1768 I WindowManager: ANR in gesture monitor owned by pid:2512. Reason: swipe-up (server) is not responding. Waited 86002ms for MotionEvent
86.002秒前执行上滑手势,Launcher无法响应

根据时间点向前推,对应的gesture手势触发log如下:

05-19 13:55:43.458 1872 1872 D ViewTouchDebug: mView : com.android.systemui.navigationbar.NavigationBarFrame{6bee75b V.E...... ........ 0,0-2560,46 #7f0b03f2 app:id/navigation_bar_frame}Receive event : MotionEvent { action=ACTION_OUTSIDE, actionButton=0, id0=0, x0=19.0, y0=-491.0, toolType0=TOOL_TYPE_FINGER, buttonState=0, classification=NONE, metaState=0, flags=0x0, edgeFlags=0x0, pointerCount=1, historySize=0, eventTime=984117, downTime=984117, deviceId=4, source=0x5002, displayId=0, eventId=1849843003 }
05-19 13:55:43.460 514 514 I lowmemorykiller: Ignoring pressure since per-zone watermarks ok

在向上查看log,爱奇艺出现crash

05-19 13:55:43.248 1544 8054 W ActivityManager: Missing app error report, app = com.qiyi.video crashing = false notResponding = true

在整个问题发生中都伴随lowmemorykiller,不排除内存紧张的情况

【下一步】

继续排查log,查看异常产生原因和lowmemoryKiller触发的原因。

#18 Updated by CD APP-王营 about 2 years ago

PM已经拉通客户测试。正在复现中,抓取anr后的trace,同时会起一个子任务来跟踪hht_ota_mgr占用率高的问题

#19 Updated by CD APP-王营 about 2 years ago

  • Status changed from ASSIGNED to RESOLVED
  • Assignee changed from CD APP-王营 to CDTS_Test 吴诗雨
  • Resolution changed from -- to DUPLICATE
  • Degrated changed from -- to No
  • Fixed Version changed from 2023-05-26 to 2023-06-27

【分析】
此问题已经在JIRA跟踪关闭,针对ANR问题已经做出以下系统优化
1、控制后台任务运行数量。
2、ota_hht_mgr优化。

【下一步】
请测试跟踪关闭此问题

#20 Updated by CDTS_Test 吴诗雨 about 2 years ago

  • Status changed from RESOLVED to VERIFIED

#21 Updated by CDTS_Test 吴诗雨 about 2 years ago

  • Status changed from VERIFIED to CLOSED

Also available in: Atom PDF