一:背景
Helios 系统要处理的数据量比较大,尤其是查询所有服务一天的评分数据时要返回每日 1440 分钟的所有应用的评分,总计有几十万个数据点,接口有时延迟会达到数秒。本文记录如何利用 Arthas ,将接口从几百几千 ms,优化到几十 ms。
从链路上看,线上获取一整天的数据时大概 300 多 ms,而查询数据库只有 11ms,说明大部分时间都是程序组装数据时消耗的,于是动起了优化代码的念头。
二:优化过程
代码业务可以不用去理解,最主要的是看利用trace优化的过程
2.1 初始未优化版本
2.1.1 代码
private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
HeliosGetScoreResponse response = new HeliosGetScoreResponse();
List heliosScores = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
if (CollectionUtils.isEmpty(heliosScores)) {
return response;
}
Set dateSet = new HashSet();
Map groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
for (List value : groupByAppIdHeliosScores.values()) {
value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
score.setNamespace(value.get(0).getNamespace());
score.setAppId(value.get(0).getAppId());
for (HeliosScore heliosScore : value) {
List splitHeliosScores = heliosScore.split();
for (HeliosScore splitHeliosScore : splitHeliosScores) {
if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) 0) {
break;
}
dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
if (splitHeliosScore.getScores() == null) {
splitHeliosScore.setScores("100");
log.error("查询时发现数据缺失: {}", heliosScore);
}
score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
}
}
response.getValues().add(score);
}
response.setDates(new ArrayList(dateSet).stream().sorted().collect(Collectors.toList()));
return response;
}
2.1.2 Arthas Trace
`---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@20864cd1
`---[4046.398447ms] xxxService.controller.HeliosController:queryScores()
+---[0.022259ms] xxxService.model.helios.HeliosGetScoreResponse:() #147
+---[0.007132ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #149
+---[0.006985ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #149
+---[0.008704ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #149
+---[19.284658ms] xxxService.service.HeliosService:queryScoresTimeBetween() #149
+---[0.017468ms] org.apache.commons.collections.CollectionUtils:isEmpty() #150
+---[0.008054ms] java.util.HashSet:() #154
+---[0.027591ms] java.util.List:stream() #156
+---[0.044229ms] java.util.stream.Collectors:groupingBy() #156
+---[0.155582ms] java.util.stream.Stream:collect() #156
+---[0.018318ms] java.util.Map:values() #157
+---[0.019199ms] java.util.Collection:iterator() #157
+---[min=3.51E-4ms,max=0.014266ms,total=0.125003ms,count=123] java.util.Iterator:hasNext() #157
+---[min=5.11E-4ms,max=0.010188ms,total=0.145693ms,count=122] java.util.Iterator:next() #157
+---[min=4.89E-4ms,max=0.045356ms,total=0.321978ms,count=122] java.util.Comparator:comparing() #158
+---[min=0.003637ms,max=0.033049ms,total=0.928795ms,count=122] java.util.List:sort() #158
+---[min=5.94E-4ms,max=0.010442ms,total=0.1485ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:() #159
+---[min=4.5E-4ms,max=0.010857ms,total=0.12773ms,count=122] java.util.List:get() #160
+---[min=5.01E-4ms,max=0.007849ms,total=0.123696ms,count=122] xxxService.helios.entity.HeliosScore:getNamespace() #160
+---[min=6.5E-4ms,max=0.007324ms,total=0.135906ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #160
+---[min=3.72E-4ms,max=0.010288ms,total=0.086703ms,count=122] java.util.List:get() #161
+---[min=5.1E-4ms,max=0.00627ms,total=0.103871ms,count=122] xxxService.helios.entity.HeliosScore:getAppId() #161
+---[min=5.97E-4ms,max=0.006531ms,total=0.126184ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #161
+---[min=4.45E-4ms,max=0.020198ms,total=0.138299ms,count=122] java.util.List:iterator() #162
+---[min=3.42E-4ms,max=0.014615ms,total=0.256056ms,count=366] java.util.Iterator:hasNext() #162
+---[min=3.59E-4ms,max=0.014974ms,total=0.174396ms,count=244] java.util.Iterator:next() #162
+---[min=0.071035ms,max=0.148132ms,total=19.444179ms,count=244] xxxService.helios.entity.HeliosScore:split() #163
+---[min=4.06E-4ms,max=0.022364ms,total=0.210152ms,count=244] java.util.List:iterator() #164
+---[min=3.07E-4ms,max=0.199649ms,total=143.267893ms,count=351604] java.util.Iterator:hasNext() #164
+---[min=3.25E-4ms,max=24.863976ms,total=177.15363ms,count=351360] java.util.Iterator:next() #164
+---[min=3.93E-4ms,max=0.096771ms,total=176.843018ms,count=351360] xxxService.helios.entity.HeliosScore:getTimeFrom() #165
+---[min=4.07E-4ms,max=18.772715ms,total=205.632183ms,count=351360] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #165
+---[min=3.33E-4ms,max=0.045589ms,total=149.24486ms,count=351360] java.util.Date:compareTo() #165
+---[min=3.93E-4ms,max=0.032972ms,total=86.466793ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #168
+---[min=4.12E-4ms,max=0.061003ms,total=94.294061ms,count=175680] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #168
+---[min=3.37E-4ms,max=0.038792ms,total=74.505056ms,count=175680] java.util.Date:compareTo() #168
+---[min=3.97E-4ms,max=0.036548ms,total=87.693935ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #171
1 +---[min=0.001952ms,max=0.068413ms,total=391.739063ms,count=175680] xxxService.utils.DateUtils$yyyyMMddHHmm:formatDate() #171
+---[min=4.07E-4ms,max=0.037904ms,total=108.107714ms,count=175680] java.util.Set:add() #171
+---[min=3.95E-4ms,max=0.031555ms,total=88.173857ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #172
+---[min=3.88E-4ms,max=0.033584ms,total=84.689466ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #176
+---[min=3.11E-4ms,max=0.038121ms,total=69.708752ms,count=175680] java.lang.Math:max() #176
+---[min=4.66E-4ms,max=0.03391ms,total=104.476576ms,count=175680] xxxService.model.helios.HeliosGetScoreResponse$Score:add() #176
+---[min=6.17E-4ms,max=0.01503ms,total=0.159826ms,count=122] xxxService.model.helios.HeliosGetScoreResponse:getValues() #179
+---[min=6.44E-4ms,max=0.03742ms,total=0.21068ms,count=122] java.util.List:add() #179
+---[0.108961ms] java.util.ArrayList:() #182
+---[0.017455ms] java.util.ArrayList:stream() #182
+---[0.011099ms] java.util.stream.Stream:sorted() #182
+---[0.013699ms] java.util.stream.Collectors:toList() #182
+---[0.38178ms] java.util.stream.Stream:collect() #182
`---[0.004627ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #182
2.1.3 分析
Arthas 显示总共花了 4 秒,但实际上在链路上看大概是 350~450ms 左右。其他多出来的时间是 Arthas 每一次执行统计的消耗,因为方法里的循环比较多。这也告诉我们,不要用 trace 去看循环很多的方法。会对性能有非常严重的影响。
可以看出整个函数有 3 个循环,第一层循环的数量为 appId 的数量约为 140,第二层是查出来的数据条数,一天的数据已经归并了所以这里应该是 1,第三层是时间区间的分钟数,一天的话就是 1440 个。
Trace 中可以看到消耗最多的是封装的一个 SimpleDateFormat.formatDate()
。
2.2 第一次优化
2.2.1 优化方向
Set dateSet
改为 Set
,这样减少反复 formatDate() 的开销。Integer.parseInt
方法调用,改为用 Map
提前创建出 0~100 的字符串数字字典。(后来经过 JMH 测试,还是 Integer.parseInt
最快)2.2.2 代码
private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
HeliosGetScoreResponse response = new HeliosGetScoreResponse();
List heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
if (CollectionUtils.isEmpty(heliosScoresRecord)) {
return response;
}
Set dateSet = new HashSet();
List heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
Map groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
for (List scores : groupByAppIdHeliosScores.values()) {
HeliosScore heliosScore = scores.get(0);
HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
score.setNamespace(heliosScore.getNamespace());
score.setAppId(heliosScore.getAppId());
score.setScores(new ArrayList());
response.getValues().add(score);
List scoreIntList = HeliosHelper.splitScores(heliosScore);
// 以 requestTime 为准
Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
int index = 0;
// 如果 timeFrom 0) {
heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
index++;
}
while (indexDate.getTime().compareTo(request.getEndTime())