TY - JOUR
T1 - Mitigating tail response time of n-tier applications
T2 - The impact of asynchronous invocations
AU - Wang, Qingyang
AU - Zhang, Shungeng
AU - Kanemasa, Yasuhiko
AU - Pu, Calton
N1 - Funding Information:
This research has been partially funded by National Science Foundation by CISE's CNS (Grants No. 1566443 and No. 1421561), SAVI/RCN (Grants No. 1402266 and No. 1550379), CRISP (Grant No. 1541074), SaTC (Grant No. 1564097) programs, an REU supplement (Grant No. 1545173), Louisiana Board of Regents under Grant No. LEQSF (2015-18)-RD-A-11, and gifts, grants, or contracts from Fujitsu, HP, Intel, and Georgia Tech Foundation through the John P. Imlay, Jr. Chair endowment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other funding agencies and companies mentioned above.
Authors’ addresses: Q. Wang and S. Zhang, School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, 3325 Patrick F. Taylor Hall, LA 70803, USA; emails: {qwang26, szhan45}@lsu.edu; Y. Kanemasa, Fujitsu Laboratories LTD., 1-1, Kamikodanaka 4-chome, Nakahara-ku, Kawasaki 211-8588, Japan; email: kanemasa@jp.fujitsu.com; C. Pu, College of Computing, Georgia Institute of Technology, 266 Ferst Dr, Atlanta, GA 30332-0765, USA; email: calton@cc.gatech.edu.
Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/10
Y1 - 2019/10
N2 - Consistently low response time is essential for e-commerce due to intense competitive pressure. However, practitioners of web applications have often encountered the long-tail response time problem in cloud data centers as the system utilization reaches moderate levels (e.g., 50%). Our fine-grained measurements of an open-source n-tier benchmark application (RUBBoS) show that such long response times are often caused by Cross-tier Queue Overflow (CTQO). Our experiments reveal that CTQO is primarily caused by the synchronous nature of RPC-style call/response inter-tier communications, which create strong inter-tier dependencies due to the request processing chain of classic n-tier applications composed of synchronous RPC/thread-based servers. We gradually remove the dependencies in n-tier applications by replacing the classic synchronous servers (e.g., Apache, Tomcat, and MySQL) with their corresponding event-driven asynchronous versions (e.g., Nginx, XTomcat, and XMySQL) one by one. Our measurements with two application scenarios (virtual machine co-location and background monitoring interference) show that replacing only a subset of the servers with asynchronous versions shifts the CTQO without significantly improving long-tail response time. Only when all the servers become asynchronous is the CTQO resolved. In synchronous n-tier applications, long-tail response times resulting from CTQO arise at utilization as low as 43%. In contrast, the completely asynchronous n-tier system can disrupt CTQO and remove the long-tail latency at utilization as high as 83%.
KW - Asynchronous
KW - Cloud computing
KW - n-tier systems
KW - Performance
KW - Scalability
UR - http://www.scopus.com/inward/record.url?scp=85074827588&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074827588&partnerID=8YFLogxK
U2 - 10.1145/3340462
DO - 10.1145/3340462
M3 - Article
AN - SCOPUS:85074827588
VL - 19
JO - ACM Transactions on Internet Technology
JF - ACM Transactions on Internet Technology
SN - 1533-5399
IS - 3
M1 - 36
ER -