A Study of Long-Tail Latency in n-Tier Systems: RPC vs. Asynchronous Invocations

Qingyang Wang, Chien An Lai, Yasuhiko Kanemasa, Shungeng Zhang, Calton Pu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

23 Scopus citations

Abstract

Long-tail latency of web-facing applications continues to be a serious problem. Most of the previously published research addresses two classes of long latency problems: uneven workloads such as web search, and resource saturation in single nodes. We describe an experimental study of a third class of long tail latency problemsthat are specific to distributed systems: Cross-Tier Queue Overflow (CTQO) due to a combination of millibottlenecks (with sub-second duration) and tightly-coupled servers in n-tier systems (e.g., Apache, Tomcat, and MySQL) using RPC-style request-response communications. Our experiments show that the appearance of millibottlenecks (e.g., created by short workload bursts) in one server often causes another server (which has no saturated resources) in the synchronous invocation chain to fill up its queues (CTQO) and drop packets, creating very long response time queries. CTQO can be reduced or avoided by replacing the server dropping packets with an asynchronous server. In synchronous n-tier system experiments, long tail latency due to CTQO can be reproduced consistently atutilization as low as 43%. In contrast, when all n-tier servers are replaced by asynchronous versions, CTQO and consequent dropped packets remain absent at utilization levels as high as 83%, despite the same millibottlenecks.

Original languageEnglish (US)
Title of host publicationProceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017
EditorsKisung Lee, Ling Liu
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages207-217
Number of pages11
ISBN (Electronic)9781538617915
DOIs
StatePublished - Jul 13 2017
Externally publishedYes
Event37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 - Atlanta, United States
Duration: Jun 5 2017Jun 8 2017

Publication series

NameProceedings - International Conference on Distributed Computing Systems

Conference

Conference37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017
Country/TerritoryUnited States
CityAtlanta
Period6/5/176/8/17

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'A Study of Long-Tail Latency in n-Tier Systems: RPC vs. Asynchronous Invocations'. Together they form a unique fingerprint.

Cite this