A framework for elastic execution of existing MPI programs

Aarthi Raveendran; Tekin Bicer; Gagan Agrawal

doi:10.1109/IPDPS.2011.240

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

44 Scopus citations

Abstract

There is a clear trend towards using cloud resources in the scientific or the HPC community, with a key attraction of cloud being the elasticity it offers. In executing HPC applications on a cloud environment, it will clearly be desirable to exploit elasticity of cloud environments, and increase or decrease the number of instances an application is executed on during the execution of the application, to meet time and/or cost constraints. Unfortunately, HPC applications have almost always been designed to use a fixed number of resources. This paper describes our initial work towards the goal of making existing MPI applications elastic for a cloud framework. Considering the limitations of the MPI implementations currently available, we support adaptation by terminating one execution and restarting a new program on a different number of instances. The components of our envisioned system include a decision layer which considers time and cost constraints, a framework for modifying MPI programs, and a cloud-based runtime support that can enable redistributing of saved data, and support automated resource allocation and application restart on a different number of nodes. Using two MPI applications, we demonstrate the feasibility of our approach, and show that outputting, redistributing, and reading back data can be a reasonable approach for making existing MPI applications elastic.

Original language	English (US)
Title of host publication	2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011
Pages	940-947
Number of pages	8
DOIs	https://doi.org/10.1109/IPDPS.2011.240
State	Published - 2011
Externally published	Yes
Event	25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011 - Anchorage, AK, United States Duration: May 16 2011 → May 20 2011

Publication series

Name	IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum

Conference

Conference	25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011
Country/Territory	United States
City	Anchorage, AK
Period	5/16/11 → 5/20/11

ASJC Scopus subject areas

Computational Theory and Mathematics
Software
Theoretical Computer Science

Cite this

Raveendran, A., Bicer, T., & Agrawal, G. (2011). A framework for elastic execution of existing MPI programs. In 2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011 (pp. 940-947). Article 6008941 (IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum). https://doi.org/10.1109/IPDPS.2011.240

A framework for elastic execution of existing MPI programs. / Raveendran, Aarthi; Bicer, Tekin; Agrawal, Gagan.
2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011. 2011. p. 940-947 6008941 (IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum).

Raveendran, A, Bicer, T & Agrawal, G 2011, A framework for elastic execution of existing MPI programs. in 2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011., 6008941, IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, pp. 940-947, 25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011, Anchorage, AK, United States, 5/16/11. https://doi.org/10.1109/IPDPS.2011.240

@inproceedings{931a7998e7fb4ef6926326e9bda4234b,

title = "A framework for elastic execution of existing MPI programs",

abstract = "There is a clear trend towards using cloud resources in the scientific or the HPC community, with a key attraction of cloud being the elasticity it offers. In executing HPC applications on a cloud environment, it will clearly be desirable to exploit elasticity of cloud environments, and increase or decrease the number of instances an application is executed on during the execution of the application, to meet time and/or cost constraints. Unfortunately, HPC applications have almost always been designed to use a fixed number of resources. This paper describes our initial work towards the goal of making existing MPI applications elastic for a cloud framework. Considering the limitations of the MPI implementations currently available, we support adaptation by terminating one execution and restarting a new program on a different number of instances. The components of our envisioned system include a decision layer which considers time and cost constraints, a framework for modifying MPI programs, and a cloud-based runtime support that can enable redistributing of saved data, and support automated resource allocation and application restart on a different number of nodes. Using two MPI applications, we demonstrate the feasibility of our approach, and show that outputting, redistributing, and reading back data can be a reasonable approach for making existing MPI applications elastic.",

author = "Aarthi Raveendran and Tekin Bicer and Gagan Agrawal",

year = "2011",

doi = "10.1109/IPDPS.2011.240",

language = "English (US)",

isbn = "9780769543857",

series = "IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum",

pages = "940--947",

booktitle = "2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011",

note = "25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011 ; Conference date: 16-05-2011 Through 20-05-2011",

}

TY - GEN

T1 - A framework for elastic execution of existing MPI programs

AU - Raveendran, Aarthi

AU - Bicer, Tekin

AU - Agrawal, Gagan

PY - 2011

Y1 - 2011

AB - There is a clear trend towards using cloud resources in the scientific or the HPC community, with a key attraction of cloud being the elasticity it offers. In executing HPC applications on a cloud environment, it will clearly be desirable to exploit elasticity of cloud environments, and increase or decrease the number of instances an application is executed on during the execution of the application, to meet time and/or cost constraints. Unfortunately, HPC applications have almost always been designed to use a fixed number of resources. This paper describes our initial work towards the goal of making existing MPI applications elastic for a cloud framework. Considering the limitations of the MPI implementations currently available, we support adaptation by terminating one execution and restarting a new program on a different number of instances. The components of our envisioned system include a decision layer which considers time and cost constraints, a framework for modifying MPI programs, and a cloud-based runtime support that can enable redistributing of saved data, and support automated resource allocation and application restart on a different number of nodes. Using two MPI applications, we demonstrate the feasibility of our approach, and show that outputting, redistributing, and reading back data can be a reasonable approach for making existing MPI applications elastic.

UR - http://www.scopus.com/inward/record.url?scp=83455263660&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=83455263660&partnerID=8YFLogxK

U2 - 10.1109/IPDPS.2011.240

DO - 10.1109/IPDPS.2011.240

M3 - Conference contribution

AN - SCOPUS:83455263660

SN - 9780769543857

T3 - IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum

SP - 940

EP - 947

BT - 2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011

T2 - 25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011

Y2 - 16 May 2011 through 20 May 2011

ER -
