dMPI: Facilitating Debugging of MPI Programs via Deterministic Message Passing

Bibliographic Details
Title: dMPI: Facilitating Debugging of MPI Programs via Deterministic Message Passing
Authors: Zhou, Xu; Lu, Kai; Lu, Xicheng; Wang, Xiaoping; Fan, Baohua
Contributors: National University of Defense Technology China, James J. Park, Albert Zomaya, Sang-Soo Yeo, Sartaj Sahni, TC 10, WG 10.3
Source: Lecture Notes in Computer Science ; 9th International Conference on Network and Parallel Computing (NPC) ; https://hal.inria.fr/hal-01551348 ; 9th International Conference on Network and Parallel Computing (NPC), Sep 2012, Gwangju, South Korea. pp.172-179, ⟨10.1007/978-3-642-35606-3_20⟩
Publisher Information: HAL CCSD
Springer
Publication Year: 2012
Subject Terms: [INFO]Computer Science [cs]
Subject Geographic: Gwangju, South Korea
Description: Part 4: Parallel, Distributed, and Virtualization Techniques ; International audience ; This paper presents dMPI, a novel deterministic MPI implementation that facilitates the debugging of MPI programs. Unlike existing approaches, dMPI ensures inherent determinism without any external support (e.g., logs), achieving both convenience and performance. The basic idea of dMPI is to use deterministic logical time to resolve message races and control asynchronous transmissions, thereby eliminating the nondeterministic behaviors of the existing message-passing mechanism. To avoid the deadlocks that dMPI itself can introduce, we also integrate dMPI with a lightweight deadlock checker that dynamically detects and resolves them. We have implemented dMPI and evaluated it using the NPB benchmarks. The results show that dMPI guarantees determinism while incurring modest overhead (8% on average).
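Illustrative note: the abstract does not spell out how logical time resolves a message race, so the following is only a toy sketch of the general idea, not dMPI's actual implementation. It assumes each message carries the sender's deterministic logical time, and that a wildcard receive picks the pending message with the smallest (logical time, sender rank) pair rather than whichever arrived first. The names `Message`, `match_wildcard_receive`, `drain`, and `simulate` are hypothetical. The sketch also assumes all candidate messages are already visible to the receiver; a real implementation would additionally have to decide when it is safe to match.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: int        # rank of the sending process
    logical_time: int  # sender's deterministic logical clock at send time (assumed)
    payload: str


def match_wildcard_receive(pending):
    """Choose the message with the smallest (logical_time, sender) key.

    The choice depends only on deterministic metadata, never on the
    physical order in which the messages happened to arrive.
    """
    return min(pending, key=lambda m: (m.logical_time, m.sender))


def drain(pending):
    """Deliver every pending message in deterministic order."""
    queue = list(pending)
    delivered = []
    while queue:
        chosen = match_wildcard_receive(queue)
        queue.remove(chosen)
        delivered.append(chosen)
    return delivered


def simulate(seed):
    """Shuffle arrival order with the given seed, then deliver."""
    msgs = [Message(2, 5, "c"), Message(0, 7, "a"), Message(1, 5, "b")]
    rng = random.Random(seed)
    rng.shuffle(msgs)  # physical arrival order varies from run to run
    return [m.payload for m in drain(msgs)]
```

Calling `simulate` with different seeds models different network timings; under the stated assumptions the delivery order stays the same, because ties on logical time are broken by sender rank.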
Document Type: conference object
Language: English
Relation: hal-01551348; https://hal.inria.fr/hal-01551348; https://hal.inria.fr/hal-01551348/document; https://hal.inria.fr/hal-01551348/file/978-3-642-35606-3_20_Chapter.pdf
DOI: 10.1007/978-3-642-35606-3_20
Availability: https://hal.inria.fr/hal-01551348
https://hal.inria.fr/hal-01551348/document
https://hal.inria.fr/hal-01551348/file/978-3-642-35606-3_20_Chapter.pdf
https://doi.org/10.1007/978-3-642-35606-3_20
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.B26368D1
Database: BASE