An Evaluation of Major Fault Tolerance Techniques Used on High Performance Computing (HPC) Applications
2023
Journal: International Journal of Intelligent Systems and Applications in Engineering
Author:  
Abstract:

High performance computing (HPC) systems have a large number of constituent components used to facilitate data movement. Key characteristics of these systems include parallel processing, large memory, multiprocessor or multimode communication, and parallel file systems. Although they can shorten turnaround in scenarios that need maximum processing power, HPC systems face many challenges, chief among them fault tolerance. Today, most applications deal with faults by recording checkpoints frequently. Whenever a fault occurs, all the processes are terminated, and the task is loaded again from the last checkpoint. This paper evaluates the key fault tolerance techniques used on HPC applications, both reactive and proactive. The reactive protocols discussed include checkpointing/restarting, replication, retry, and SGuard, while the proactive techniques include preemptive migration, software rejuvenation, and self-healing. As the discussion of each approach's drawbacks shows, faults are best managed by a hybrid system that applies proactive and reactive measures simultaneously.
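The checkpoint/restart behaviour described in the abstract can be illustrated with a minimal single-process sketch. This is not the paper's implementation: the function names `save_checkpoint` and `run_with_checkpoints`, the JSON checkpoint file, and the `fail_at` parameter (which only simulates a fault) are all assumptions made for illustration. Real HPC checkpointing coordinates many processes and usually writes to a parallel file system.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write the state to a temporary file, then swap it into place.

    os.replace makes the swap atomic, so a crash during the write
    cannot leave a half-written checkpoint behind.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_with_checkpoints(total_steps, path, checkpoint_every=10, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    fail_at is a hypothetical knob that simulates a fault at that step.
    """
    # Restart: resume from the last checkpoint if one exists.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "total": 0}

    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            # The in-memory state is lost; only the checkpoint survives.
            raise RuntimeError("simulated fault")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state["total"]


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    try:
        run_with_checkpoints(100, ckpt, fail_at=25)
    except RuntimeError:
        print("fault hit, restarting from last checkpoint")
    print(run_with_checkpoints(100, ckpt))
```

In this sketch the work done between the last checkpoint (step 20) and the fault (step 25) is repeated after the restart, which is the overhead that makes the choice of checkpoint interval a trade-off.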

Keywords:

Citation Owners
Information: There are no citations to this publication.
International Journal of Intelligent Systems and Applications in Engineering

Field: Engineering

Journal Type: International

Metrics
Articles: 1.632
Citations: 489
2023 Impact: 0.054