Studies in Informatics and Control
Vol. 22, No. 1, 2013

Fault Tolerance for Conjugate Gradient Solver Based on FT-MPI

Weizhe ZHANG, Hui HE

Abstract

Grid computing is characterized by high speed, large scale, large task volumes, and long run cycles. Given these characteristics, system errors can waste large amounts of computing power and time, so the fault tolerance of computing resource nodes within the grid computing architecture has become a key issue in the field. This paper describes the current fault-tolerant message passing interface library (FT-MPI), designs a grid computing-based task migration and recovery model, and identifies the functional architecture of each module of the model. It then analyzes and compares the storage mechanism for the model's fault-tolerant checkpoints and its information-encoding algorithm. Finally, the implementation of a checksum-based fault-tolerant conjugate gradient solver shows the validity of the theory.

Keywords

fault tolerance; CG solver; FT-MPI; computational grid.