Architectural Plan for Constructing Fault Tolerable Workflow Engines Based on Grid Service


This paper presents the design and implementation of a fault-tolerant architecture for scientific workflow engines. The engines are assumed to be implemented as composite web services. Current workflow engine architectures make no provision for substituting faulty web services with correct ones at run time. The difficulty lies in rolling back the execution state of the workflow engine to its state before the faulty web service was invoked. To achieve this, the proposed architecture includes three components: one for fault diagnosis, one for recording the execution state of the workflow, and one for substituting faulty web services at run time. The applicability of the proposed architecture is evaluated in practice by applying it to the design of three different scientific workflow engines.
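
The interplay of the three components can be illustrated with a minimal sketch. It is not the architecture described in this paper: the class WorkflowEngine, the service functions, and the use of plain Python callables in place of web services are all hypothetical, chosen only to show state recording, fault diagnosis, rollback, and substitution in one loop.

```python
import copy
from typing import Callable, List

# Assumption: a "service" is modeled as a local callable that takes and
# returns the workflow state; a real engine would invoke remote web services.
Service = Callable[[dict], dict]


class WorkflowEngine:
    """Hypothetical engine: each step lists a primary service and substitutes."""

    def __init__(self, steps: List[List[Service]]):
        self.steps = steps
        self.state: dict = {}
        self.log: List[str] = []

    def run(self, initial: dict) -> dict:
        self.state = copy.deepcopy(initial)
        for index, candidates in enumerate(self.steps):
            # State recording: snapshot the state before the invocation.
            checkpoint = copy.deepcopy(self.state)
            for service in candidates:
                try:
                    self.state = service(self.state)
                    break  # the step succeeded, move on to the next step
                except Exception as exc:
                    # Fault diagnosis: here any exception counts as a fault.
                    self.log.append(f"step {index}: {service.__name__} failed: {exc}")
                    # Rollback: discard partial changes made by the faulty
                    # service, then fall through to the next substitute.
                    self.state = copy.deepcopy(checkpoint)
            else:
                raise RuntimeError(f"step {index}: no working service available")
        return self.state


def load(state: dict) -> dict:
    state["data"] = [1, 2, 3]
    return state


def faulty_sum(state: dict) -> dict:
    state["data"].append(99)  # partial side effect before the fault
    raise ConnectionError("service unavailable")


def backup_sum(state: dict) -> dict:
    state["total"] = sum(state["data"])
    return state


engine = WorkflowEngine([[load], [faulty_sum, backup_sum]])
result = engine.run({})
print(result)      # {'data': [1, 2, 3], 'total': 6}
print(engine.log)  # ['step 1: faulty_sum failed: service unavailable']
```

In this sketch the faulty service corrupts the state before failing, so the substitute only produces the correct total because the engine restores the snapshot taken before the invocation. Treating every exception as a fault is a simplification; a real diagnosis component would need to distinguish fault types.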