Modern workflow systems typically rely on static resource allocation that does not change during execution and is often based on rough user estimates. As a result, resources may be underprovisioned, which slows down execution, or overprovisioned, which wastes valuable resources.
Leveraging detailed knowledge of a task's behavior, we can optimize runtime, resource consumption, and execution reliability. For this purpose, an innovative approach to modeling tasks, workflows, and their execution is integrated into the scheduling. In contrast to typical workflow execution, this modeling operates at sub-task granularity, making it possible to proactively allocate parts of the resources just in time.
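To illustrate the difference between static and sub-task-granular allocation, the following sketch contrasts the two. All names (`Segment`, the example phases, and their resource figures) are hypothetical and do not belong to any real workflow engine.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One phase of a task, with its own resource profile (illustrative)."""
    name: str
    cpus: int
    memory_mb: int

# A data-intensive task often has phases with very different needs:
task_segments = [
    Segment("download",   cpus=1, memory_mb=512),
    Segment("decompress", cpus=2, memory_mb=2048),
    Segment("analyze",    cpus=8, memory_mb=16384),
]

def static_peak_allocation(segments):
    # Classic static allocation: reserve the peak of every dimension
    # for the entire task duration, even though most phases need less.
    return (max(s.cpus for s in segments),
            max(s.memory_mb for s in segments))

def just_in_time_plan(segments):
    # Sub-task-granular plan: request each segment's share only for
    # the phase that actually needs it.
    return [(s.name, s.cpus, s.memory_mb) for s in segments]
```

Here `static_peak_allocation` would reserve 8 CPUs and 16 GB for the whole task, while the just-in-time plan holds that peak only during the `analyze` phase.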
Since the model and its predictions are inevitably subject to error and noise, resource allocation and scheduling decisions are revisited continuously during execution, using updated execution metrics from the task's monitoring data. After a task finishes, the monitoring data from its executions is used to refine the model and reduce the error of future predictions. This, in turn, enables more efficient execution of the task in the current workflow and potentially in other workflows that run this task on different infrastructures.
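The predict-monitor-refine loop described above can be sketched as follows. This is a minimal illustration under assumed details: the class name, the safety margin, and the exponential-smoothing update are placeholders, not the chapter's actual model.

```python
class ResourceModel:
    """Per-task resource model, refined from observed executions (sketch)."""

    def __init__(self, initial_estimate_mb: float):
        # Start from the rough user-provided estimate.
        self.estimate_mb = initial_estimate_mb

    def predict(self) -> float:
        # Add a safety margin so prediction errors rarely underprovision.
        return self.estimate_mb * 1.1

    def refine(self, observed_peaks_mb: list) -> None:
        # Blend the current estimate with observed peak usage; exponential
        # smoothing keeps the model stable under noisy monitoring data.
        for peak in observed_peaks_mb:
            self.estimate_mb = 0.7 * self.estimate_mb + 0.3 * peak


model = ResourceModel(initial_estimate_mb=4096)   # rough user estimate
first_allocation = model.predict()

# Monitoring of completed executions reported much lower actual peaks:
model.refine(observed_peaks_mb=[1800, 2100, 1950])
second_allocation = model.predict()
assert second_allocation < first_allocation  # overprovisioning shrinks
```

The same refined model can then serve the next scheduling decision in this workflow, or be shared with other workflows that execute the same task.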
This chapter presents a comprehensive approach to workflow optimization, with a focus on data-intensive workflows. It integrates the authors' past work on specific aspects of the methodology with novel ideas to be explored in future research.