Exploring the Potential of Carbon-Aware Execution for Scientific Workflows

Abstract

Scientific workflows are widely used to automate scientific data analysis and often involve processing large quantities of data on compute clusters. As such, their execution tends to be long-running and resource intensive, leading to significant energy consumption and carbon emissions.
Meanwhile, a wealth of carbon-aware computing methods have been proposed, yet little work has focused specifically on scientific workflows, even though they present a substantial opportunity for carbon-aware computing because they are inherently delay tolerant, efficiently interruptible, and highly scalable.
In this study, we demonstrate the potential of carbon-aware workflow execution. To this end, we estimate the carbon footprint of two real-world Nextflow workflows executed on cluster infrastructure. We use a linear power model for energy consumption estimates and real-world average and marginal carbon intensity (CI) data for two regions. We evaluate the impact of carbon-aware temporal shifting, pausing and resuming, and resource scaling. Our findings highlight significant potential for reducing emissions of workflows and workflow tasks.
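
To make the estimation approach concrete, the following is a minimal sketch of how a linear power model and hourly carbon intensity data could be combined to compare start times for temporal shifting. It is an illustration under stated assumptions, not the authors' implementation: the function names, the idle and peak power parameters, the hourly granularity, and the constant-utilization simplification are all hypothetical.

```python
from typing import Sequence


def linear_power_watts(utilization: float, p_idle: float, p_max: float) -> float:
    """Linear power model: power rises linearly from idle to peak with utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return p_idle + (p_max - p_idle) * utilization


def emissions_g(power_w: float, ci_g_per_kwh: Sequence[float],
                start: int, duration_h: int) -> float:
    """Emissions in gCO2 for a run of duration_h hours starting at hour index `start`.

    Assumes constant power draw and one carbon intensity value per hour.
    """
    energy_kwh_per_hour = power_w / 1000.0
    window = ci_g_per_kwh[start:start + duration_h]
    return sum(energy_kwh_per_hour * ci for ci in window)


def best_start(power_w: float, ci_g_per_kwh: Sequence[float], duration_h: int) -> int:
    """Temporal shifting: pick the start hour that minimizes emissions."""
    candidates = range(len(ci_g_per_kwh) - duration_h + 1)
    return min(candidates, key=lambda s: emissions_g(power_w, ci_g_per_kwh, s, duration_h))


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    power = linear_power_watts(utilization=0.5, p_idle=100.0, p_max=300.0)
    ci = [400.0, 300.0, 100.0, 200.0, 500.0]  # gCO2/kWh per hour
    start = best_start(power, ci, duration_h=2)
    print(start, emissions_g(power, ci, start, 2))
```

Pausing and resuming could be sketched in the same way by letting the run occupy non-contiguous low-CI hours rather than a single contiguous window.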

Publication
2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Fabian Lehmann
Ph.D. candidate

My research interests include distributed systems, scientific workflows, and their scheduling.