Abstract:
Workflow-based parameter-sweep applications are an important class of parallel jobs on clusters and grids today. Conventional batch schedulers and parameter-study tools are not effective for this type of application. In particular, their scheduling policies are usually designed to minimize the makespan of the whole parameter study. However, many parameter-sweep applications also have the primary objective of obtaining the best result, or a few top-ranked results, from a large parameter space. This research describes a new heuristic for scheduling parameter-sweep workflows to minimize the turnaround time of the workflows that give the best results. The algorithm dynamically adjusts priorities according to intermediate data obtained at some stage in the workflow. The technique is applied to a high-throughput drug screening application. The experimental results show that our technique can significantly improve the correlation between the ranking of the final results and the order of completion of the workflows.
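
As a rough illustration of the idea of adjusting priority from intermediate data, the following minimal Python sketch simulates a single worker that processes multi-stage workflows from a priority queue. It is not the heuristic evaluated in this work. The function name `schedule`, the parameters `n_stages` and `score_stage`, and the `intermediate_score` mapping are all hypothetical, and the sketch assumes that a higher intermediate score predicts a better final result and that unscored workflows run first so that every workflow reaches the scoring stage.

```python
import heapq
from typing import Dict, List


def schedule(
    n_stages: int,
    intermediate_score: Dict[str, float],
    score_stage: int = 0,
) -> List[str]:
    """Simulate one worker and return workflow ids in order of completion.

    Assumption of this sketch: a workflow that has not yet produced its
    intermediate score gets the highest priority, so all workflows reach
    stage `score_stage` before any of them runs its later stages. After
    that stage, a workflow's remaining stages are prioritised by its
    intermediate score (higher score runs first).
    """
    # Heap entries: (negated priority, submission index, workflow id, next stage).
    # heapq is a min-heap, so priorities are negated; -inf sorts first.
    heap = [(float("-inf"), i, wf, 0) for i, wf in enumerate(intermediate_score)]
    heapq.heapify(heap)

    completed: List[str] = []
    while heap:
        neg_priority, index, wf, stage = heapq.heappop(heap)
        # Executing the stage itself is omitted; only the ordering is modelled.
        if stage == score_stage:
            # Intermediate data is now available: adjust the priority.
            neg_priority = -intermediate_score[wf]
        if stage + 1 == n_stages:
            completed.append(wf)
        else:
            heapq.heappush(heap, (neg_priority, index, wf, stage + 1))
    return completed


if __name__ == "__main__":
    # Hypothetical scores produced by the first stage of three workflows.
    print(schedule(n_stages=3, intermediate_score={"a": 0.2, "b": 0.9, "c": 0.5}))
```

In this toy setting the workflows finish in descending order of their intermediate scores, whereas a first-come, first-served policy would finish them in submission order regardless of the quality of their results.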