hide
Free keywords:
-
Abstract:
We consider the problem of processing a given number of tasks on a given number of processors as quickly as possible when the processing times of the tasks are variable and not known in advance. The tasks are assigned to the processors in chunks consisting of several tasks at a time, and the difficulty lies in finding the optimal trade off between the processors' load balance, which is favoured by having small chunks, and the total scheduling overhead, which will be the lower the fewer chunks there are. Our studies are motivated by a practical problem from high-performance computing, namely parallel-loop scheduling, for which a large variety of heuristics have been proposed in the past, but hardly any rigorous analysis has been presented to date. Our work is based on a generic approach that covers the whole spectrum of processing time irregularity. This approach does not make any assumptions about task processing times, but instead works with estimated ranges for processing times, one for each chunk size, and a measure for the overall deviation of the actual processing times from these estimates. Our analysis provides a general upper bound applicable for every conceivable setting of these parameters, together with lower bounds showing that no algorithm can do significantly better than the ones we propose. Our general result implies optimal bounds for a whole variety of specific settings, including the modelling of task processing times as independent, identically distributed random variables, which underlies most of the previously existing heuristics. Our results confirm the practicability of certain well-established techniques for parallel-loop scheduling, while, on the other hand, revealing major flaws in other approaches.