Duplication based static scheduling of precedence tasks on heterogeneous multiprocessors

Singh, J.



URI: http://localhost:8080/xmlui/handle/123456789/751

Date: 2016-12-14

Abstract:

Task scheduling of applications represented by a directed acyclic graph (DAG) on multiprocessors is an NP-hard problem (Garey and Johnson [1979]). The growing need to exploit the underlying parallelism for fast execution of applications, along with multiple other objectives, has attracted the attention of researchers. This work focuses on static scheduling of applications represented as DAGs on heterogeneous multiprocessor platforms, with the aims of reducing the schedule length (or makespan), reducing energy consumption, and meeting hard/soft deadlines. The tasks are heterogeneous in that each task has a potentially different execution requirement on each of the system's processors. The processors are modeled as interconnected in either a fully connected or a restricted topology. In the context of scheduling, duplicating (or replicating) tasks on multiple processors has generally been used for reliability and for reducing the impact of communication on the schedule length. By duplicating the heavily communicating jobs on a single processor, the interprocessor communication costs can be minimized, which can reduce the makespan. A comprehensive analysis of duplication has been performed, covering its use in reducing communication cost and energy and its overall impact on the scheduling objectives. First, an attempt has been made to improve the convergence of a duplication based Mixed Integer Linear Programming (MILP) formulation for reducing the schedule length. Other known optimal MILPs duplicate a job on all the available processing elements, which increases their complexity. In this solution, a new REStricted Duplication (RESDMILP) approach to modeling duplication in an MILP has been proposed. The complexity of this model increases with the amount of duplication.
Experiments have revealed that RESDMILP achieves better runtimes when the problem instance is solved optimally, and provides better lower bounds and percentage gaps when it is run for a fixed amount of time. The percentage gap is defined as (UB − LB)/UB, where UB and LB are the upper and lower bounds achieved by the MILPs, respectively. Next, we study the effect of duplication on minimizing the makespan, the total energy for processing tasks and messages on processors and network resources respectively, and the tardiness of tasks with respect to their deadlines. Energy efficiency and enhanced performance (i.e., shorter makespan) are two important goals of scheduling on multiprocessors. A Contention-aware, Energy Efficient, Duplication based Mixed Integer Programming (CEEDMIP) formulation has been proposed for scheduling task graphs on heterogeneous multiprocessors interconnected in a distributed system or a network-on-chip architecture. Optimizing the use of duplication with an MILP provides both energy efficiency and performance by reducing the communication energy consumption and the communication latency. The contention awareness gives a more accurate estimate of the energy consumption. Also, a corner case has been identified that allows a copy of a parent task to be scheduled after a copy of its child task, which may lead to more efficient schedules. It has been observed that the proposed MILP with a clustering based heuristic (FastCEED) provides scalability and gives a 10 to 30 percent improvement in energy, with improved makespan and accuracy, when compared with other duplication based energy aware algorithms. To further improve power and performance, duplication has been modeled alongside dynamic voltage/frequency scaling (DVFS). In power aware scheduling with DVFS, tasks are run at low voltages, which decreases their computation power. However, this also increases their execution costs and hence may increase the schedule length.
Furthermore, applying DVFS on processors does not affect the communication delay or the communication power consumption. Duplicating a task on multiple processors reduces the communication delay among processors, which further reduces the schedule length and improves performance. Additionally, duplication reduces the communication energy among processors, although it increases the overall computation energy. This solution integrates DVFS and duplication to schedule task graphs on heterogeneous multiprocessors. The use of both techniques is optimized with an MILP formulation to achieve better power and performance. To enhance the MILP convergence, each task is run using a combination of the maximum and minimum voltages on a processor, instead of iterating through all the voltage levels. The results demonstrate a minimum of 50% improvement in the processor power and a 20% to 50% improvement in the total power (processor and communication), with a schedule length comparable to that of the other algorithms. Finally, we consider the classical real-time scheduling model that consists of real-time periodic task graphs with hard end-to-end deadlines. The system requirement is that all deadlines must be met. As discussed above, duplicating the predecessor of a task on the processor to which the task is assigned can minimize the communication cost, which helps in reducing the schedule length. However, this reduction comes at the cost of the extra computing power required to run the duplicated tasks. This work addresses this trade-off between duplication and computing power. Some “controlled” duplication based algorithms have been proposed for scheduling real-time periodic tasks with end-to-end deadlines on heterogeneous multiprocessors. It has been observed that the decision on whether to duplicate tasks is driven by the task deadlines. When a deadline can be met without duplication, more schedule holes are created. These holes can be used by other tasks.
A real-time MILP and a local search based extension have been proposed. Simulations show that the proposed algorithms utilize the holes efficiently and improve the success ratio by 15% to 50% versus comparable algorithms.
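
The trade-off described in the abstract, where duplicating a parent task removes a communication delay at the price of extra computation, can be illustrated with a toy example. The sketch below is not taken from the thesis: the function names (makespan, busy_time), the three-task graph, and all cost values are hypothetical, and the evaluator is a simple list-schedule simulator rather than any of the proposed MILP formulations. It also assumes that every copy of a parent is placed before its children, so it does not handle the corner case of a parent copy scheduled after a child copy.

```python
from collections import defaultdict

def makespan(queues, exec_cost, edges):
    """Evaluate a fixed schedule. queues maps processor -> ordered task list
    (a task may appear on several processors, i.e. be duplicated);
    exec_cost maps (task, proc) -> time; edges maps (parent, child) -> comm time.
    Assumption: a child waits until all copies of each parent have finished."""
    parents = defaultdict(list)
    for (par, child), comm in edges.items():
        parents[child].append((par, comm))
    copies = defaultdict(int)
    for tasks in queues.values():
        for t in tasks:
            copies[t] += 1
    finish = defaultdict(dict)            # task -> {proc: finish time}
    free_at = {p: 0 for p in queues}      # time at which each processor is idle
    nxt = {p: 0 for p in queues}          # index of the next task per processor
    progressed = True
    while progressed:
        progressed = False
        for proc, tasks in queues.items():
            if nxt[proc] == len(tasks):
                continue
            task = tasks[nxt[proc]]
            if any(len(finish[par]) < copies[par] for par, _ in parents[task]):
                continue                  # some parent copy not finished yet
            start = free_at[proc]
            for par, comm in parents[task]:
                # Data arrives from the best copy; a local copy costs no comm.
                arrival = min(f + (0 if q == proc else comm)
                              for q, f in finish[par].items())
                start = max(start, arrival)
            end = start + exec_cost[(task, proc)]
            finish[task][proc] = end
            free_at[proc] = end
            nxt[proc] += 1
            progressed = True
    if any(nxt[p] != len(queues[p]) for p in queues):
        raise ValueError("schedule cannot be completed (missing or misordered parent)")
    return max(free_at.values())

def busy_time(queues, exec_cost):
    """Total computation time over all task copies (a crude proxy for compute energy)."""
    return sum(exec_cost[(t, p)] for p, tasks in queues.items() for t in tasks)

# Hypothetical heterogeneous costs: task A feeds B and C, two processors.
EXEC = {("A", "P0"): 2, ("A", "P1"): 3, ("B", "P0"): 3,
        ("B", "P1"): 4, ("C", "P0"): 4, ("C", "P1"): 3}
COMM = {("A", "B"): 5, ("A", "C"): 5}

no_dup = {"P0": ["A", "B"], "P1": ["C"]}      # C waits for A's message
dup = {"P0": ["A", "B"], "P1": ["A", "C"]}    # A is duplicated on P1

print(makespan(no_dup, EXEC, COMM), busy_time(no_dup, EXEC))  # 10 8
print(makespan(dup, EXEC, COMM), busy_time(dup, EXEC))        # 6 11
```

With these made-up numbers, duplicating A shortens the schedule from 10 to 6 time units while raising the total computation from 8 to 11, which is the kind of makespan-versus-computation balance the formulations above are meant to optimize.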
