1-introduction.tex

\section{Introduction}

%\begin{itemize}
%	\item Motivation: Why load balancing
%	\item Context: Scenarios in which global rescheduling applies
%	\item Problem: Scaling iterative applications: algorithm limitations
%	\item Justification: Scalable load balancing, distributed rescheduling
%	\item Related work: Hierarchical (limited scaling), Distributed (excessive comm), Diffusive (iterative), Refinement Based (centralized)
%	\item Proposed solution: Batch task migration to mitigate communication costs
%	\item Paper contributions	
%	\item Results brief
%	\item Paper structure
%\end{itemize}

%Growth of systems and applications.
%Need to achieve high scalability to use platforms.
%Efficient scheduling and use of resources.
%
%Dynamic iterative applications.
%Unable to use work stealing.
%Global rescheduling as a solution.
%
%Strong scaling of applications.
%Cost of centralized load balancing.
%Using platform parallelism to achive performance.
%Communication costs as a relevant overhead.
%
%Present hierarchical LB.
%Pros: weak scaling, topology.
%Cons: high comms, bottlenecks.
%Present distributed LB.
%Pros: parallel, refinement.
%Cons: high comms, network contention.
%
%\textbf{Cut out topo aware and diffusive, not related enough.}
%
%Brief intro to BTM.
%Reduced communication.
%Highly parallel.
%Fast convergence.
%
%Present contributions.
%
%Brief results.
%
%Present paper structure.
%
%\hline

Parallel machines are at their best when the workload is evenly distributed among compute nodes, and idle time is merely a myth.
Unfortunately, strong scaling applications for these platforms has been a challenge as long as they have existed.
In this context, uneven workload distribution and high communication overheads are the main villains when developing parallel applications~\cite{Deveci2015}.
Concerns towards these problems increase as systems grow in size and performance, consuming more resources, specially power, to solve some of the most complex problems in scientific computing~\cite{exapower2015,padoin2017energy}.

Applications such as simulations of molecular dynamics (MD) and hydrodynamics suffer from load imbalance due to their intrinsic dynamic and iterative nature~\cite{namd,IPDPS13:LULESH}.
Although rescheduling algorithms have been able to greatly improve the performance of these applications~\cite{namd}, new approaches are needed to guarantee their performance as parallel systems grow.
Since mapping tasks to processing elements (PEs) is an NP-Hard problem~\cite{npcomplete}, the increase in application data and platform size makes centralized rescheduling approaches inefficient.
This creates a need for scalable and decentralized rescheduling approaches, avoiding both excess of data to process and network contention~\cite{trahay2009scalable}.

The two main paths to achieve scalability in global rescheduling for iterative applications are \textit{hierarchical} and \textit{distributed} approaches.
Hierarchical load balancing explores parallelism using different approaches for fine-grain and coarse-grain steps~\cite{hybrid}.
Although scalable, hierarchical schedulers are often tied to the same limitations of centralized approaches, as data is still aggregated in parent nodes.
On the other hand, distributed load balancing approaches seek to achieve scalability through multiple smaller, decentralized scheduling decisions.
Albeit more scalable than hierarchical approaches, decentralized schedulers have limited system information and often incur in higher amounts of communication.

Despite their notable effectiveness, few are the distributed strategies in the domain of global rescheduling~\cite{diffus,grapevine}.
In this paper, we present the concept of \textit{Batch Task Migration} and a novel distributed global rescheduling algorithm that applies this technique, called \packdrop.
Our approach is based on the idea of grouping tasks in batches prior to migration decisions, decreasing the communication overhead in the scheduling decision time and preserving the locality of migrated tasks~\cite{Paudel2013wslocal}.

% Main contributions
In this paper, we present the following contributions: 
\begin{enumerate}
	\item A highly scalable \textit{Batch Task Migration} approach for distributed rescheduling algorithms;
	\item A novel distributed rescheduling algorithm, \packdrop, using our \textit{Batch Task Migration} approach;
	\item An implementation of our algorithm in a well-known parallel programming model as well as a performance evaluation of this implementation.
\end{enumerate}

The remainder of this paper is divided as follows.
Section~\ref{sec:rw} discusses recent work in dynamic rescheduling of scientific applications. 
Section~\ref{sec:algo} presents our novel approach and the developed algorithms. 
Section~\ref{sec:analysis} presents a complexity analysis of our distributed algorithm (\packdrop). 
Section~\ref{sec:impl} displays implementation details, execution environments and benchmarks used in this paper. 
Section~\ref{sec:eval} presents our performance evaluation methodology and discusses the obtained results. 
Finally, Section~\ref{sec:conclusion} presents the conclusion of this work and our plans for future research.