A parallel system is the combination of an algorithm and the parallel architecture on which it is implemented. In general, the major characteristics that affect parallel system performance are clock speed, the size and number of registers, the number of concurrent paths to memory, the instruction issue rate, memory size, the ability to fetch/store vectors (or scalar data) efficiently, the handling of conditional blocks of code, the number of duplicate arithmetic functional units, and many more.
It is important to analyze the performance of a parallel program in order to determine the best algorithm, evaluate the hardware platform, and examine the benefits gained from parallelism. Because parallel systems involve a high level of complexity, their performance must be studied through various metrics such as execution time (or execution rate), speedup, efficiency, cost, redundancy of computation, and utilization.
Execution Time:
The serial runtime of a program is the time elapsed between the beginning and the end of its execution on a sequential computer. The parallel runtime is the time that elapses from the moment a parallel computation starts to the moment the last processing element finishes execution. We denote the serial runtime by TS and the parallel runtime by TP.
Execution time falls into two categories (a small timing sketch follows the list):
- Serial execution time: It is denoted by T(n, 1) where n is the problem size.
- Parallel execution time: It is defined as the time that elapses from the moment a parallel computation starts to the moment the last processor completes execution. It is denoted by T(n, p), where p is the number of processors.
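As a rough illustration of how T(n, 1) and T(n, p) can be measured in practice, the following sketch times a toy sum-of-squares workload serially and then in parallel. The workload, the chunking scheme, and the processor count are assumptions made only for this example.

```python
# Minimal sketch: measuring T(n, 1) and T(n, p) for an assumed sum-of-squares workload.
import time
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def measure(n, p):
    # Serial execution time T(n, 1)
    t0 = time.perf_counter()
    partial_sum((0, n))
    t_serial = time.perf_counter() - t0

    # Parallel execution time T(n, p): ends when the last worker finishes.
    # Wall time here includes process start-up, which is part of the parallel overhead.
    chunks = [(k * n // p, (k + 1) * n // p) for k in range(p)]
    t0 = time.perf_counter()
    with Pool(p) as pool:
        sum(pool.map(partial_sum, chunks))
    t_parallel = time.perf_counter() - t0
    return t_serial, t_parallel

if __name__ == "__main__":
    ts, tp = measure(n=5_000_000, p=4)
    print(f"T(n,1) = {ts:.3f} s, T(n,4) = {tp:.3f} s")
```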
Total Parallel Overhead:
The overheads incurred by a parallel program are encapsulated into a single expression referred to as the overhead function. We define overhead function or total overhead of a parallel system as the total time collectively spent by all the processing elements over and above that required by the fastest known sequential algorithm for solving the same problem on a single processing element.
In other words, total parallel overhead is the amount of time required to coordinate parallel tasks, as opposed to doing useful work. The main contributors to total parallel overhead are:
- Task start-up time
- Synchronizations
- Data communications
- Software overhead imposed by parallel languages, libraries, operating system, etc.
- Task termination time
We denote the total overhead function of a parallel system by the symbol To.
Let the total time spent in solving a problem, summed over all processing elements, be pTp units, and let the time spent performing useful work (the serial time) be Ts units.
Therefore, the overhead function (To) is given by
To(n, p) = pTp(n, p) – Ts(n, 1)
where p is the number of processors and n is the problem size.
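As a minimal numerical sketch of this formula, the helper below computes To from measured timings; the timing values used are assumed purely for illustration.

```python
# Minimal sketch: total parallel overhead To(n, p) = p * Tp - Ts.
def total_overhead(t_serial, t_parallel, p):
    return p * t_parallel - t_serial

# Assumed example timings: Ts = 10 s serially, Tp = 3 s on p = 4 processors.
print(total_overhead(10.0, 3.0, 4))   # 4 * 3 - 10 = 2.0 units of overhead
```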
Speedup:
Speedup is the measure that captures the performance gain achieved by parallelizing a given application over a sequential implementation, i.e., the relative benefit of solving a problem in parallel. The speedup of a system is defined as the ratio of the time taken to solve a problem on a single processing element to the time required to solve the same problem on a parallel computer with p identical processing elements. We denote speedup by the symbol S. The observed speedup of an algorithm which has been parallelized is defined as:
S(n, p) = T(n, 1) / T(n, p) = Ts / Tp
where Ts is the execution time taken to perform the computation on one processor and Tp is the execution time needed to perform the same computation using p processors.
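A one-line helper, using the same assumed timings as in the overhead sketch above, shows how the speedup ratio is computed:

```python
# Minimal sketch: speedup S(n, p) = Ts / Tp.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

# With the assumed timings Ts = 10 s and Tp = 3 s on 4 processors:
print(speedup(10.0, 3.0))   # ≈ 3.33
```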
Efficiency:
Efficiency is a measure of the fraction of time for which a processing element is usefully employed. Efficiency is defined as the ratio of speedup to the number of processing elements. We denote efficiency by the symbol E. Mathematically, it is given by
E = S / p
where S denotes the speedup obtained with p processing elements.
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. In practice, speedup is less than p and efficiency is between zero and one, depending on the effectiveness with which the processing elements are utilized.
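Continuing with the same assumed numbers, efficiency follows directly from the speedup:

```python
# Minimal sketch: efficiency E = S / p; ideally S = p and E = 1.
def efficiency(t_serial, t_parallel, p):
    return (t_serial / t_parallel) / p

# Assumed timings: Ts = 10 s, Tp = 3 s, p = 4 processors.
print(efficiency(10.0, 3.0, 4))   # ≈ 0.83, i.e. ~83% of processor time is useful work
```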
Scalability:
Scalability is the ability of a system or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. A parallel system is scalable if it can maintain a constant efficiency E as both the number of processors and the size of the problem increase. In many cases
To = f(Ts, p)
and To grows sub-linearly with Ts. It is then possible to increase p and Ts together while keeping E constant. Hence, scalability measures the ability to increase speedup as a function of p.
There are two types of scaling based on time to solution (a small experiment sketch follows the list):
- Strong scaling: The total problem size stays fixed as more processors are added.
- Weak scaling: The problem size per processor stays fixed as more processors are added.
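The sketch referenced above only generates the run configurations for the two kinds of experiment; the base problem size and processor counts are assumed values.

```python
# Minimal sketch: run configurations for strong and weak scaling experiments.
def scaling_runs(base_n, procs):
    strong = [(base_n, p) for p in procs]        # total problem size fixed
    weak   = [(base_n * p, p) for p in procs]    # problem size per processor fixed
    return strong, weak

strong, weak = scaling_runs(base_n=1_000_000, procs=[1, 2, 4, 8])
print("strong scaling runs:", strong)
print("weak scaling runs:  ", weak)
```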
Isoefficiency:
Isoefficiency analysis helps to determine the best algorithm/architecture combination for a particular problem without explicitly analyzing all possible combinations under all possible conditions.
Cost:
The cost of solving a problem on a single processing element is the execution time of the fastest known sequential algorithm, while the cost of solving it on a parallel system is the product pTp of the parallel runtime and the number of processing elements used. A parallel system is said to be cost-optimal if the cost of solving a problem on the parallel computer has the same asymptotic growth, as a function of the input size, as the fastest known sequential algorithm on a single processing element.
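As a commonly cited illustration (an assumption here, not drawn from the text above), consider adding n numbers on p = n processing elements: the parallel time grows as log n, so the cost grows as n log n while the serial time grows only as n, and the system is therefore not cost-optimal.

```python
# Minimal sketch: cost C(n, p) = p * Tp, compared against the serial time.
import math

def cost(p, t_parallel):
    return p * t_parallel

n = 1024
t_parallel = math.log2(n)      # idealized parallel "steps" for adding n numbers on n PEs
t_serial = n - 1               # idealized serial additions
print(cost(n, t_parallel), "vs", t_serial)   # 10240.0 vs 1023: the growth rates differ
```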
Utilization:
The utilization measure can be defined as the ratio between the actual number of operations O(n, p) and the cost of that parallel computation C(n, p):
U(n, p) = O(n, p) / C(n, p) = O(n, p) / (p T(n, p))
It should be noted that cost is sometimes referred to as work or the processor-time product.
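A small numerical sketch of the utilization measure, counting time in parallel steps so that U ≤ 1; the workload (summing 1024 numbers on 4 processing elements) is an assumed example.

```python
# Minimal sketch: utilization U(n, p) = O(n, p) / C(n, p), where C(n, p) = p * T(n, p).
def utilization(ops, p, t_parallel_steps):
    return ops / (p * t_parallel_steps)

# Assumed example: summing 1024 numbers on 4 PEs performs 1023 additions in
# total (255 per PE locally plus 3 to combine) and takes 255 + 2 = 257 steps.
print(round(utilization(1023, 4, 257), 3))   # ≈ 0.995
```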