A parallel system is the combination of an algorithm and the parallel architecture on which it is implemented. In general, the major characteristics that affect parallel system performance are clock speed, the size and number of registers, the number of concurrent paths to memory, the instruction issue rate, memory size, the ability to fetch/store vectors (or scalar data) efficiently, the handling of conditional blocks of code, the number of duplicate arithmetic functional units, and many more.
It is important to analyze the performance of a parallel program in order to determine the best algorithm, evaluate the hardware platform, and examine the benefits gained from parallelism. Because parallel systems involve a high level of complexity, their performance must be studied through various metrics such as execution time (or execution rate), speedup, efficiency, cost, redundancy of computation, and utilization.
Execution Time:
The serial runtime of a program is the time elapsed between the beginning and the end of its execution on a sequential computer. The parallel runtime is the time that elapses from the moment a parallel computation starts to the moment the last processing element finishes execution. We denote the serial runtime by TS and the parallel runtime by TP.
Execution time falls into two categories (a small timing sketch follows the list):
- Serial execution time: It is denoted by T(n, 1) where n is the problem size.
- Parallel execution time: It is defined as the time that elapses from the moment a parallel computation starts to the moment the last processor completes execution. It is denoted by T(n, p), where p is the number of processors.
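As a rough illustration of how T(n, 1) and T(n, p) can be measured in practice, the following sketch times a toy sum-of-squares workload serially and then in parallel. The workload, the chunking scheme, and the processor count are assumptions made only for this example.

```python
# Minimal sketch: measuring T(n, 1) and T(n, p) for an assumed sum-of-squares workload.
import time
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def measure(n, p):
    # Serial execution time T(n, 1)
    t0 = time.perf_counter()
    partial_sum((0, n))
    t_serial = time.perf_counter() - t0

    # Parallel execution time T(n, p): ends when the last worker finishes.
    # Wall time here includes process start-up, which is part of the parallel overhead.
    chunks = [(k * n // p, (k + 1) * n // p) for k in range(p)]
    t0 = time.perf_counter()
    with Pool(p) as pool:
        sum(pool.map(partial_sum, chunks))
    t_parallel = time.perf_counter() - t0
    return t_serial, t_parallel

if __name__ == "__main__":
    ts, tp = measure(n=5_000_000, p=4)
    print(f"T(n,1) = {ts:.3f} s, T(n,4) = {tp:.3f} s")
```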
Total Parallel Overhead:
The overheads incurred by a parallel program are encapsulated into a single expression referred to as the overhead function. We define overhead function or total overhead of a parallel system as the total time collectively spent by all the processing elements over and above that required by the fastest known sequential algorithm for solving the same problem on a single processing element.
In other words, total parallel overhead is the amount of time required to coordinate parallel tasks, as opposed to doing useful work. The main contributors to total parallel overhead are:
- Task start-up time
- Synchronizations
- Data communications
- Software overhead imposed by parallel languages, libraries, operating system, etc.
- Task termination time
We denote the total overhead function of a parallel system by the symbol To.
Let the total time spent in solving a problem, summed over all processing elements, be pTp units, and let the time spent performing useful work (the serial time) be Ts units.
Therefore, the overhead function (To) is given by
To(n, p) = pTp(n, p) – Ts(n, 1)
where p is the number of processors and n is the problem size.
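As a minimal numerical sketch of this formula, the helper below computes To from measured timings; the timing values used are assumed purely for illustration.

```python
# Minimal sketch: total parallel overhead To(n, p) = p * Tp - Ts.
def total_overhead(t_serial, t_parallel, p):
    return p * t_parallel - t_serial

# Assumed example timings: Ts = 10 s serially, Tp = 3 s on p = 4 processors.
print(total_overhead(10.0, 3.0, 4))   # 4 * 3 - 10 = 2.0 units of overhead
```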
Speedup:
Speedup is the measure that captures the performance gain achieved by parallelizing a given application over a sequential implementation, i.e., the relative benefit of solving a problem in parallel. The speedup of a system is defined as the ratio of the time taken to solve a problem on a single processing element to the time required to solve the same problem on a parallel computer with p identical processing elements. We denote speedup by the symbol S. The observed speedup of an algorithm which has been parallelized is defined as:
S(n, p) = T(n, 1) / T(n, p) = Ts / Tp
where Ts is the execution time taken to perform the computation on one processor and Tp is the execution time needed to perform the same computation using p processors.
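A one-line helper, using the same assumed timings as in the overhead sketch above, shows how the speedup ratio is computed:

```python
# Minimal sketch: speedup S(n, p) = Ts / Tp.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

# With the assumed timings Ts = 10 s and Tp = 3 s on 4 processors:
print(speedup(10.0, 3.0))   # ≈ 3.33
```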
Efficiency:
Efficiency is a measure of the fraction of time for which a processing element is usefully employed. Efficiency is defined as the ratio of speedup to the number of processing elements. We denote efficiency by the symbol E. Mathematically, it is given by
E = S / p
where S denotes the speedup obtained with p processing elements.
In an ideal parallel system, speedup is equal to p and efficiency is equal to one. In practice, speedup is less than p and efficiency is between zero and one, depending on the effectiveness with which the processing elements are utilized.
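Continuing with the same assumed numbers, efficiency follows directly from the speedup:

```python
# Minimal sketch: efficiency E = S / p; ideally S = p and E = 1.
def efficiency(t_serial, t_parallel, p):
    return (t_serial / t_parallel) / p

# Assumed timings: Ts = 10 s, Tp = 3 s, p = 4 processors.
print(efficiency(10.0, 3.0, 4))   # ≈ 0.83, i.e. ~83% of processor time is useful work
```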
Scalability:
Scalability is the ability of a system or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. A parallel system is scalable if it can maintain a constant efficiency E as both the number of processors and the size of the problem increase. In many cases
To = f(Ts, p)
and To grows sub-linearly with Ts. It is then possible to increase p and Ts together while keeping E constant. Hence, scalability measures the ability to increase speedup as a function of p.
There are two types of scaling based on time to solution (a small experiment sketch follows the list):
- Strong scaling: The total problem size stays fixed as more processors are added.
- Weak scaling: The problem size per processor stays fixed as more processors are added.
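The sketch referenced above only generates the run configurations for the two kinds of experiment; the base problem size and processor counts are assumed values.

```python
# Minimal sketch: run configurations for strong and weak scaling experiments.
def scaling_runs(base_n, procs):
    strong = [(base_n, p) for p in procs]        # total problem size fixed
    weak   = [(base_n * p, p) for p in procs]    # problem size per processor fixed
    return strong, weak

strong, weak = scaling_runs(base_n=1_000_000, procs=[1, 2, 4, 8])
print("strong scaling runs:", strong)
print("weak scaling runs:  ", weak)
```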
Isoefficiency:
Isoefficiency analysis helps to determine the best algorithm/architecture combination for a particular problem without explicitly analyzing all possible combinations under all possible conditions.
Cost:
The cost of solving a problem on a single processing element is the execution time of the fastest known sequential algorithm, while the cost of solving it on a parallel system is the product pTp of the parallel runtime and the number of processing elements used. A parallel system is said to be cost-optimal if the cost of solving a problem on the parallel computer has the same asymptotic growth, as a function of the input size, as the fastest known sequential algorithm on a single processing element.
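As a commonly cited illustration (an assumption here, not drawn from the text above), consider adding n numbers on p = n processing elements: the parallel time grows as log n, so the cost grows as n log n while the serial time grows only as n, and the system is therefore not cost-optimal.

```python
# Minimal sketch: cost C(n, p) = p * Tp, compared against the serial time.
import math

def cost(p, t_parallel):
    return p * t_parallel

n = 1024
t_parallel = math.log2(n)      # idealized parallel "steps" for adding n numbers on n PEs
t_serial = n - 1               # idealized serial additions
print(cost(n, t_parallel), "vs", t_serial)   # 10240.0 vs 1023: the growth rates differ
```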
Utilization:
The utilization measure can be defined as the ratio between the actual number of operations O(n, p) and the cost of that parallel computation C(n, p):
U(n, p) = O(n, p) / C(n, p) = O(n, p) / (p T(n, p))
It should be noted that cost is sometimes referred to as work or the processor-time product.
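A small numerical sketch of the utilization measure, counting time in parallel steps so that U ≤ 1; the workload (summing 1024 numbers on 4 processing elements) is an assumed example.

```python
# Minimal sketch: utilization U(n, p) = O(n, p) / C(n, p), where C(n, p) = p * T(n, p).
def utilization(ops, p, t_parallel_steps):
    return ops / (p * t_parallel_steps)

# Assumed example: summing 1024 numbers on 4 PEs performs 1023 additions in
# total (255 per PE locally plus 3 to combine) and takes 255 + 2 = 257 steps.
print(round(utilization(1023, 4, 257), 3))   # ≈ 0.995
```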