The cost of solving a problem on a single processor is the execution time of the fastest known sequential algorithm, Ts. A parallel system or a parallel algorithm is said to be cost-optimal when its cost matches the run time Ts of the best known sequential algorithm for the same problem. In other words, a parallel algorithm is cost-optimal if the cost of solving the problem on a parallel computer has the same asymptotic growth (in Θ terms), as a function of the input size, as the fastest known sequential algorithm on a single processing element.
| Run time of the best known sequential algorithm (Ts) | ≥ | Cost of the parallel algorithm (p × Tp) |
A cost-optimal parallel system has an efficiency of Θ(1), because efficiency is the ratio of the sequential cost to the parallel cost.
When does a cost-optimal parallel algorithm exist?
If the total number of operations performed by the parallel algorithm is in the same complexity class as that of an optimal sequential algorithm, then a cost-optimal parallel algorithm exists.
For example, consider adding n numbers. The parallel algorithm performs n/2 additions in the first step, n/4 additions in the second step, n/8 additions in the third step, and so on. It thus executes a total of (n − 1) additions over ⌈log n⌉ iterations. Since both the sequential and the parallel algorithm for this problem perform (n − 1) additions, a cost-optimal variant of the parallel algorithm exists.
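The following sketch (illustrative only; the function name tree_add and the test values are assumptions, not from the text) simulates this tree-based parallel addition and verifies that it performs (n − 1) additions over ⌈log n⌉ steps:

```python
import math
import random

def tree_add(values):
    """Pairwise (tree) reduction; returns (sum, total additions, parallel steps)."""
    vals = list(values)
    additions = 0
    steps = 0
    while len(vals) > 1:
        nxt = []
        # Within one step, all pairs would be added in parallel by different PEs.
        for j in range(0, len(vals) - 1, 2):
            nxt.append(vals[j] + vals[j + 1])
            additions += 1
        if len(vals) % 2 == 1:      # an odd element is carried over to the next step
            nxt.append(vals[-1])
        vals = nxt
        steps += 1
    return vals[0], additions, steps

n = 16
data = [random.randint(0, 9) for _ in range(n)]
total, adds, steps = tree_add(data)
assert total == sum(data) and adds == n - 1 and steps == math.ceil(math.log2(n))
print(f"sum={total}, additions={adds}, parallel steps={steps}")
```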
Example 1: Is adding n numbers on n processing elements cost-optimal?
As shown in Example 2.1, adding n numbers on n processing elements takes parallel time Tp = Θ(log n), so the processor-time product yields the corresponding cost
C(n, p) = n × Θ(log n) = Θ(n log n).
The sequential run time for adding n numbers is Ts = Θ(n).
By the condition of cost optimality,
| Run time of the best known sequential algorithm (Ts) | ≥ | Cost of the parallel algorithm (p × Tp) |
i.e. Θ(n) ≥ Θ(n log n).
This cannot hold asymptotically, since n log n grows strictly faster than n; for all sufficiently large n the parallel cost exceeds the sequential run time.
Hence Θ(n) ≱ Θ(n log n).
Thus adding n numbers on n processing elements is not cost-optimal.
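An illustrative check of this conclusion (the sample values of n below are assumptions, not from the text): with cost Θ(n log n), the efficiency Ts / cost ≈ 1 / log₂ n tends to 0 as n grows, so the cost can never stay within a constant factor of Ts.

```python
import math

# Efficiency of the n-processor formulation: Ts / (p * Tp) = n / (n log2 n) = 1 / log2 n.
for n in [8, 64, 1024, 2**20]:
    ts = n                      # Θ(n) sequential additions
    cost = n * math.log2(n)     # p * Tp = n * Θ(log n)
    print(f"n={n:>8}  Ts≈{ts:>8}  cost≈{cost:>12.0f}  efficiency≈{ts / cost:.3f}")
```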
Example 2: Is adding n numbers on p processing elements cost-optimal when n > p?
Proof: Consider the problem of adding n numbers on p processing elements such that p < n and both n and p are powers of 2. We use the same parallel algorithm as in Example 2.1 for n processors, except that we now simulate the n processing elements on p processing elements by treating them as virtual processors. Each of the p processors is assigned n/p virtual processors. The first log p of the log n steps of the original algorithm are simulated in (n/p) log p steps on p processing elements; the subsequent (log n − log p) steps do not require any communication.
The overall parallel execution time of this formulation is Θ((n/p) log p), so its cost is Θ(n log p), which is asymptotically higher than the Θ(n) sequential run time; this straightforward simulation is therefore not cost-optimal.
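A small numeric illustration of this formulation (the (n, p) pairs below are assumptions, not from the text): the cost p × Tp ≈ n log₂ p keeps growing relative to n as p increases.

```python
import math

# Straightforward simulation of n virtual PEs on p PEs: Tp ≈ (n/p) * log2(p).
for n, p in [(64, 4), (1024, 32), (2**20, 256)]:
    tp_sim = (n / p) * math.log2(p)
    cost_sim = p * tp_sim               # = n * log2(p)
    print(f"n={n:>8} p={p:>4}  Tp≈{tp_sim:>9.0f}  cost≈{cost_sim:>10.0f}  cost/n={cost_sim / n:>5.1f}")
```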
To obtain a cost-optimal formulation, we do the following:
- Each processing element first adds its n/p numbers locally in time Θ(n/p).
- The p partial sums on the p processing elements can then be added in time Θ(log p).
So the parallel runtime of this algorithm is
Tp = Θ ((n/p) + log p).
And the cost = Θ (p × [(n/p) + log p])
[∵Cost is the product of parallel runtime and the number of processing elements used]
= Θ (n + p log p)
As long as n = Ω(p log p), the p log p term is O(n);
hence the cost of this parallel algorithm is Θ(n + p log p) = Θ(n).
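An illustrative check (the (n, p) pairs below are assumptions, not from the text) that the cost p × (n/p + log p) = n + p log₂ p stays within a constant factor of n whenever n ≥ p log₂ p:

```python
import math

# Cost-optimal formulation: Tp = Θ(n/p + log p), cost = n + p * log2(p).
for n, p in [(16, 4), (1024, 32), (2**20, 256)]:
    tp = n / p + math.log2(p)
    cost = p * tp
    print(f"n={n:>8} p={p:>4}  Tp≈{tp:>9.1f}  cost≈{cost:>10.0f}  cost/n={cost / n:.2f}")
```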
To check the cost optimality of this parallel algorithm:

| Run time of the best known sequential algorithm (Ts) | ≥ | Cost of the parallel algorithm (p × Tp) |
∴ Θ(n) ≥ Θ(n)
The above condition holds. Thus we can say that adding n numbers on p processing elements is cost-optimal.
Let us consider an example with n = 16 and p = 4.
The steps leading to the solution are shown in the figure below.
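Since the figure itself is not reproduced here, the following sketch (a hypothetical illustration, not the text's own code; it assumes p is a power of 2 and that p divides n, as in the example) traces both phases of the cost-optimal algorithm for n = 16 and p = 4:

```python
import math

def cost_optimal_add(data, p):
    """Two-phase addition: local sums on each PE, then a tree reduction of p partial sums."""
    n = len(data)
    chunk = n // p
    # Phase 1: each processing element adds its n/p local numbers in Θ(n/p) time.
    partial = [sum(data[i * chunk:(i + 1) * chunk]) for i in range(p)]
    print("partial sums:", partial)
    # Phase 2: the p partial sums are combined in Θ(log p) tree-reduction steps.
    step = 0
    while len(partial) > 1:
        partial = [partial[j] + partial[j + 1] for j in range(0, len(partial), 2)]
        step += 1
        print(f"after reduction step {step}: {partial}")
    return partial[0]

if __name__ == "__main__":
    n, p = 16, 4
    data = list(range(1, n + 1))              # the numbers 1..16
    total = cost_optimal_add(data, p)
    assert total == sum(data)
    print("total =", total,
          "| expected parallel time ≈ n/p + log2 p =", n // p + int(math.log2(p)))
```

Running this prints the four partial sums after the local phase and the two tree-reduction steps, mirroring the Θ(n/p + log p) run time derived above.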
These simple examples demonstrate that the manner in which the computation is mapped onto processing elements may determine whether a parallel system is cost-optimal. Note, however, that we cannot make all non-cost-optimal systems cost-optimal by scaling down the number of processing elements.








