The cost of solving a problem on a single processor is the execution time of the fastest known sequential algorithm. A parallel system (or parallel algorithm) is said to be cost-optimal when its cost matches the run time Ts of the best known sequential algorithm for the same problem. In other words, a parallel algorithm is cost-optimal if the cost of solving the problem on a parallel computer has the same asymptotic growth (in Θ terms), as a function of input size, as the fastest known sequential algorithm on a single processing element.

Cost of the parallel algorithm, C = p × Tp, must satisfy C = Θ(Ts), where Ts is the run time of the best known sequential algorithm.

A cost-optimal parallel system has an efficiency of Θ(1), because efficiency is the ratio of the sequential cost to the parallel cost.

## When does a cost-optimal parallel algorithm exist?

If the total number of operations performed by the parallel algorithm is in the same complexity class as that of an optimal sequential algorithm, then a cost-optimal parallel algorithm exists.

For example, consider adding n numbers. The parallel algorithm performs n/2 additions in the first step, n/4 in the second, n/8 in the third, and so on, executing a total of (n − 1) additions over ⌈log n⌉ iterations. Since both the sequential and the parallel algorithm perform (n − 1) additions, a cost-optimal variant of the parallel algorithm exists.
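The pairwise-reduction scheme above can be sketched as follows; `pairwise_sum` is a hypothetical helper (not from the original text) that counts the additions and steps, assuming the input length is a power of 2:

```python
def pairwise_sum(nums):
    """Reduce a list by adding adjacent pairs in each step (a sketch of
    the parallel reduction; assumes len(nums) is a power of 2)."""
    additions = 0
    steps = 0
    while len(nums) > 1:
        # One parallel step: n/2 pairwise additions, then n/4, n/8, ...
        nums = [nums[i] + nums[i + 1] for i in range(0, len(nums), 2)]
        additions += len(nums)
        steps += 1
    return nums[0], additions, steps

total, additions, steps = pairwise_sum(list(range(16)))
# 16 numbers: 8 + 4 + 2 + 1 = 15 = n - 1 additions over log2(16) = 4 steps
```

Counting confirms the claim: for n = 16 the reduction performs 15 additions in 4 iterations.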

### Example 1: Is adding n numbers on n processors cost-optimal?

With reference to Example 2.1, adding n numbers on n processing elements takes parallel time Tp = Θ(log n), so the processor–time product yields the cost

C(n, p) = n × Θ(log n) = Θ(n log n).

The serial (sequential) run time for adding n numbers is Θ(n).
Applying the condition of cost optimality, the cost of the parallel algorithm must match the run time of the best known sequential algorithm:

Θ(n log n) ≠ Θ(n)

Since n log n grows strictly faster than n, the cost of the parallel algorithm is asymptotically larger than the sequential run time.

Thus the cost of adding n numbers on n processors is not cost-optimal.
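A quick numerical check (not part of the original argument) makes the gap visible: the ratio of the cost C(n, n) = n log₂ n to the sequential work Ts = n is log₂ n, which keeps growing with n, so the cost cannot be Θ(n):

```python
import math

# Ratio of parallel cost C = n * log2(n) to sequential time Ts = n,
# for n = 2^4, 2^8, 2^12, 2^16. The ratio equals log2(n) and grows
# without bound, so the algorithm is not cost-optimal.
ratios = [(n * math.log2(n)) / n for n in (2**4, 2**8, 2**12, 2**16)]
print(ratios)  # the ratio keeps growing: [4.0, 8.0, 12.0, 16.0]
```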

### Example 2: Is adding n numbers on p processing elements cost-optimal, when n > p?

Proof: Consider the problem of adding n numbers on p processing elements, where p < n and both n and p are powers of 2. We use the same parallel algorithm as in Example 2.1 for n processors, except that we now simulate the n processing elements on p processing elements by treating them as virtual processors. Each of the p processors is assigned n/p virtual processors. The first log p of the log n steps of the original algorithm are simulated in (n/p) log p steps on p processing elements. The subsequent (log n − log p) steps do not require any communication.

The overall parallel execution time is now Θ((n/p) log p), so the cost is Θ(n log p), which is still asymptotically greater than the sequential run time Θ(n).

To make the algorithm cost-optimal, we modify it as follows:

1. Each processing element locally adds its n/p numbers in time Θ(n/p).
2. The p partial sums on the p processing elements can then be added in time Θ(log p) using the tree-based reduction.

So the parallel runtime of this algorithm is

Tp = Θ ((n/p) + log p).

And the cost = Θ (p × [(n/p) +  log p])

[∵Cost is the product of parallel runtime and the number of processing elements used]

= Θ (n + p log p)

As long as n = Ω(p log p), the p log p term is dominated by n.

Hence the cost of this parallel algorithm = Θ(n + p log p) = Θ(n).

Checking the cost optimality of this parallel algorithm: its cost Θ(n) matches the Θ(n) run time of the best known sequential algorithm, so adding n numbers on p processing elements (with n = Ω(p log p)) is cost-optimal.
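The two-phase scheme can be sketched in code; `add_on_p_processors` is a hypothetical helper (not from the original text) that simulates the algorithm sequentially while counting the parallel time steps, assuming n and p are powers of 2 with p ≤ n:

```python
def add_on_p_processors(nums, p):
    """Sketch of the cost-optimal scheme: each of p processing elements
    first adds its n/p numbers locally, then the p partial sums are
    combined by a pairwise (tree) reduction in log2(p) steps.
    Assumes len(nums) and p are powers of 2 with p <= len(nums)."""
    n = len(nums)
    chunk = n // p
    # Phase 1: local sums, Theta(n/p) parallel time (n/p - 1 additions per PE).
    partial = [sum(nums[i * chunk:(i + 1) * chunk]) for i in range(p)]
    local_steps = chunk - 1
    # Phase 2: tree reduction of the p partial sums, Theta(log p) steps.
    tree_steps = 0
    while len(partial) > 1:
        partial = [partial[i] + partial[i + 1] for i in range(0, len(partial), 2)]
        tree_steps += 1
    # Parallel time ~ n/p + log p; cost = p * (n/p + log p) = n + p log p.
    return partial[0], local_steps + tree_steps

total, steps = add_on_p_processors(list(range(64)), 8)
# 64 numbers on 8 PEs: 7 local additions + 3 tree steps = 10 time steps
```

With n = 64 and p = 8 the simulated parallel time is n/p − 1 + log₂ p = 7 + 3 = 10 steps, matching the Θ(n/p + log p) bound derived above.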