# DOEACC Society 2006 DOEACC B Level BE5 Parallel Computing - Question Paper

Friday, 14 June 2013 04:10

BE5-R3: PARALLEL COMPUTING

NOTE:

**Time:** 3 Hours

**Total Marks:** 100

1.

**a)** Why is there no use in increasing the number of processors beyond a certain point in a multiprocessor system?

**b)** Explain the multiple cooperative masters OS model for Unix on multiple processors.

**c)** Compare the main techniques involved in domain decomposition and control decomposition.

**d)** What is the latency for wormhole routing?

**e)** What is prefetching and what benefits do we get from it?

**f)** Discuss the situations under which the cache coherence issue can arise.

**g)** Discuss the following kinds of dependencies between instructions:

**i)** Antidependence

**ii)** Output dependence

**iii)** Resource dependence

(7x4)

**2.** Select from among the following architectures for the applications given below. Also, suggest possible communication mechanisms.

- Asymmetric UMA

- Multiple Instruction Single Data (MISD)

- Hierarchical cluster NUMA architecture.

- Single Instruction Multiple Data (SIMD)

Provide proper justification for your architectural choice.

**i)** A database application has several modules providing specialized functions. Some of the modules are tightly coupled together, share data and communicate heavily with each other. Such a close group of modules interacts regularly with other similar sets of modules. Groups of such closely interacting sets of modules occasionally communicate with each other.

**ii)** For weather forecasting, the environmental space is represented as a grid of 3-dimensional sub-spaces. The weather parameters (temperature, relative humidity, dust concentration etc.) for each sub-space are collected in the form of arrays and manipulated by different operations to predict weather conditions such as rainfall, possible storms etc.

**iii)** A protein structure analysis application performs computational simulations using voluminous data on the protein molecular composition, to analyze the 3D protein structures. The calculations are similar and involve several nested DO loops.

(6x3)

BE5-R3 Page 1 of 3 July, 2006

**1.** Answer question 1 and any four questions from 2 to 7.

**2.** Parts of the same question should be answered together and in the same sequence.

3.

**a)** Write a parallel pseudocode for performing an even-odd transposition sort on a linear array of n processors. Show the computation time and the communication time at each step. What is the overall time complexity? Illustrate the sorting process for a sequence of 8 numbers: 3, 1, 9, 7, 5, 2, 0, 6

**b)** Write the pseudocode for performing a shuffle of n data items kept in a linear array of n processors. What is the time complexity? Illustrate the shuffle process for the sequence Z, X, F, G, T, H, U, J.

**c)** Write the pseudocode for PI computation on a multiprocessor system of p processors using the approximation formula $\pi \approx W \sum_{i=1}^{N} \frac{4}{1 + x_i^2}$, where the PI area is defined between 0 and 1, N is the total number of intervals between 0 and 1, and W is the width of each rectangle, W = 1/N. Is there any need for synchronization in this algorithm?

(6+6+6)
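As a rough illustration of part 3(a), the following Python sketch simulates the transposition sort sequentially on the given sequence. It is not the parallel pseudocode the question asks for: each list slot stands in for one processor, and the inner loop stands in for compare-exchanges that a linear array would perform simultaneously. The function name and the choice to start with pairs (0,1), (2,3), ... are assumptions; some texts start with the other pairing.

```python
def odd_even_transposition_sort(values):
    """Sequentially simulate odd-even transposition sort on a linear array.

    Each list slot models one processor holding one value. Within a phase,
    the compare-exchange steps touch disjoint pairs, so a real linear array
    could perform all of them at the same time.
    """
    a = list(values)
    n = len(a)
    phases = [list(a)]  # snapshot of the array before and after each phase
    for phase in range(n):
        # Assumed convention: phase 0 pairs (0,1), (2,3), ...;
        # phase 1 pairs (1,2), (3,4), ...; and so on alternately.
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        phases.append(list(a))
    return a, phases


sorted_values, phases = odd_even_transposition_sort([3, 1, 9, 7, 5, 2, 0, 6])
for step, state in enumerate(phases):
    print(step, state)
```

Running n phases is sufficient to sort n values, and each phase costs one compare-exchange per processor pair, which is where the usual O(n) parallel time estimate comes from.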

4.

**a)** Define the following terms:

**i)** Bisection width of a network

**ii)** Perfect shuffle operation

**iii)** Perfect inverse shuffle

**iv)** Scalability

**v)** Network latency

**vi)** K-ary n-cube networks

**vii)** Network throughput

**viii)** Node degree

**b)** Compare Multistage Networks with Crossbar switch in terms of wiring complexity, minimum latency and routing capability.

([8x1.5]+6)
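For the perfect shuffle and perfect inverse shuffle named in 4(a), a minimal Python sketch is given below, applied to the eight-item sequence from question 3(b). It assumes an even number of items, and the bit-rotation view assumes the count is a power of two. All function names are illustrative, not taken from the paper.

```python
def perfect_shuffle(items):
    """Interleave the first half of `items` with the second half."""
    n = len(items)
    if n % 2:
        raise ValueError("perfect shuffle needs an even number of items")
    half = n // 2
    shuffled = [None] * n
    for i in range(half):
        shuffled[2 * i] = items[i]             # first half goes to even slots
        shuffled[2 * i + 1] = items[half + i]  # second half goes to odd slots
    return shuffled


def inverse_perfect_shuffle(items):
    """Undo perfect_shuffle: even positions first, then odd positions."""
    return list(items[0::2]) + list(items[1::2])


def rotate_left(index, bits):
    """Cyclic left rotation of a `bits`-bit index (assumes n == 2**bits)."""
    mask = (1 << bits) - 1
    return ((index << 1) & mask) | (index >> (bits - 1))


sequence = list("ZXFGTHUJ")
shuffled = perfect_shuffle(sequence)
print(shuffled)
print(inverse_perfect_shuffle(shuffled))
```

For n = 2^k items, the item at position i ends up at the position obtained by rotating the k-bit binary form of i one place to the left, which is what `rotate_left` computes.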

5.

**a)** A program has only two modes of operation: purely sequential mode for 40% of the program and fully parallel for the remaining program. The program is run on a multiprocessor system in which the total number of processors n is much greater than the maximum degree of parallelism of the program m (n>>m). Compute the percentage increase in speedup performance of the multiprocessor system when the number of processors is increased from 4 to 10 for the following models, ignoring all system overheads.

**i)** Fixed workload model

**ii)** Fixed execution time model

**iii)** Memory bound model. Assume that the workload is increased by 25% more than the maximum available parallelism, when memory size is increased. Thus the workload is increased 5 times when the maximum number of processors is 4 and increased 12.5 times when the number of processors is 10.

**b)** Explain how the communication overheads may offset the advantages of parallel processing.

([3x5]+3)
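One way to set up the arithmetic for 5(a) is sketched below in Python. It uses the textbook forms usually associated with the three models (Amdahl for fixed workload, Gustafson for fixed time, Sun and Ni for memory bound). It also assumes that the parallel portion scales perfectly with the number of processors in use, so the n >> m condition stated in the question is not modelled. Function names and this interpretation are assumptions, not the paper's marking scheme.

```python
def fixed_workload_speedup(alpha, n):
    """Amdahl-style speedup: total work is unchanged as n grows."""
    return 1.0 / (alpha + (1.0 - alpha) / n)


def fixed_time_speedup(alpha, n):
    """Gustafson-style scaled speedup: the parallel part grows n-fold."""
    return alpha + (1.0 - alpha) * n


def memory_bound_speedup(alpha, n, growth):
    """Sun-Ni-style speedup: the parallel part grows by a factor `growth`."""
    scaled = (1.0 - alpha) * growth
    return (alpha + scaled) / (alpha + scaled / n)


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


ALPHA = 0.4  # sequential fraction stated in the question

fixed_workload = percent_increase(
    fixed_workload_speedup(ALPHA, 4), fixed_workload_speedup(ALPHA, 10))
fixed_time = percent_increase(
    fixed_time_speedup(ALPHA, 4), fixed_time_speedup(ALPHA, 10))
# Growth factors 5 and 12.5 are the ones given in part (iii).
memory_bound = percent_increase(
    memory_bound_speedup(ALPHA, 4, 5.0), memory_bound_speedup(ALPHA, 10, 12.5))

print(f"fixed workload: {fixed_workload:.2f}%")
print(f"fixed time:     {fixed_time:.2f}%")
print(f"memory bound:   {memory_bound:.2f}%")
```

Under these assumptions, the fixed workload model gains the least from the extra processors because the 40% sequential part caps the speedup, while the two scaled-workload models grow roughly in proportion to n.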

6.

**a)** Suggest methods of vectorizing the following DO loops:

**i)**

DO 30 I = 1, N

A(I) = 2*C(I) + B(I)

30 C(I+1) = B(I) + A(I+2)

**ii)**

DO 20 K = 2, N

DO 30 L = 2, N

DO 40 M = 1, N-1

A(K,L,M) = (A(K,L-1,M) + A(K,L+1,M))/2

40 CONTINUE

30 CONTINUE

20 CONTINUE

**iii)**

DO 20 K = 1, N

A(K) = B(K,1) + C(K,1)

DO 30 L = 2, N+1

D(K) = D(K) + B(K,L)*C(K,L)

30 CONTINUE

DO 40 M = 1, N+1

E(K+1) = E(K) + B(K,M)

40 CONTINUE

20 CONTINUE

**b)** Compare the distributed memory model and the shared memory model for parallel programming in terms of different parameters.

**c)** What is a symmetric multiprocessor? Explain its advantages.

([2+3+3]+6+4)

7.

**a)** Draw the schematic for a hierarchical bus based architecture with distributed caches for a multiprocessor system and explain the working of the system. What are the advantages of having a hierarchy of buses?

**b)** Differentiate between data flow and control flow processing approaches.

(9+9)
