Communication Overhead

Next: Termination Detection Up: Overheads in AIDA* Previous: Initial Work Distribution

Communication Overhead

Figure 6: Messages per processor (last iteration only)

Communication is another source of overhead hampering the performance of parallel algorithms, especially those running on massively parallel systems. Fortunately, AIDA* exhibits a very low communication overhead. Starting with 64 processors, one would expect the communication rate to increase by a factor of sixteen when increasing the system size to 1024 processors. However, as can be seen in Figure 6, the actual number of messages increases only by a factor of six. The curves seem to level off with growing system size, which can be explained by an increased likelihood that work_requests are answered in the immediate neighborhood of the idle processor. As a consequence, the average distance between sender and receiver does not increase linearly with the system size.

At the end of an iteration, only a single work_request is sent around the ring to indicate that there is no further work available in the current iteration. All other processors on the ring are informed by an out_of_work message. They directly start the next iteration without asking for further work.

Figure 7: Messages per iteration (1024 procs.)

Moreover, as shown in Figure 7 the communication overhead seems to decrease with increasing search time. This is because the transferred nodes change ownership when being shipped to another processor, thereby constantly improving the global work-load balance. The number of messages that went through a single processor on a 1024-processor system decreases rapidly in the last two iterations. Due to this effect, one can assume an even lower communication overhead in other applications involving more iterations.

Volker Schnecke
Mon Dec 19 17:27:56 MET 1994