Figure 6: Messages per processor (last iteration only)
Communication is another source of overhead hampering the performance of parallel algorithms, especially those running on massively parallel systems. Fortunately, AIDA* exhibits a very low communication overhead. Starting with 64 processors, one would expect the communication rate to increase by a factor of sixteen when increasing the system size to 1024 processors. However, as can be seen in Figure 6, the actual number of messages increases only by a factor of six. The curves seem to level off with growing system size, which can be explained by an increased likelihood that work_requests are answered in the immediate neighborhood of the idle processor. As a consequence, the average distance between sender and receiver does not increase linearly with the system size.
At the end of an iteration, only a single work_request is sent around the ring to indicate that there is no further work available in the current iteration. All other processors on the ring are informed by an out_of_work message. They directly start the next iteration without asking for further work.
Figure 7: Messages per iteration (1024 procs.)
Moreover, as shown in Figure 7 the communication overhead seems to decrease with increasing search time. This is because the transferred nodes change ownership when being shipped to another processor, thereby constantly improving the global work-load balance. The number of messages that went through a single processor on a 1024-processor system decreases rapidly in the last two iterations. Due to this effect, one can assume an even lower communication overhead in other applications involving more iterations.