
The Discussion

Like all other MPI programs, this program begins with the usual incantations. But here we are going to create a new communicator by manipulating the default one, so we copy the default communicator, which is an MPI constant, to a variable of type MPI_Comm called world:

  MPI_Init(&argc, &argv);
  world = MPI_COMM_WORLD;
  MPI_Comm_size(world, &numprocs);
  MPI_Comm_rank(world, &myid);
We are going to exclude one process from the pool, and this process will be employed to generate random numbers and send them to the workers. We use the process with the highest rank number for this:
  server = numprocs - 1;
In the next line we employ the process of rank 0 to read the value of $\epsilon$ from the command line and broadcast it to all other processes:
  if (myid == 0)
    sscanf( argv[1], "%lf", &epsilon);
  MPI_Bcast(&epsilon, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
Now we commence the communicator manipulations. A communicator comprises a number of elements. First, there are the processes themselves. The processes, when abstracted from the communicator, form a group. But there is more to a communicator than just the group of processes that enter it. A communicator may order the processes in a special way, e.g., to form a Cartesian grid, as we have seen before, and a communicator also arranges for separate communication channels between its processes. Messages are differentiated by means of tag numbers, recipient numbers, sender numbers, and the communicator within which they are transmitted. But here it is the group of processes that we want to reform. And so we begin by calling MPI_Comm_group, which extracts the group of processes from the existing communicator world and deposits it in a variable of type MPI_Group, here called world_group:
  MPI_Comm_group( world, &world_group);
Type MPI_Group is an opaque type. It is not an array that you could manipulate directly. In order to manipulate variables of this type, you have to call special MPI functions that deal with groups of processes. Here we want to exclude the random number server process from the group, and here is how it's done:
  ranks[0] = server;
  MPI_Group_excl(world_group, 1, ranks, &worker_group);
Function MPI_Group_excl takes an existing group, here world_group. Then it takes an array of ranks, here called ranks, whose length is specified by the second argument, here 1. Finally, it creates a new group of processes, here called worker_group, by excluding the processes listed in the array ranks from the world_group group.
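An alternative, and sometimes more convenient, way to build the same group is to list the ranks you want to keep and call MPI_Group_incl instead. The following is a minimal stand-alone sketch, not taken from the original program and assuming it runs with at least two processes, that constructs the workers' group this way and turns it into a communicator:

  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char *argv[])
  {
    int numprocs, myid, i, *keep;
    MPI_Group world_group, worker_group;
    MPI_Comm workers;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    /* List the ranks we want to keep: every process except the last one. */
    keep = (int *) malloc((numprocs - 1) * sizeof(int));
    for (i = 0; i < numprocs - 1; i++)
      keep[i] = i;

    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    /* MPI_Group_incl builds the new group from the listed ranks,
       instead of excluding the unwanted ones. */
    MPI_Group_incl(world_group, numprocs - 1, keep, &worker_group);
    MPI_Comm_create(MPI_COMM_WORLD, worker_group, &workers);

    if (myid < numprocs - 1)
      printf("process %d belongs to the new communicator\n", myid);

    MPI_Group_free(&worker_group);
    MPI_Group_free(&world_group);
    /* The excluded process gets MPI_COMM_NULL and must not free it. */
    if (workers != MPI_COMM_NULL) MPI_Comm_free(&workers);
    free(keep);
    MPI_Finalize();
    return 0;
  }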

So this is an example of how you can manipulate process groups. Now we need to make this new group, worker_group, into a communicator. This is done by calling MPI_Comm_create. This function takes an existing communicator, from which the processes that form the worker_group have been extracted, and then creates a new communicator, here called workers, that contains the processes of worker_group only.

  MPI_Comm_create(world, worker_group, &workers);
Once the communicator has been created, we have no more need for the group worker_group itself. So we can free the storage that has been allocated for the worker_group variable by calling the function MPI_Group_free:
  MPI_Group_free(&worker_group);
We could also free the other group, world_group, right here, since the program does not make any further use of it.
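If we chose to do that, the call would be the same one, applied this time to world_group:

  MPI_Group_free(&world_group);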

Processes within the new communicator workers may have different rank numbers, but the variable myid has been established within the original MPI_COMM_WORLD, and only one process is going to have myid == server; this is exactly the process that has been excluded from workers. Whenever the workers communicate with the server, they have to send messages within the world communicator. Whenever they want to communicate just amongst themselves, they have to send messages within the workers communicator.
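In fact, because MPI_Group_excl preserves the relative order of the remaining processes and we have removed only the highest rank, the worker ranks happen to coincide in both communicators here, but in general they need not. A worker could check this with a fragment like the following, which is not part of the original program:

    /* Hypothetical fragment, executed by the worker processes only:
       compare the rank within world with the rank within workers. */
    int world_rank, worker_rank;
    MPI_Comm_rank(world, &world_rank);
    MPI_Comm_rank(workers, &worker_rank);
    printf("world rank %d is worker rank %d\n", world_rank, worker_rank);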

Now the program splits into two parts. The first part, which corresponds to the first clause of the if statement, describes the server's activities; the second part describes the workers' activities.

The server enters a loop, within which it waits for a request, i.e., a message that may arrive from MPI_ANY_SOURCE, with the tag number set to REQUEST, sent to it within the world communicator. The message will contain just one integer. If the integer is greater than zero (this means TRUE in the C language), the server process is going to generate CHUNKSIZE random integers between 0 and INT_MAX (which is defined in limits.h). The integers will be written on rands, which is then going to be sent back to the requesting process of rank stat.MPI_SOURCE with the tag number set to REPLY. The communication still takes place within the world communicator, because the server does not belong to any other communicator.

The server is going to loop until someone sends it request == 0. Sending request == 0 to the server terminates the server, which then goes right to MPI_Finalize to wait for other processes.

  if(myid == server) {
    do {
      MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, REQUEST, world, &stat);
      if (request) {
        for (i = 0; i < CHUNKSIZE; i++)
          rands[i] = random();
        MPI_Send(rands, CHUNKSIZE, MPI_INT, stat.MPI_SOURCE, REPLY, world);
      }
    }
    while (request > 0);
  }

The worker's life is a little more complicated. The workers are the ones who have to make various decisions in this program. They begin their lives by sending a request to the server for a string of random integers.

  else {
    request = 1;
    done = in = out = 0;
    max = INT_MAX;
    MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);

Then they find out their rank numbers within the workers communicator. But in this program they are not going to make any use of them, so we could just as well skip this call to MPI_Comm_rank:

    MPI_Comm_rank(workers, &workerid);

Now the iteration process begins. The iterations are going to be repeated until the job is done, i.e., until $\pi$ has been evaluated with precision better than $\epsilon$, or until the total number of sand grains thrown at the square has exceeded TOTAL. Each iteration begins by receiving a string of random integers from the server:

    iter = 0;
    while(!done) {
      iter++;
      request = 1;
      MPI_Recv(rands, CHUNKSIZE, MPI_INT, server, REPLY, world, &stat);
Having received the numbers, the worker process ``throws'' them at the square. We take two random integers from the string. Then we renormalize them to reals that fit between -1 and 1 and call them x and y. Then the process checks whether the sand grain so produced hits the circle or the part of the square that's outside it, by checking if $x^2 + y^2 < 1$. If we have scored a hit, we increment the in counter; if we have missed, we increment the out counter.
      for (i=0; i < CHUNKSIZE; ) {
        x = (((double) rands[i++])/max) * 2 - 1;
        y = (((double) rands[i++])/max) * 2 - 1;
        if (x*x + y*y < 1.0)
          in++;
        else
          out++;
      }
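To see that the renormalization does what we want, observe that rands[i] ranges between 0 and max = INT_MAX. Dividing by max maps it onto $[0, 1]$, multiplying by 2 stretches this to $[0, 2]$, and subtracting 1 finally shifts it to $[-1, 1]$.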
Each process could evaluate $\pi$ on its own, but the evaluation will be more accurate if they can add all their scores. This is done by calling the MPI_Allreduce function, which is like MPI_Reduce, which we have encountered already, but with the difference that the result of the reduction operation is distributed to all participating processes. Remember that in MPI_Reduce only the root process ends up with the result.
      MPI_Allreduce(&in, &totalin, 1, MPI_INT, MPI_SUM, workers);
      MPI_Allreduce(&out, &totalout, 1, MPI_INT, MPI_SUM, workers);
This communication takes place within the workers communicator. This is the only place where the workers communicate with each other. Because the communication is always collective, they don't need to know their rank numbers within the workers communicator.
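To make the difference between the two reductions concrete, here is a small illustrative fragment, not taken from the original program: with MPI_Reduce only the root process (rank 0 within workers) ends up with the sum, whereas with MPI_Allreduce every process in workers does.

      /* Illustrative fragment (not from the original program). */
      int mine = 1, sum_at_root = 0, sum_everywhere = 0;
      /* Only rank 0 within workers receives the result here ... */
      MPI_Reduce(&mine, &sum_at_root, 1, MPI_INT, MPI_SUM, 0, workers);
      /* ... whereas here every process within workers receives the sum. */
      MPI_Allreduce(&mine, &sum_everywhere, 1, MPI_INT, MPI_SUM, workers);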

Now we can evaluate our estimate of $\pi$. totalin + totalout is the total number of sand grains thrown at the square, whereas totalin is the total number of sand grains that have landed within the circle:

      Pi = (4.0*totalin)/(totalin + totalout);
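The formula follows from the areas involved: the sand grains are spread uniformly over the square $[-1,1] \times [-1,1]$, whose area is 4, while the circle inscribed in it has radius 1 and area $\pi$. The fraction of grains landing inside the circle therefore approaches $\pi/4$, which is why the ratio totalin/(totalin + totalout) is multiplied by 4.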
Since we know from elsewhere what the value of $\pi$ is, we can easily evaluate the accuracy of our Monte Carlo experiment:
      error = fabs( Pi-3.141592653589793238462643);
and use this result in checking if the job is done:
      done = ((error < epsilon) || ((totalin+totalout) > TOTAL));
Now we use this funny C-language construct, which encodes if (done) request = 0; else request = 1;, and the process of rank zero sends the request to the server after it has printed the value of $\pi$ on standard output. This value is printed in such a way that it overwrites the previously printed value: the message begins with a carriage return and there is no newline.

The reason why the process of rank zero sends the request to the server separately from the other processes is that the request may be a termination request, in which case the other processes should not send their own termination requests: there wouldn't be anyone at the server's end to receive them. The other processes send the request to the server only if there is more work to be done:

      request = (done) ? 0 : 1;
      if (myid == 0) {
        printf( "\rpi = %23.20lf", Pi);
        MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
      }
      else {
        if (request)
          MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
      }
    }
And this ends the while (!done) loop. There is one more thing that the worker processes are going to do before they quit. They free their communicator by calling MPI_Comm_free:
    MPI_Comm_free(&workers);
  }
Having released themselves from the workers communicator, they go to MPI_Finalize to wait for each other. In the meantime, the process of rank 0 (within the world communicator) prints the total number of sand grains thrown at the square on standard output.
  if (myid == 0) 
    printf( "\npoints: %d\nin: %d, out: %d\n", 
            totalin+totalout, totalin, totalout);
  MPI_Finalize();
  exit (0);
}

