The program begins with the usual MPI incantations that result in
every process finding about its place within the pool and about
the size of the pool of processes. A process whose rank is
MASTER_RANK assumes the ``leadership'' of the pool,
which in this program only means that it's going to write stuff
on standard output.
MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &pool_size); MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); if (my_rank == MASTER_RANK) i_am_the_master = TRUE;Now we find out how many particles each process will have to deal with. This is simply going to be
MAX_PARTICLES/pool_size.
If MAX_PARTICLES does not divide by pool_size exactly,
we're going to work with fewer particles in this program. In more
elaborate codes, you may wish to give some processes more particles.
This code uses tools that allow for non-uniform particle
distribution amongst processes.
particle_number = MAX_PARTICLES / pool_size;
if (i_am_the_master)
printf ("%d particles per processor\n", particle_number);
The next step is to define a new MPI data type, so that
we send and receive the whole particle as an individual
data item. A particle in this code is defined as a structure:
typedef struct {
double x, y, z, vx, vy, vz, ax, ay, az, mass, charge;
} Particle;
whose slots are position coordinates, x, y, z, velocity in xy and z direction,
vx, vy, vz, accelerations in x, y and
z directions,
ax, ay, az, mass and charge. All particles
are arranged into an array:Particle particles[MAX_PARTICLES]; /* Particles on all nodes */How are we to tell MPI what a
Particle is? In order to do this
we must know how the C-language is going to store the structure.
It would be good if we could simply pass Particle to MPI
as an argument, but Particle is not a variable. It is a type,
and types cannot be passed to functions as variables.
In this program we assume that C is going to store all components
of structure Particle contiguously (if it doesn't, the program
is going to crash - this is the problem with MPI: it's too low level).
The components are all
of the same type double and there are 11 of them. So
we tell MPI, by calling function
MPI_Type_contiguous,
that we are going to operate on chunks of 11
contiguously laid out doubles
and we want MPI to store its own type description of a chunk
like this on a
variable called particle_type, which is of type
MPI_Datatype:
MPI_Type_contiguous ( 11, MPI_DOUBLE, &particle_type ); MPI_Type_commit ( &particle_type );Once an MPI type has been defined, it must be committed by calling MPI_Type_commit. This call gives the MPI system a chance to compile this information on-the-fly, so that it can be used efficiently in communication calls that follow.
Now we encounter a new fancy communication between processes. We call function MPI_Allgather to tell each process how many particles each other process is going to look after.
MPI_Allgather ( &particle_number, 1, MPI_INT, counts, 1, MPI_INT,
MPI_COMM_WORLD );
This call works as follows. Every process contributes an array given
by the first argument. The array comprises a number of elements
given by the second argument and of type given by the third argument.
These arrays are then put together to form a new array given by the
fourth argument. In this operation the types of individual data items
in the contributed arrays may be reinterpreted (cast) and this, in turn,
may also change the number of items that are received from each process.
The number of received
items is therefore given in the fifth argument and the type of the received
items is given in the sixth argument. I don't know how much call there is
for such re-interpretations of transmitted data and I wouldn't use this
trick in my own programs, but MPI provides for it, if the programmer
wants to play with fire. Observe that MPI_Allgather fills the
array counts (in this case) for all participating processes.
There is another similar function called
MPI_Gather
which performs a gather operation like MPI_Allgather but
leaves the result on a root process only. Function
MPI_Gather has one more argument, the rank of the root process, which
is placed just before the communicator.
In this specific program every process ends up looking after the same number
of particles, so it is not strictly necessary to call MPI_Allgather,
but remember that I promised that the infrastructure of the program
would let you assign different numbers of particles to processes. If these
numbers were to be arrived at by some tricky algorithm that each process would
perform for itself and that would yield different numbers of particles
for each process, we would need MPI_Allgather to send this information
to all other processes.
In the following section of the code we find about the portions of
array particles that each process is going to be responsible
for. We construct an array called displacements. For
process of rank 0 its first particle is going to be
particles[displacements[0]]. For process of rank 1 its first particle
is going to be particles[displacements[1]] and so on. The
specific displacement for ``me'' is going to be called my_offset
and it is simply displacements[my_rank]:
displacements[0] = 0;
for (i = 1; i < pool_size; i++)
displacements[i] = displacements[i-1] + counts[i-1];
total_particles = displacements[pool_size - 1]
+ counts[pool_size - 1];
if (i_am_the_master)
printf ("total number of particles = %d\n", total_particles);
my_offset = displacements[my_rank];
Here is the example of calling
function MPI_Gather. Every process sends a one-item long
array to the root process. The array contains simply my_offset.
The root process assembles these contributions into an array offsets,
which is then printed on standard output.
MPI_Gather ( &my_offset, 1, MPI_INT, offsets, 1, MPI_INT, MASTER_RANK,
MPI_COMM_WORLD );
if (i_am_the_master) {
printf ("offsets: ");
for (i = 0; i < pool_size; i++)
printf ("%d ", offsets[i]);
printf("\n");
}
This part of the code doesn't really do anything important, other than
illustrating the use of MPI_Gather.
Now each process initializes its own particles. It seeds the random number generator with its rank number and then initializes x, yand z coordinates of its particles to random double precision numbers between 0 and 1.
srand48((long) (my_rank + 28));
/* Here each process initializes its own particles. */
for (i = 0; i < particle_number; i++) {
particles[my_offset + i].x = drand48();
particles[my_offset + i].y = drand48();
particles[my_offset + i].z = drand48();
particles[my_offset + i].vx = 0.0;
particles[my_offset + i].vy = 0.0;
particles[my_offset + i].vz = 0.0;
particles[my_offset + i].ax = 0.0;
particles[my_offset + i].ay = 0.0;
particles[my_offset + i].az = 0.0;
particles[my_offset + i].mass = 1.0;
particles[my_offset + i].charge = 1.0 - 2.0 * (i % 2);
}
Now the real computation begins. The first thing we do is to
initialize timers. We are going to measure the total wall clock time
used by the program to perform its N_OF_ITERATIONS time
steps and, separately, time spent communicating, i.e., time spent
within the MPI function calls.
start_time = MPI_Wtime(); comm_time = 0.0;The iteration loop begins with a timed call to function MPI_Allgatherv.
for (count = 0; count < N_OF_ITERATIONS; count++) {
if (i_am_the_master) printf("Iteration %d.\n", count + 1);
/* Here processes exchange their particles with each other. */
start_comm_time = MPI_Wtime();
MPI_Allgatherv ( particles + my_offset, particle_number,
particle_type,
particles, counts, displacements, particle_type,
MPI_COMM_WORLD );
end_comm_time = MPI_Wtime();
comm_time += end_comm_time - start_comm_time;
This is a yet another variant of a gather operation. It
differs from MPI_Allgather by allowing processes to contribute
arrays of different length. Remember that all contributions
in MPI_Allgather had to be of the same length. But here every
process contributes information about its particles to the pool and
the number of particles the processes look after may differ
from a process to a process (although in this simple program
they do not). Hence the need to use MPI_Allgatherv
rather than MPI_Allgather.
Each process contributes particles from the location in the array
given by particles + my_offset. This is an example of
pointer arithmetic. The meaning of this hideous, dangerous and
potentially misleading expression is &(particles[my_offset]),
and you could use the latter expression in this context too.
The length of the contributed array is particle_number (and
this number may be now different for each process) and
the type of the items in the array is particle_type.
So far this looks much the same as what we have seen in MPI_Allgather.
But now we have to assemble these contributions into array particles
keeping in mind that we may have received different numbers of particles
from participating processes. The two arrays that follow particles
in the arguments list specify how many items (count) of
type particle_type should be placed at the location
given by an entry in the array displacements. And so,
count[0] items of particle_type should go into the
location pointed by displacements[0] in the array particles,
count[1] items of particle_type should go into
the location pointed by displacements[1] in the array particles,
and so on.
After the operation completes, every process is going to have data pertaining
to all particles in its local copy of the array particles.
Now we can finally commence the computation. The i particles are the ones I am reponsible for (``I'' as a process, this is).
for (i = 0; i < particle_number; i++) {
true_i = i + my_offset;
/* initialize accelerations to zero */
particles[true_i].ax = 0.0;
particles[true_i].ay = 0.0;
particles[true_i].az = 0.0;
We begin by setting accelerations for the i particle to zero, i.e.,
for (j = 0; j < total_particles; j++) {
/* Do not evaluate interaction with yourself. */
if (j != true_i) {
/* Evaluate forces that j-particles exert on the i-particle. */
double dx, dy, dz, r2, r, qj_by_r3;
/* Here we absorb the minus sign by changing the order
of i and j. */
dx = particles[true_i].x - particles[j].x;
dy = particles[true_i].y - particles[j].y;
dz = particles[true_i].z - particles[j].z;
r2 = dx * dx + dy * dy + dz * dz; r = sqrt(r2);
Here we have evaluated
ri - rj = rji
= -rij, this
is what sits in dx, dy, dz. We have also evaluated
rji2 = rij2, this is what sits in r2 and
rji = rij, this is what sits in r.
It may happen that rji is so small that
qj/rji3 overflows.
Here we implement a very primitive quenching mechanism. If the particles
are closer to each other than
,
we switch the force between
them off. This is not physical, but it is safe. Real plasma codes
implement more elaborate means of dealing with this problem. If the
particles are not too close we evaluate
/* Quench the force if the particles are too close. */
if (r < EPSILON) qj_by_r3 = 0.0;
else qj_by_r3 = particles[j].charge / (r2 * r);
Now we can finally sum up the contributions from j particles to
the acceleration that acts on the i particle:
particles[true_i].ax += qj_by_r3 * dx;
particles[true_i].ay += qj_by_r3 * dy;
particles[true_i].az += qj_by_r3 * dz;
}
}
}
Having evaluated the accelerations using the old positions
of the i and j particles, we can now advance velocities of the
i particles,
and then advance their positions as well. We do this within the next
for(i loop:
/*
* We advance particle positions and velocities only *after*
* we have evaluated all accelerations using the *old* positions.
*/
for (i = 0; i < particle_number; i++) {
double qidt_by_m, dt_by_2, vx0, vy0, vz0;
true_i = i + my_offset;
But first we save current velocities
of the particles on auxiliary variables, because we will need them
to advance particle positions:
/* Save old velocities */
vx0 = particles[true_i].vx;
vy0 = particles[true_i].vy;
vz0 = particles[true_i].vz;
And now we implement the formula:
/* Now advance the velocity of particle i */
qidt_by_m = particles[true_i].charge * dt / particles[true_i].mass;
particles[true_i].vx += particles[true_i].ax * qidt_by_m;
particles[true_i].vy += particles[true_i].ay * qidt_by_m;
particles[true_i].vz += particles[true_i].az * qidt_by_m;
Finally, we push the i particle, i.e., advance their positions
using the formula:
/* Use average velocity in the interval to advance the particles'
positions */
dt_by_2 = 0.5 * dt;
particles[true_i].x += (vx0 + particles[true_i].vx) * dt_by_2;
particles[true_i].y += (vy0 + particles[true_i].vy) * dt_by_2;
particles[true_i].z += (vz0 + particles[true_i].vz) * dt_by_2;
}
This ends the computation within the top level for(count loop.
We repeat this loop for
N_OF_ITERATIONS times - thus performing N_OF_ITERATIONS
time steps.
Having completed the computation we meet at the barrier and the master process prints on standard output how much time it spent within MPI communications and how much time it spent computing.
MPI_Barrier(MPI_COMM_WORLD);
end_time = MPI_Wtime();
if (i_am_the_master) {
printf ("Communication time %8.5f seconds\n", comm_time);
printf ("Computation time %8.5f seconds\n",
end_time - start_time - comm_time);
printf ("\tEvaluated %d interactions\n",
N_OF_ITERATIONS * total_particles * (total_particles - 1));
}
MPI_Finalize ();
exit(0);
}
Then all processes meet again on MPI_Finalize and exit.