
The Assignment

The starting point for the assignment is the program you have developed in exercise 4 in section 5.4.3. But this time the computation is going to be a little different.

1.
Assume both matrix and temp to be real, not integer.
2.
Let every term of temp be constructed simply as follows:
temp[i][j] = 0.25 * (matrix[i-1][j] + matrix[i+1][j] +
                     matrix[i][j-1] + matrix[i][j+1]);
3.
Assume that the boundary conditions for the whole region are fixed, i.e., they do not change throughout the computation. You will have to make sure that you don't overwrite them accidentally. Assume 500.0 on the left boundary and 0.0 on the remaining three boundaries of the region. A sketch of this initialization appears after this list.
4.
The colours provided by MPE range from MPE_WHITE = 0 to MPE_GRAY = 15. Display the values of matrix in such a way that 500.0 corresponds to 15 and 0.0 corresponds to 0. Make sure that you never stray above 15 or below 0 and that the colour values are indeed integers, otherwise MPE will throw an error. A sketch of such a mapping appears after this list.
5.
Add timing to the program, as discussed in section 4.4. Measure the time used for the computation and for the generation of the data file. Develop the logic of the program so that it obtains the time allocated to the run from the environment or from the command line, and dumps the checkpoint file and the log file before its time runs out. The program should log whether the computation has completed or needs to be continued. The computation is complete when the largest difference between temp[i][j] and matrix[i][j] over the whole computational region is less than $\epsilon$, the value of which should be obtained from the command line or from the environment. A sketch of the convergence test appears after this list.
6.
Develop a PBS script that is going to submit and resubmit the run until the computation has been completed.
7.
Test all functions developed above carefully. Provide a man page for the program describing all its features, possible bugs, command line arguments, etc. Provide a README file that describes in detail the compilation and installation of the program.
8.
The program itself should be meticulously commented so that its reader can understand your every step.
The assignment should be completed no later than the 19th of December.
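
To illustrate point 3, here is a minimal sketch of how the fixed boundary values might be set up. It assumes, purely for illustration, a local block of nrows+2 by ncols+2 real entries whose outermost rows and columns form the boundary; the function name and the argument names are mine, not names taken from life_g.c, and in the MPI version only the processes that actually hold a piece of the physical boundary would set it.

  /* Illustrative sketch: set_boundaries, nrows and ncols are assumed
     names, not taken from life_g.c.  Column 0 is the left boundary,
     held at 500.0; everything else starts at 0.0. */
  void set_boundaries(double **matrix, int nrows, int ncols)
  {
    int i, j;
    for (i = 0; i < nrows + 2; i++)
      for (j = 0; j < ncols + 2; j++) {
        if (j == 0)
          matrix[i][j] = 500.0;   /* left boundary, fixed at 500.0 */
        else
          matrix[i][j] = 0.0;     /* other boundaries and the interior */
      }
  }

Because the relaxation sweep of point 2 runs over the interior indices only, i = 1, ..., nrows and j = 1, ..., ncols, it never overwrites these boundary values.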
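
For point 4, a possible mapping from a real field value onto an MPE colour index is sketched below. It scales the range 0.0 to 500.0 linearly onto 0 to 15 and clamps the result, so that the value handed to the MPE drawing routine is always an integer within the legal range. The function name colour_of is mine, not MPE's.

  /* Sketch: map a field value in [0.0, 500.0] onto an integer colour
     between MPE_WHITE = 0 and MPE_GRAY = 15.  colour_of is an
     illustrative name. */
  int colour_of(double value)
  {
    int colour = (int)(value * 15.0 / 500.0);
    if (colour < 0)  colour = 0;    /* never stray below MPE_WHITE */
    if (colour > 15) colour = 15;   /* never stray above MPE_GRAY  */
    return colour;
  }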
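
For point 5, the convergence test might look roughly as follows. The sketch assumes that each process finds the largest local difference between temp and matrix over its interior points and that the global maximum is then obtained with MPI_Allreduce; the variable names, and reading $\epsilon$ from an EPSILON environment variable, are my illustrative choices. The fragment needs stdlib.h, math.h and mpi.h.

  /* Sketch only: nrows, ncols and the EPSILON environment variable
     are assumptions. */
  int    i, j, converged;
  double epsilon = 1.0e-3, local_max = 0.0, global_max, diff, t0, t_compute;
  char   *env;

  env = getenv("EPSILON");                 /* tolerance from the environment */
  if (env != NULL) epsilon = atof(env);    /* ... or parse it from argv instead */

  t0 = MPI_Wtime();                        /* start timing the computation */
  /* ... the relaxation sweep of point 2 goes here ... */
  for (i = 1; i <= nrows; i++)
    for (j = 1; j <= ncols; j++) {
      diff = fabs(temp[i][j] - matrix[i][j]);
      if (diff > local_max) local_max = diff;
    }
  MPI_Allreduce(&local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX,
                MPI_COMM_WORLD);
  converged = (global_max < epsilon);      /* record this in the log file */
  t_compute = MPI_Wtime() - t0;

The same MPI_Wtime bracketing can be placed around the generation of the data file, and the measured times, compared against the time allocated to the run, tell the program when to dump the checkpoint file and stop.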

Method of Delivery
The program, its man page, the corresponding PBS scripts, and a README file that describes the whole lot - explaining how to compile, install and run the program, both interactively and under PBS - should be placed in a selected directory in your AVIDD $HOME. You should then e-mail the location of that directory to me, so that I can go there and check your work.

Once you have laboured enough on this assignment, you may wish to look at how this program can be written in High Performance Fortran. Connect to http://beige.ucs.indiana.edu/P573/node113.html. You will see there that the whole computation, which is carried out in parallel, is captured in a mere six lines of the following loop:

  do i = 1, iterations
     where (mask)
        field = (eoshift(field, 1, dim=1) + eoshift(field, -1, dim=1) &
               + eoshift(field, 1, dim=2) + eoshift(field, -1, dim=2)) * 0.25
     end where
  end do

Additional Comments
Pointers and arrays are not the same thing in C, even though they are often treated as if they were, and this usually leads to problems. Furthermore, multidimensional arrays in C are really arrays of arrays. Consider the following code taken from life_g.c:
  int    **matrix, **temp, **addr ;
...
  matrix = (int **)malloc(sizeof(int *)*(mysize+2)) ;
  temp = (int **)malloc(sizeof(int *)*(mysize+2)) ;
  for (i = 0; i < mysize+2; i++) {
    matrix[i] = (int *)malloc(sizeof(int)*(matrix_size+2)) ;
    temp[i] = (int *)malloc(sizeof(int)*(matrix_size+2)) ;
  }
Observe that for every matrix[i] we call a separate malloc here. This means that you cannot assume that the data pointed to by matrix is going to be laid out contiguously in the computer's memory. Consequently, if you then describe it to MPI in terms of a stride and some matrix geometry, thinking of matrix as an array of rank 2, your description will not match reality: when MPI retrieves data based on this description, it will read the wrong data from the wrong locations, and it will write to the wrong locations too. You may even end up crashing the process if you attempt the write.

This, however, is not the case for statically declared and defined arrays. If you define:

  double  matrix[500][500];
then the compiler will create a contiguous space for the $500\times500$ doubles - columns will really be columns, and the ``distance'' between the rows, i.e., the stride, will be the same for every pair of adjacent rows.
Although GNU and other antiquated C compilers may fail on large statically allocated arrays with a segmentation fault, this should not happen if you use the Intel compiler. The MPI software installed in my $HOME is compiled with the Intel compiler.
How then can you create an array of rank 2 or higher dynamically, so that all the data in the array is laid out contiguously - the same way it would be for a statically declared array? The answer is that you cannot do this in C. C is not a good language for working with multidimensional arrays. Instead you have to malloc a single array of rank 1, i.e., a long vector, and then design your own pointer arithmetic for it, so that a pair of indexes, [i][j], translates into, say, i * NCOL + j. A minimal sketch of this approach follows.
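
The sketch below allocates one contiguous block and hides the index arithmetic behind a macro; the names field, NROW and IDX are mine, chosen only for illustration, and NCOL plays the same role as above.

  /* Sketch: a single malloc yields one contiguous block of NROW * NCOL
     doubles, and IDX translates a pair of indexes into an offset into
     that block. */
  #define IDX(i, j) ((i) * NCOL + (j))

  double *field;
  field = (double *)malloc(sizeof(double) * NROW * NCOL);
  if (field == NULL) {
    /* handle the allocation failure */
  }

  field[IDX(3, 7)] = 500.0;   /* plays the role of field[3][7] */

The price is that you write field[IDX(i, j)] instead of field[i][j], but in return the stride between adjacent rows is always NCOL doubles, so a description of the array given to MPI in terms of that stride matches the actual layout of the data in memory, and the whole field can be written to a file as one contiguous block.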


Zdzislaw Meglicki
2004-04-29