So far we have used the implementation of the Euler gamma function,
,
provided with the Solaris version of Fortran (actually,
its logarithm). This
makes the code non-portable. For example, we couldn't find an equivalent
of Euler gamma function in IRIX-6.3 Fortran. Even if we could, the name
of the subroutine could well be different.
But the world doesn't end on Solaris. Today, as well as in future, you may have to work with a variety of operating systems and hardware architectures: VMS is still around, so is MVS, and the latter is likely to go on for many years to come.
NT for all practical purposes is restricted to just one hardware architecture and aimed specially at small offices and enterprises. This makes it an unattractive system to work with, if you are a scientist or an engineer, because neither Intel hardware nor Microsoft software are particularly sympathetic towards the needs of scientists and engineers, who represent a niche market. But for one reason or another you may come across those systems in your professional career too. In may institutions and companies clerks and bureaucrats decide about institutional purchases, and, like everybody else, they are likely to be swayed by salesman pitch, propaganda, and whatever box they happen to have sitting on their own desk. Once upon a time Intel boxes used to be markedly cheaper than UNIX workstations. Today this is no longer the case, but many people haven't woken up to this fact yet.
Macs are coming back, and many scientists and engineers like them, because they are easy to use, even though they are just as clumsy and inflexible when it comes to doing science as NT.
Finally, we can't exclude that new architectures and operating systems may appear in future. None of the stuff that we see on the market today is ideal, and there is no reason why people shouldn't work on better hardware architectures and better operating systems.
Anyhow, if you want your Fortran program to compile and work seamlessly across all platforms, you should provide your own definitions for all functions and subroutines called by the program that are not a part of ANSI Fortran.
But there is also another reason to do so. The Euler gamma function is a very complicated function with poles at 0 and all negative integers, defined in the whole complex plane (with the exception of the poles, that is). There are many expressions that approximate the gamma function in various regimes quite well, or sometimes extraordinarily well. A general purpose gamma function provided, for example, with NAG or IMSL libraries, may have numerous branches and invoke various evaluation techniques depending on the argument. If you have to call such a gamma function very frequently, the cost of doing so may be quite high. So, if you know that your arguments are going to be restricted to a certain narrow range, you may be better off writing your own lightweight implementation.
In the case of our program, we really want an inverse of the gamma function and not the gamma function itself. The inverse is a lot easier to calculate, because it is a very regular function without any poles and with only a very moderate ``pulsing'' growth for x < 0.
There are many ways to calculate
.
For example we
could use Euler's infinite product
![]() |
(2.35) |
| (2.36) |
A Taylor expansion of this formula around z = 0
can be found in ``Tables of Higher Mathematical
Functions'', by H. T. Davis (later corrected by H. E. Salzer),
published here in Bloomington, Indiana,
in 1933 and in 1935. The expansion looks as follows:
| c1 | = | 1 | . |
|
| c2 | = | 0 | . |
|
| c3 | = | -0 | . |
|
| c4 | = | -0 | . |
|
| c5 | = | 0 | . |
|
| c6 | = | -0 | . |
|
| c7 | = | -0 | . |
|
| c8 | = | 0 | . |
|
| c9 | = | -0 | . |
|
| c10 | = | -0 | . |
|
| c11 | = | 0 | . |
|
| c12 | = | -0 | . |
|
| c13 | = | -0 | . |
|
| c14 | = | 0 | . |
|
| c15 | = | -0 | . |
|
| c16 | = | 0 | . |
|
| c17 | = | 0 | . |
|
| c18 | = | -0 | . |
|
| c19 | = | 0 | . |
|
| c20 | = | 0 | . |
|
| c21 | = | -0 | . |
|
| c22 | = | 0 | . |
|
| c23 | = | -0 | . |
|
| c24 | = | -0 | . |
|
| c25 | = | 0 | . |
|
| c26 | = | 0 | . |
|
| (2.38) |
The Bloomington expansion may well have been someone's master's dissertation. Recall that in those days all computations would have to be done by hand, or, at best, by using a mechanical calculator that could do additions and subtractions quite well, but was rather cumbersome when it came to multiplications and divisions. For example, to multiply a number by 3, you would have to turn its knob three times, and to divide by 3, you would have to turn the knob three times in the other direction.
Figure 2.17 shows an implementation of this function.
We will not use the Bloomington expansion in our computations, but the program shown in Figure 2.17 illustrates the use of array constructors in Fortran. You can either have a verbatim constructor, which, for an integer array, k, of dimension 5, say, could look as follows:
k = (/ 1, 2, 3, 4, 5 /)This statement assigns 1 to k(1), 2 to k(2), and so on until 5, which is assigned to k(5).
The same effect can be accomplished by using the so called ``implied DO'' array constructor:
k = (/ (i, i=1, 5) /)
Lisp programmers should recognise Fortran array constructors as related to array constructors in Common Lisp , although markedly less verbose. In Common Lisp we could say, for example
>(make-array 5 :initial-contents '(1 2 3 4 5))
#(1 2 3 4 5)
>(make-array 5 :initial-contents
(do ((mylist nil)
(i 1 (1+ i)))
((> i 5) mylist)
(setq mylist (append mylist (list i)))))
#(1 2 3 4 5)
>
You should appreciate how much faster and cleaner
k = (/ (i, i=1, 5) /) is compared to the Lisp (do ...) form.
On the other hand the Lisp way has a great flexibility about it: you can
insert various Lisp forms including (do ...) basically anywhere
within a function call.
Press, Flannery, Teukolsky, and Vetterling
recommend a formula by
Lanczos , which is a lot cheaper
to compute than the Bloomington
expansion and which is a lot more accurate too. The formula looks
as follows:
| = | ![]() |
||
![]() |
(2.39) |
For
,
n = 6 (in the Lanczos formula), and for
a specific choice of ck (see the code in Figure 2.18),
we get that
everywhere
in the half-complex plane
Re(z) > 0.
Figure 2.18 shows the implementation of this formula as a
Fortran function my_gamma.
![]() |
Although it is common to calculate
on entry to
a subroutine by calling
(which in Fortran
would be 2*acos(0.0)), this is costly, especially
if the function has to be invoked many times. It is easy enough
to evaluate
with desired accuracy using,
for example, Maxima:
(C2) ev(sqrt(2 * %pi), numer); (D2) 2.5066282746310002 (C3)You can also invoke Octave , which we haven't talked about yet, but which, together with Matlab, we'll begin using quite soon. Octave is a program somewhat similar to Maxima, but specialising in numerical mathematics rather than in symbolic mathematics. In Octave you would calculate
octave:4> printf("%20.16f\n", sqrt(2*pi));
2.5066282746310002
octave:5>
We can test our implementation of
against the Solaris
implementation by running the following simple Fortran program:
program try
use euler
integer, parameter :: n = 25
real (kind=long), parameter :: x_zero = 1.0_long
real (kind=long) :: x
integer :: i
do i = 1, n
x = x_zero + 0.4_long * i
write(*, '(1f6.1, 2e20.10)') x, gamma(x), my_gamma(x)
end do
end program try
Figure 2.19 shows the result of the run.
![]() |
Function my_gamma illustrated in Figure 2.18 is
not optimal, because it involves two exponentiations and
three multiplications. If we were
to evaluate
instead of
itself, we could save some time by replacing logarithms of multiplications
with additions of logarithms and exponentiations with multiplications.
Figure 2.20 shows the implementation of
a function that calculates
.
The following simple test program can be used to time calls to
my_gamma versus calls to my_lgamma:
program try
use euler
integer, parameter :: n = 1000000
real (kind=long), parameter :: x_zero = 1.0_long
real (kind=long) :: x, throw_away
integer :: i
do i = 1, n
x = x_zero + 0.00001_long * i
! throw_away = exp(my_lgamma(x))
throw_away = my_gamma(x)
end do
end program try
This program calls either my_gamma or my_lgamma million
times. The
d_lgamma This level of optimisation can be achieved only by coding the function directly in the assembler or by utilising its hardware implementation. On many UNIX systems, especially the ones aimed directly at scientific and engineering markets, this may frequently be the case.
Similarly, many window operations are assembler-level or even hardware-level optimised under NT and MacOS.