<< Chapter < Page Chapter >> Page >

The above example actually highlights one of the challenges in producing an efficient implementation of FORTRAN 90. If these arrays contained 10 million elements, and the compiler used a simple approach, it would need 30 million elements for the old "left" values, the old "right" values, and for the new values. Data flow optimization is needed to determine just how much extra data must be maintained to give the proper results. If the compiler is clever, the extra memory can be quite small:

Data alignment and computations

this figure shows four rows of connected boxes. The first row is labeled, Left Values, and the boxes are numbered from 1 to 10, with 9 and 10 shaded grey. to the right of the row is the label, ROD (1:8). Below this row is the label, ADD.  The second row is labeled, Right Values, and is numbered from 1 to 10, with 1 and 2 shaded grey. to the right of the row is the label, ROD (3:10). Below this row is the label, Divide the sum by 2. The third row is labeled, temporary space, and the boxes are numbered from 1 to 8. Below boxes 1, 2, 7, and 8 are arrows pointing down at the fourth row. Below the third row is the label, and THEN assign. The fourth row is labeled, Results, and is numbered from 1 to 10, with boxes 1 and 10 shaded grey. To the right of the row is the label, ROD (2:9)

SAVE1 = ROD(1) DO I=2,9SAVE2 = ROD(I) ROD(I) = (SAVE1 + ROD(I+1) ) / 2SAVE1 = SAVE2 ENDDO

This does not have the parallelism that the full red-black implementation has, but it does produce the correct results with only two extra data elements. The trick is to save the old "left" value just before you wipe it out. A good FORTRAN 90 compiler uses data flow analysis, looking at a template of how the computation moves across the data to see if it can save a few elements for a short period of time to alleviate the need for a complete extra copy of the data.

The advantage of the FORTRAN 90 language is that it's up to the compiler whether it uses a complete copy of the array or a few data elements to insure that the program executes properly. Most importantly, it can change its approach as you move from one architecture to another.

Fortran 90 versus fortran 77

Interestingly, FORTRAN 90 has never been fully embraced by the high performance community. There are a few reasons why:

  • There is a concern that the use of pointers and dynamic data structures would ruin performance and lose the optimization advantages of FORTRAN over C. Some people would say that FORTRAN 90 is trying to be a better C than C. Others would say, "who wants to become more like the slower language!" Whatever the reason, there was some controversy when FORTRAN 90 was implemented, leading to some reluctance in adoption by programmers. Some vendors said, "You can use FORTRAN 90, but FORTRAN 77 will always be faster."
  • Because vendors often implemented different subsets of FORTRAN 90, it was not as portable as FORTRAN 77. Because of this, users who needed maximum portability stuck with FORTRAN 77.
  • Sometimes vendors purchased their fully compliant FORTRAN 90 compilers from a third party who demanded high license fees. So, you could get the free (and faster according to the vendor) FORTRAN 77 or pay for the slower (wink wink) FORTRAN 90 compiler.
  • Because of these factors, the number of serious applications developed in FORTRAN 90 was small. So the benchmarks used to purchase new systems were almost exclusively FORTRAN 77. This further motivated the vendors to improve their FORTRAN 77 compilers instead of their FORTRAN 90 compilers.
  • As the FORTRAN 77 compilers became more sophisticated using data flow analysis, it became relatively easy to write portable "parallel" code in FORTRAN 77, using the techniques we have discussed in this book.
  • One of the greatest potential benefits to FORTRAN 90 was portability between SIMD and the parallel/vector supercomputers. As both of these architectures were replaced with the shared uniform memory multiprocessors, FORTRAN 77 became the language that afforded the maximum portability across the computers typically used by high performance computing programmers.
  • The FORTRAN 77 compilers supported directives that allowed programmers to fine-tune the performance of their applications by taking full control of the parallelism. Certain dialects of FORTRAN 77 essentially became parallel programming "assembly language." Even highly tuned versions of these codes were relatively portable across the different vendor shared uniform memory multiprocessors.

So, events conspired against FORTRAN 90 in the short run. However, FORTRAN 77 is not well suited for the distributed memory systems because it does not lend itself well to data layout directives. As we need to partition and distribute the data carefully on these new systems, we must give the compiler lots of flexibility. FORTRAN 90 is the language best suited to this purpose.

Fortran 90 summary

Well, that's the whirlwind tour of FORTRAN 90. We have probably done the language a disservice by covering it so briefly, but we wanted to give you a feel for it. There are many features that were not discussed. If you would like to learn more, we recommend FORTRAN 90 Explained , by Michael Metcalf and John Reid (Oxford University Press) .

FORTRAN 90 by itself is not sufficient to give us scalable performance on distributed memory systems. So far, compilers are not yet capable of performing enough data flow analysis to decide where to store the data and when to retrieve the memory. So, for now, we programmers must get involved with the data layout. We must decompose the problem into parallel chunks that can be individually processed. We have several options. We can use High Performance FORTRAN and leave some of the details to the compiler, or we can use explicit message-passing and take care of all of the details ourselves.

Questions & Answers

Is there any normative that regulates the use of silver nanoparticles?
Damian Reply
what king of growth are you checking .?
What fields keep nano created devices from performing or assimulating ? Magnetic fields ? Are do they assimilate ?
Stoney Reply
why we need to study biomolecules, molecular biology in nanotechnology?
Adin Reply
yes I'm doing my masters in nanotechnology, we are being studying all these domains as well..
what school?
biomolecules are e building blocks of every organics and inorganic materials.
anyone know any internet site where one can find nanotechnology papers?
Damian Reply
sciencedirect big data base
Introduction about quantum dots in nanotechnology
Praveena Reply
what does nano mean?
Anassong Reply
nano basically means 10^(-9). nanometer is a unit to measure length.
do you think it's worthwhile in the long term to study the effects and possibilities of nanotechnology on viral treatment?
Damian Reply
absolutely yes
how to know photocatalytic properties of tio2 nanoparticles...what to do now
Akash Reply
it is a goid question and i want to know the answer as well
characteristics of micro business
for teaching engĺish at school how nano technology help us
Do somebody tell me a best nano engineering book for beginners?
s. Reply
there is no specific books for beginners but there is book called principle of nanotechnology
what is fullerene does it is used to make bukky balls
Devang Reply
are you nano engineer ?
fullerene is a bucky ball aka Carbon 60 molecule. It was name by the architect Fuller. He design the geodesic dome. it resembles a soccer ball.
what is the actual application of fullerenes nowadays?
That is a great question Damian. best way to answer that question is to Google it. there are hundreds of applications for buck minister fullerenes, from medical to aerospace. you can also find plenty of research papers that will give you great detail on the potential applications of fullerenes.
what is the Synthesis, properties,and applications of carbon nano chemistry
Abhijith Reply
Mostly, they use nano carbon for electronics and for materials to be strengthened.
is Bucky paper clear?
carbon nanotubes has various application in fuel cells membrane, current research on cancer drug,and in electronics MEMS and NEMS etc
so some one know about replacing silicon atom with phosphorous in semiconductors device?
s. Reply
Yeah, it is a pain to say the least. You basically have to heat the substarte up to around 1000 degrees celcius then pass phosphene gas over top of it, which is explosive and toxic by the way, under very low pressure.
Do you know which machine is used to that process?
how to fabricate graphene ink ?
for screen printed electrodes ?
What is lattice structure?
s. Reply
of graphene you mean?
or in general
in general
Graphene has a hexagonal structure
On having this app for quite a bit time, Haven't realised there's a chat room in it.
what is biological synthesis of nanoparticles
Sanket Reply
Got questions? Join the online conversation and get instant answers!
Jobilize.com Reply

Get the best Algebra and trigonometry course in your pocket!

Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'High performance computing' conversation and receive update notifications?