
Assuming a large value for N, the previous loop was an ideal candidate for loop unrolling. The iterations could be executed in any order, and the loop innards were small. But as you might suspect, this isn’t always the case; some kinds of loops can’t be unrolled so easily. Additionally, the way a loop is used when the program runs can disqualify it from loop unrolling, even if it looks promising.

In this section we are going to discuss a few categories of loops that are generally not prime candidates for unrolling, and give you some ideas of what you can do about them. We talked about several of these in the previous chapter as well, but they are also relevant here.

Loops with low trip counts

To be effective, loop unrolling requires a fairly large number of iterations in the original loop. To understand why, picture what happens if the total iteration count is low, perhaps fewer than 10, or even fewer than 4. With a trip count this low, the preconditioning loop does a proportionately large amount of the work. It’s not supposed to be that way. The preconditioning loop is supposed to catch the few leftover iterations missed by the unrolled, main loop. However, when the trip count is low, you make one or two passes through the unrolled loop, plus one or two passes through the preconditioning loop. In other words, you have added clutter without gaining much; the loop shouldn’t have been unrolled in the first place.
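To make the imbalance concrete, the sketch below counts the passes made by each loop. It assumes an unroll depth of 4 and a hypothetical trip count of N = 6; the program name and counters are invented for illustration, and the standard MOD intrinsic is used to compute the leftover count.

```fortran
      PROGRAM TRIPS
! Hypothetical driver: count the passes made by each loop when a
! loop with trip count N is unrolled to a depth of 4.
      INTEGER N, I, II, NPRE, NMAIN
      PARAMETER (N = 6)
      NPRE = 0
      NMAIN = 0
! Preconditioning loop: handles the MOD(N,4) leftover iterations.
      II = MOD(N,4)
      DO I = 1, II
        NPRE = NPRE + 1
      ENDDO
! Main loop: each pass stands in for a body unrolled four times.
      DO I = II+1, N, 4
        NMAIN = NMAIN + 1
      ENDDO
      PRINT *, 'preconditioning passes:', NPRE
      PRINT *, 'unrolled passes:       ', NMAIN
      END
```

With N = 6, the preconditioning loop runs twice and the unrolled loop runs once, so most of the loop control you execute belongs to the cleanup code rather than to the unrolled body.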

Probably the only time it makes sense to unroll a loop with a low trip count is when the number of iterations is constant and known at compile time. For instance, suppose you had the following loop:


      PARAMETER (NITER = 3)
      DO I=1,NITER
        A(I) = B(I) * C
      ENDDO

Because NITER is hardwired to 3, you can safely unroll to a depth of 3 without worrying about a preconditioning loop. In fact, you can throw out the loop structure altogether and leave just the unrolled loop innards:


      PARAMETER (NITER = 3)
      A(1) = B(1) * C
      A(2) = B(2) * C
      A(3) = B(3) * C

Of course, if a loop’s trip count is low, it probably won’t contribute significantly to the overall runtime, unless you find such a loop at the center of a larger loop. Then you either want to unroll it completely or leave it alone.
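As a rough illustration of the nested case, the sketch below places a three-trip loop at the center of a larger loop and then unrolls it completely. The two-dimensional arrays, the outer trip count N = 1000, and the comparison at the end are assumptions made for the example, not part of the original listing.

```fortran
      PROGRAM INNER3
! Hypothetical example: array names and sizes are made up.
      INTEGER NITER, N, I, J
      PARAMETER (NITER = 3, N = 1000)
      REAL A(NITER,N), A2(NITER,N), B(NITER,N), C, DIFF
      C = 2.0
      DO J = 1, N
        DO I = 1, NITER
          B(I,J) = REAL(I + J)
        ENDDO
      ENDDO
! Rolled form: a three-trip loop at the center of a larger loop.
      DO J = 1, N
        DO I = 1, NITER
          A(I,J) = B(I,J) * C
        ENDDO
      ENDDO
! Inner loop unrolled completely; no preconditioning loop needed
! because NITER is a compile-time constant.
      DO J = 1, N
        A2(1,J) = B(1,J) * C
        A2(2,J) = B(2,J) * C
        A2(3,J) = B(3,J) * C
      ENDDO
! Both versions should produce the same array.
      DIFF = 0.0
      DO J = 1, N
        DO I = 1, NITER
          DIFF = MAX(DIFF, ABS(A(I,J) - A2(I,J)))
        ENDDO
      ENDDO
      PRINT *, 'largest difference:', DIFF
      END
```

Whether the unrolled form is actually faster depends on the compiler and the machine; many compilers perform this transformation on their own when the trip count is a small constant.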

Fat loops

Loop unrolling helps performance because it fattens up a loop with more calculations per iteration. By the same token, if a particular loop is already fat, unrolling isn’t going to help. The loop overhead is already spread over a fair number of instructions. In fact, unrolling a fat loop may even slow your program down because it increases the size of the text segment, placing an added burden on the memory system (we’ll explain this in greater detail shortly). A good rule of thumb is to look elsewhere for performance when the loop innards exceed three or four statements.

Loops containing procedure calls

As with fat loops, loops containing subroutine or function calls generally aren’t good candidates for unrolling. There are several reasons. First, they often contain a fair number of instructions already. And if the subroutine being called is fat, it makes the loop that calls it fat as well. The size of the loop may not be apparent when you look at the loop; the function call can conceal many more instructions.

Second, when the calling routine and the subroutine are compiled separately, it’s impossible for the compiler to intermix instructions. A loop that is unrolled into a series of function calls behaves much like the original loop, before unrolling.

Last, function calls are expensive. Registers have to be saved; argument lists have to be prepared. The time spent calling and returning from a subroutine can be much greater than the loop overhead. Unrolling to amortize the cost of the loop structure over several calls doesn’t buy you enough to be worth the effort.

The general rule when dealing with procedures is to first try to eliminate them in the “remove clutter” phase, and when this has been done, check to see if unrolling gives an additional performance improvement.
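The sketch below illustrates that order of operations under some assumptions: a hypothetical subroutine SCALE1 that does one multiply, a made-up trip count of N = 10, and an unroll depth of 4. The call is first replaced by hand with the statement it performs, and only then is the loop unrolled.

```fortran
      PROGRAM INLINE
! Hypothetical example: SCALE1, N, and the data are made up.
      INTEGER N, I, II
      PARAMETER (N = 10)
      REAL A(N), A2(N), B(N), C, DIFF
      C = 2.5
      DO I = 1, N
        B(I) = REAL(I)
      ENDDO
! Original form: one subroutine call per iteration.
      DO I = 1, N
        CALL SCALE1 (A(I), B(I), C)
      ENDDO
! Call removed by hand, then the loop unrolled to a depth of 4.
      II = MOD(N,4)
      DO I = 1, II
        A2(I) = B(I) * C
      ENDDO
      DO I = II+1, N, 4
        A2(I)   = B(I)   * C
        A2(I+1) = B(I+1) * C
        A2(I+2) = B(I+2) * C
        A2(I+3) = B(I+3) * C
      ENDDO
! Both versions should produce the same array.
      DIFF = 0.0
      DO I = 1, N
        DIFF = MAX(DIFF, ABS(A(I) - A2(I)))
      ENDDO
      PRINT *, 'largest difference:', DIFF
      END

      SUBROUTINE SCALE1 (X, Y, C)
! Stand-in for a small routine called from inside a loop.
      REAL X, Y, C
      X = Y * C
      END
```

Once the call is gone, the compiler can see all four multiplies in the unrolled body and is free to schedule them together, which it could not do across four separately compiled calls.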

Loops with branches in them

In [link] we showed you how to eliminate certain types of branches, but of course, we couldn’t get rid of them all. In cases of iteration-independent branches, there might be some benefit to loop unrolling. The IF test becomes part of the operations that must be counted to determine the value of loop unrolling. Below is a doubly nested loop. The inner loop tests the value of B(J,I):


      DO I=1,N
        DO J=1,N
          IF (B(J,I) .GT. 1.0) A(J,I) = A(J,I) + B(J,I) * C
        ENDDO
      ENDDO

Each iteration is independent of every other, so unrolling it won’t be a problem. We’ll unroll the inner loop to a depth of four and leave the outer loop undisturbed:


      II = IMOD (N,4)
      DO I=1,N
        DO J=1,II
          IF (B(J,I) .GT. 1.0)
     +      A(J,I) = A(J,I) + B(J,I) * C
        ENDDO
        DO J=II+1,N,4
          IF (B(J,I) .GT. 1.0)
     +      A(J,I) = A(J,I) + B(J,I) * C
          IF (B(J+1,I) .GT. 1.0)
     +      A(J+1,I) = A(J+1,I) + B(J+1,I) * C
          IF (B(J+2,I) .GT. 1.0)
     +      A(J+2,I) = A(J+2,I) + B(J+2,I) * C
          IF (B(J+3,I) .GT. 1.0)
     +      A(J+3,I) = A(J+3,I) + B(J+3,I) * C
        ENDDO
      ENDDO

This approach works particularly well if the processor you are using supports conditional execution. As described earlier, conditional execution can replace a branch and an operation with a single conditionally executed assignment. On a superscalar processor with conditional execution, this unrolled loop executes quite nicely.
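One way to picture a conditionally executed assignment at the source level is as a select between two values rather than a jump around a statement. The sketch below is only an illustration: it uses the Fortran 90 MERGE intrinsic, which the original listings do not use, and the array size and data values are made up. Whether a given compiler turns either form into conditional instructions is an assumption you would have to check on your own system.

```fortran
      PROGRAM CONDEX
! Hypothetical example: N and the data values are made up.
! MERGE requires a Fortran 90 (or later) compiler.
      INTEGER N, I, J
      PARAMETER (N = 8)
      REAL A1(N,N), A2(N,N), B(N,N), C, DIFF
      C = 3.0
      DO I = 1, N
        DO J = 1, N
          B(J,I) = 0.25 * REAL(J)
          A1(J,I) = REAL(I)
          A2(J,I) = REAL(I)
        ENDDO
      ENDDO
! Branching form, as in the doubly nested loop above.
      DO I = 1, N
        DO J = 1, N
          IF (B(J,I) .GT. 1.0) A1(J,I) = A1(J,I) + B(J,I) * C
        ENDDO
      ENDDO
! Select form: the mask picks the updated or the old value.
      DO I = 1, N
        DO J = 1, N
          A2(J,I) = MERGE(A2(J,I)+B(J,I)*C, A2(J,I), B(J,I) .GT. 1.0)
        ENDDO
      ENDDO
! Both versions should produce the same array.
      DIFF = 0.0
      DO I = 1, N
        DO J = 1, N
          DIFF = MAX(DIFF, ABS(A1(J,I) - A2(J,I)))
        ENDDO
      ENDDO
      PRINT *, 'largest difference:', DIFF
      END
```

Note that the select form stores to every element, including those that fail the test, so it trades the branch for extra memory writes; it is not automatically the better choice.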

Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5