
Memory

Let’s say that you are fast asleep some night and begin dreaming. In your dream, you have a time machine and a few 500-MHz four-way superscalar processors. You turn the time machine back to 1981. Once you arrive back in time, you go out and purchase an IBM PC with an Intel 8088 microprocessor running at 4.77 MHz. For much of the rest of the night, you toss and turn as you try to adapt the 500-MHz processor to the Intel 8088 socket using a soldering iron and Swiss Army knife. Just before you wake up, the new computer finally works, and you turn it on to run the Linpack benchmark (see Chapter 15, Using Published Benchmarks, for details on the Linpack benchmark) and issue a press release. Would you expect this to turn out to be a dream or a nightmare? Chances are good that it would turn out to be a nightmare, just like the previous night where you went back to the Middle Ages and put a jet engine on a horse. (You have got to stop eating double pepperoni pizzas so late at night.)

Even if you could make the computational side of a processor infinitely fast, you would still have to load and store data and instructions to and from memory. Today’s processors continue to creep ever closer to infinitely fast processing. Memory performance is increasing at a much slower rate (it will take longer for memory to become infinitely fast). Many of the interesting problems in high performance computing use a large amount of memory, and as computers get faster, the problems they are asked to solve get bigger too. The trouble is that solving these problems at high speed requires a memory system that is large and at the same time fast, which is a big challenge. Possible approaches include the following:

  • Every memory system component can be made individually fast enough to respond to every memory access request.
  • Slow memory can be accessed in a round-robin fashion (hopefully) to give the effect of a faster memory system.
  • The memory system design can be made “wide” so that each transfer contains many bytes of information.
  • The system can be divided into faster and slower portions and arranged so that the fast portion is used more often than the slow one.
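The "(hopefully)" in the round-robin approach is worth a closer look. The sketch below is a minimal model, assuming a hypothetical memory split into four banks with consecutive words assigned to consecutive banks (low-order interleaving). The function names and the bank count are illustrative assumptions, not taken from any particular machine.

```python
def bank_for_address(word_address, num_banks):
    """Return the bank holding a word under low-order interleaving (assumed scheme)."""
    return word_address % num_banks


def banks_touched(start, stride, count, num_banks):
    """List the bank hit by each of `count` accesses spaced `stride` words apart."""
    return [bank_for_address(start + i * stride, num_banks) for i in range(count)]


if __name__ == "__main__":
    NUM_BANKS = 4  # hypothetical bank count, chosen only for illustration

    # Unit stride: accesses rotate through the banks, so each bank gets
    # time to recover while the others are busy.
    print(banks_touched(0, 1, 8, NUM_BANKS))

    # Stride equal to the bank count: every access lands on the same bank,
    # so the round-robin effect is lost.
    print(banks_touched(0, 4, 8, NUM_BANKS))
```

In this model a unit-stride sweep visits banks 0, 1, 2, 3 in rotation, while a stride of four hits bank 0 every time. The round-robin scheme only looks like a faster memory when the access pattern cooperates.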

Again, economics are the dominant force in the computer business. A cheap, statistically optimized memory system will be a better seller than a prohibitively expensive, blazingly fast one, so the first choice is not much of a choice at all. But these choices, used in combination, can attain a good fraction of the performance you would get if every component were fast. Chances are very good that your high performance workstation incorporates several or all of them.
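To see why a fast portion that is used most of the time can recover much of the performance of an all-fast memory, consider a rough weighted-average model. It is a simplification that assumes every access costs either a fixed fast time or a fixed slow time. The function name and the nanosecond figures are hypothetical, not measurements.

```python
def average_access_time(hit_rate, fast_ns, slow_ns):
    """Weighted average access time for a two-level memory (simplified model).

    hit_rate is the fraction of accesses served by the fast portion;
    the remainder are assumed to cost slow_ns each.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_ns + (1.0 - hit_rate) * slow_ns


if __name__ == "__main__":
    # Hypothetical timings: 10 ns for the fast portion, 100 ns for the slow one.
    for rate in (0.50, 0.90, 0.99):
        print(rate, average_access_time(rate, 10.0, 100.0))
```

Under these assumed timings, the average moves most of the way toward the fast figure only when the great majority of accesses are served by the fast portion, which is why the arrangement has to be statistically optimized rather than merely present.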

Once the memory system has been decided upon, there are things we can do in software to see that it is used efficiently. A compiler that has some knowledge of the way memory is arranged and the details of the caches can optimize their use to some extent. The other place for optimizations is in user applications, as we’ll see later in the book. A good pattern of memory access will work with, rather than against, the components of the system.
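As a rough illustration of an access pattern working with or against the memory system, the sketch below counts misses in a toy direct-mapped cache while walking a two-dimensional array stored row by row. The cache geometry (16 lines of 8 words), the 64 x 64 array, and all function names are assumptions made for the example; real caches are larger and more elaborate.

```python
def count_misses(addresses, num_lines, words_per_line):
    """Count misses in a toy direct-mapped cache for a sequence of word addresses."""
    tags = [None] * num_lines  # memory block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // words_per_line  # memory block containing this word
        line = block % num_lines        # the one cache line that block may occupy
        if tags[line] != block:
            tags[line] = block          # load the block, evicting the old one
            misses += 1
    return misses


def row_order(rows, cols):
    """Word addresses for a row-by-row walk of a row-major array."""
    return [r * cols + c for r in range(rows) for c in range(cols)]


def column_order(rows, cols):
    """Word addresses for a column-by-column walk of the same array."""
    return [r * cols + c for c in range(cols) for r in range(rows)]


if __name__ == "__main__":
    ROWS, COLS = 64, 64            # hypothetical array shape
    LINES, WORDS_PER_LINE = 16, 8  # hypothetical cache geometry
    print("row order misses:   ", count_misses(row_order(ROWS, COLS), LINES, WORDS_PER_LINE))
    print("column order misses:", count_misses(column_order(ROWS, COLS), LINES, WORDS_PER_LINE))
```

With this geometry the row-order walk misses once per cache line (512 misses out of 4096 accesses), while the column-order walk misses on every access. The same arithmetic on the same data costs very different amounts of memory traffic depending on the order in which it touches memory.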

In this chapter we discuss how the pieces of a memory system work. We look at how patterns of data and instruction access factor into your overall runtime, especially as CPU speeds increase. We also talk a bit about the performance implications of running in a virtual memory environment.

Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5