For most of the work you do in this book, you will use a histogram to display the data. One advantage of a histogram is that it can readily display large data sets. A rule of thumb is to use a histogram when the data set consists of 100 values or more.
A histogram consists of contiguous boxes. It has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents (for instance, distance from your home to school). The vertical axis is labeled either frequency or relative frequency. The graph will have the same shape with either label, because each relative frequency is just the corresponding frequency divided by the same total. The histogram (like the stemplot) can give you the shape of the data, the center, and the spread of the data. (The next section tells you how to calculate the center and the spread.)
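The relative frequency is equal to the frequency for an observed value of the data divided by the total number of data values in the sample. (In the chapter on Sampling and Data, we defined frequency as the number of times an answer occurs.) If:
$f$ = frequency,
$n$ = total number of data values (or the sum of the individual frequencies), and
$\text{RF}$ = relative frequency,
then:
$\text{RF}=\frac{f}{n}$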
For example, if 3 students in Mr. Ahab's English class of 40 students received scores from 90% to 100%, then
$f=3$, $n=40$, and $\text{RF}=\frac{f}{n}=\frac{3}{40}=0.075$
Seven and a half percent of the students received 90% to 100%. Ninety percent to 100% are quantitative measures.
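The relative frequency calculation above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `relative_frequency` is an assumption made for this example and does not come from the text.

```python
def relative_frequency(f, n):
    """Return RF = f / n, where f is a frequency and n is the total number of data values."""
    if n <= 0:
        raise ValueError("n must be a positive number of data values")
    return f / n


# Mr. Ahab's class: 3 of 40 students scored from 90% to 100%.
rf = relative_frequency(3, 40)
print(rf)           # 0.075
print(f"{rf:.1%}")  # 7.5%
```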
To construct a histogram, first decide how many bars or intervals, also called classes, represent the data. Many histograms consist of from 5 to 15 bars or classes for clarity. Choose a starting point for the first interval to be less than the smallest data value. A convenient starting point is a lower value carried out to one more decimal place than the value with the most decimal places. For example, if the value with the most decimal places is 6.1 and this is the smallest value, a convenient starting point is 6.05 (6.1 - 0.05 = 6.05). We say that 6.05 has more precision. If the value with the most decimal places is 2.23 and the lowest value is 1.5, a convenient starting point is 1.495 (1.5 - 0.005 = 1.495). If the value with the most decimal places is 3.234 and the lowest value is 1.0, a convenient starting point is 0.9995 (1.0 - 0.0005 = 0.9995). If all the data happen to be integers and the smallest value is 2, then a convenient starting point is 1.5 (2 - 0.5 = 1.5). Also, when the starting point and other boundaries are carried to one additional decimal place, no data value will fall on a boundary.
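The starting-point rule described above can be sketched in Python as follows. The function name `starting_point` is hypothetical, and the sketch assumes the data are supplied as strings (such as "6.1") so that the number of recorded decimal places is preserved. It uses the standard `decimal` module to avoid floating-point rounding.

```python
from decimal import Decimal


def starting_point(values):
    """Return a convenient starting point for the first histogram interval.

    values: iterable of strings such as "6.1" (strings keep the recorded
    number of decimal places, which the rule depends on).
    """
    decimals = [Decimal(v) for v in values]
    # Decimal places of the most precise value (0 if all values are integers).
    places = max(max(0, -d.as_tuple().exponent) for d in decimals)
    # Half of one unit in that last decimal place, e.g. 0.05 when places == 1.
    half_unit = Decimal(5).scaleb(-(places + 1))
    return min(decimals) - half_unit


# The four cases worked in the text:
print(starting_point(["6.1", "7.3"]))    # 6.05
print(starting_point(["2.23", "1.5"]))   # 1.495
print(starting_point(["3.234", "1.0"]))  # 0.9995
print(starting_point(["2", "5", "7"]))   # 1.5
```

Because the result carries one more decimal place than any data value, no data value can equal it, which is the reason the text gives for choosing boundaries this way.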
The following data are the heights (in inches to the nearest half inch) of 100 male semiprofessional soccer players. The heights are continuous data since height is measured.