
Extracting a sample population

Suppose you are trying to find out what percentage of South Africa's population owns a car. One way of doing this might be to send questionnaires to people's homes, asking them whether they own a car. However, you quickly run into a problem: you cannot hope to send every person in the country a questionnaire, because it would be far too expensive. Also, not everyone would reply. The best you can do is send it to a few people, see what percentage of these own a car, and then use this to estimate what percentage of the entire country owns a car. This smaller group of people is called the sample population.

The sample population must be carefully chosen, in order to avoid biased results. How do we do this?

First, it must be representative. If all of our sample population comes from a very rich area, then almost all will have cars. But we obviously cannot conclude from this that almost everyone in the country has a car! We need to send the questionnaire to rich as well as poor people.

Secondly, the size of the sample population must be large enough. It is no good having a sample population consisting of only two people, for example. Both may very well not have cars. But we obviously cannot conclude that no one in the country has a car! The larger the sample population size, the more likely it is that the statistics of our sample population correspond to the statistics of the entire population.
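The effect of sample size can be illustrated with a small simulation. This is only a sketch: the population of 100 000 people, the 30% ownership rate and the function name `estimate_car_ownership` are all made up for illustration and are not real figures for South Africa.

```python
import random

def estimate_car_ownership(population, sample_size, rng):
    """Return the fraction of car owners in a simple random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical population: 100 000 people, 30% of whom own a car (1 = owns a car).
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [1] * 30_000 + [0] * 70_000
true_fraction = sum(population) / len(population)

print(f"true fraction = {true_fraction:.3f}")
for size in (2, 100, 10_000):
    estimate = estimate_car_ownership(population, size, rng)
    print(f"sample size {size:>6}: estimated fraction = {estimate:.3f}")
```

With only two people the estimate can only be 0, 0,5 or 1, so it cannot come close to the true value. Larger samples tend to give estimates much nearer to it.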

So how does one ensure that one's sample is representative? There are a variety of methods available, which we will look at now.

  1. Random Sampling. Every person in the country has an equal chance of being selected. It is unbiased and also independent, which means that the selection of one person has no effect on the selection of another. One way of doing this would be to give each person in the country a number, and then ask a computer to give us a list of random numbers. We could then send the questionnaire to the people corresponding to the random numbers.
  2. Systematic Sampling. Again give every person in the country a number, and then, for example, select every hundredth person on the list. So the person with number 100 would be selected, the person with number 200 would be selected, the person with number 300 would be selected, etc.
  3. Stratified Sampling. We consider different subgroups of the population, and take random samples from these. For example, we can divide the population into male and female, different ages, or into different income ranges.
  4. Cluster Sampling. Here the sample is concentrated in one area. For example, we consider all the people living in one urban area.
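The four methods above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed procedure: the numbered population, its `area` and `income` labels and all function names are hypothetical. The systematic version uses a random starting point, which is a common variant, and the cluster version picks its single area at random.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical numbered population: person i lives in one of 10 areas
# and belongs to one of two income groups (both labels are made up).
population = [
    {"id": i, "area": i % 10, "income": "high" if i % 4 == 0 else "low"}
    for i in range(1, 1001)
]

def random_sample(people, k):
    """Random sampling: every person is equally likely to be chosen."""
    return rng.sample(people, k)

def systematic_sample(people, step):
    """Systematic sampling: random starting point, then every step-th person."""
    start = rng.randrange(step)
    return people[start::step]

def stratified_sample(people, key, k_per_group):
    """Stratified sampling: a random sample from each subgroup."""
    groups = {}
    for person in people:
        groups.setdefault(person[key], []).append(person)
    chosen = []
    for members in groups.values():
        chosen.extend(rng.sample(members, min(k_per_group, len(members))))
    return chosen

def cluster_sample(people, key):
    """Cluster sampling: pick one cluster at random and take everyone in it."""
    clusters = sorted({person[key] for person in people})
    picked = rng.choice(clusters)
    return [person for person in people if person[key] == picked]

print(len(random_sample(population, 50)))
print([p["id"] for p in systematic_sample(population, 100)])
print(len(stratified_sample(population, "income", 20)))
print(len(cluster_sample(population, "area")))
```

Note how the cluster sample contains people from one area only, which is exactly why it can be biased if that area is not typical of the whole population.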

Sampling

  1. Discuss the advantages, disadvantages and possible bias when using
    1. systematic sampling
    2. random sampling
    3. cluster sampling
  2. Suggest a suitable sampling method that could be used to obtain information on:
    1. passengers' views on availability of a local taxi service.
    2. views of learners on school meals.
    3. defects in an item made in a factory.
    4. medical costs of employees in a large company.
  3. 2% of a certain magazine's subscribers are randomly selected. The random number 16, out of 50, is selected. Then subscribers with numbers 16, 66, 116, 166, ... are chosen as a sample. What kind of sampling is this?

Function fitting and regression analysis

In Grade 11 we recorded two sets of data (bivariate data) on a scatter plot and then drew a line of best fit as close to as many of the data points as possible. Regression analysis is a method of finding out exactly which function best fits a given set of data. We can find the equation of the regression line by drawing and estimating, or by using an algebraic method called “the least squares method”, available on most scientific calculators. The linear regression equation is written $\hat{y} = a + bx$ (we say “y-hat”) or $y = A + Bx$. Of course, these are both variations of the more familiar equation $y = mx + c$.
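For reference, the standard least squares formulas for the coefficients are given here. The notation is an assumption for this sketch: the data points are written $(x_i; y_i)$ for $i = 1, \ldots, n$, and $\bar{x}$ and $\bar{y}$ denote the means of the $x$ and $y$ values.

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\,\bar{x}$$

Comparing with $y = mx + c$, the coefficient $b$ plays the role of the gradient $m$ and $a$ plays the role of the y-intercept $c$.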

Suppose you are doing an experiment with washing dishes. You count how many dishes you begin with, and then find out how long it takes to finish washing them. So you plot the data on a graph of time taken versus number of dishes. This is plotted below.

If t is the time taken, and d the number of dishes, then it looks as though t is proportional to d, i.e. t = m · d, where m is the constant of proportionality. There are two questions that interest us now.

  1. How do we find m? One way, which you have already learnt, is to draw a line of best fit through the data points and then measure the gradient of the line. But this is not terribly precise. Is there a better way of doing it?
  2. How well does our line of best fit really fit our data? If the points on our plot don't all lie close to the line of best fit, but are scattered everywhere, then the fit is not 'good', and our assumption that t = m · d might be incorrect. Can we find a quantitative measure of how well our line really fits the data?

In this chapter, we answer both of these questions, using the techniques of regression analysis.
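As a rough preview of both questions, the sketch below fits t = m · d by least squares and reports the sum of squared vertical distances between the data and the line. Everything here is an assumption for illustration: the measurements are invented, the function names are hypothetical, and the sum of squared residuals is only one possible measure of fit, not necessarily the one developed later in the chapter.

```python
# Hypothetical measurements (not from the text): number of dishes and
# the time taken, in minutes, to wash them.
dishes = [5, 10, 15, 20, 25]
minutes = [4.0, 8.5, 11.5, 16.5, 20.0]

def fit_through_origin(d_values, t_values):
    """Least-squares estimate of m in t = m * d (line forced through the origin)."""
    numerator = sum(d * t for d, t in zip(d_values, t_values))
    denominator = sum(d * d for d in d_values)
    return numerator / denominator

def sum_squared_residuals(d_values, t_values, m):
    """Total squared vertical distance between the data and the line t = m * d."""
    return sum((t - m * d) ** 2 for d, t in zip(d_values, t_values))

m = fit_through_origin(dishes, minutes)
print(f"m = {m:.3f} minutes per dish")
print(f"sum of squared residuals = {sum_squared_residuals(dishes, minutes, m):.3f}")
```

A small sum of squared residuals means the points lie close to the line. Any other value of m gives a larger sum for the same data.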


Use the data given to draw a scatter plot and line of best fit. Now write down the equation of the line that best seems to fit the data.

| x | 1,0 | 2,4 | 3,1 | 4,9 | 5,6 | 6,2 |
| y | 2,5 | 2,8 | 3,0 | 4,8 | 5,1 | 5,3 |

  1. The first step is to draw the graph. This is shown below.

  2. The equation of the line is

    $$y = mx + c$$

    From the graph we have drawn, we estimate the y-intercept to be 1,5. We estimate that $y = 3{,}5$ when $x = 3$. So the points $(3; 3{,}5)$ and $(0; 1{,}5)$ lie on the line. The gradient of the line, $m$, is given by

    $$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{3{,}5 - 1{,}5}{3 - 0} = \frac{2}{3}$$

    So we finally have that the equation of the line of best fit is

    $$y = \frac{2}{3}x + 1{,}5$$
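The estimate made by eye can be checked against the least squares method. The sketch below applies the standard least squares formulas to the data from this example. The function name `least_squares_line` is hypothetical, and decimal commas are written as decimal points because Python requires them.

```python
# Data from the worked example (decimal commas written as decimal points).
x = [1.0, 2.4, 3.1, 4.9, 5.6, 6.2]
y = [2.5, 2.8, 3.0, 4.8, 5.1, 5.3]

def least_squares_line(xs, ys):
    """Return (intercept a, gradient b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_mean) ** 2 for xi in xs)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    return a, b

a, b = least_squares_line(x, y)
print(f"least squares: y = {a:.2f} + {b:.2f}x")
print(f"by eye:        y = {1.5:.2f} + {2/3:.2f}x")
```

The least squares gradient is about 0,62 and the intercept about 1,54, so the line drawn by eye (gradient 2/3, intercept 1,5) is a reasonable estimate.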





Source:  OpenStax, Siyavula textbooks: grade 12 maths. OpenStax CNX. Aug 03, 2011 Download for free at http://cnx.org/content/col11242/1.2
