  • The sampling process is the iid class $\{X_i : 1 \le i \le n\}$.
  • A random sample is an observation, or realization, $(t_1, t_2, \ldots, t_n)$ of the sampling process.

The sample average and the population mean

Consider the numerical average of the values in the sample, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} t_i$. This is an observation of the sample average

$$A_n = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{1}{n} S_n$$

The sample sum $S_n$ and the sample average $A_n$ are random variables. If another observation were made (another sample taken), the observed values of these quantities would probably be different. Now $S_n$ and $A_n$ are functions of the random variables $\{X_i : 1 \le i \le n\}$ in the sampling process. As such, they have distributions related to the population distribution (the common distribution of the $X_i$). According to the central limit theorem, for any reasonably sized sample they should be approximately normally distributed. As the examples demonstrating the central limit theorem show, the sample size need not be large in many cases. Now if the population mean $E[X]$ is $\mu$ and the population variance $\operatorname{Var}[X]$ is $\sigma^2$, then

$$E[S_n] = \sum_{i=1}^{n} E[X_i] = n E[X] = n\mu \quad \text{and} \quad \operatorname{Var}[S_n] = \sum_{i=1}^{n} \operatorname{Var}[X_i] = n \operatorname{Var}[X] = n\sigma^2$$

so that

$$E[A_n] = \frac{1}{n} E[S_n] = \mu \quad \text{and} \quad \operatorname{Var}[A_n] = \frac{1}{n^2} \operatorname{Var}[S_n] = \sigma^2 / n$$

Herein lies the key to the usefulness of a large sample. The mean of the sample average $A_n$ is the same as the population mean, but the variance of the sample average is $1/n$ times the population variance. Thus, for a large enough sample, the probability is high that the observed value of the sample average will be close to the population mean. The standard deviation, as a measure of the variation, is reduced by a factor $1/\sqrt{n}$ relative to the population standard deviation.
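The reduction in variance can be checked by simulation. The following is a minimal sketch, not part of the original text: it assumes, purely for illustration, a population that is uniform on (0, 1), so that the mean is 1/2 and the variance is 1/12. The variable names and the choices n = 100 and m = 10000 are assumptions. It uses only base MATLAB functions (rand, mean, var).

```matlab
% Sketch: compare the variance of the sample average with sigma^2/n.
% Assumed population (illustration only): uniform on (0,1),
% for which mu = 1/2 and sigma^2 = 1/12.
n = 100;             % sample size
m = 10000;           % number of independent samples drawn
mu = 1/2;
sigma2 = 1/12;

X = rand(n, m);      % each column is one random sample of size n
A = mean(X);         % column means: m observed values of the sample average

fprintf('mean of the averages:     %.4f  (mu        = %.4f)\n', mean(A), mu);
fprintf('variance of the averages: %.6f  (sigma^2/n = %.6f)\n', var(A), sigma2/n);
```

Since the samples are random, the printed values change from run to run, but they should stay close to the theoretical values shown beside them.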

Sample size

Suppose a population has mean $\mu$ and variance $\sigma^2$. A sample of size $n$ is to be taken. There are two complementary questions:

  1. If $n$ is given, what is the probability the sample average lies within distance $a$ from the population mean?
  2. What value of $n$ is required to ensure a probability of at least $p$ that the sample average lies within distance $a$ from the population mean?

SOLUTION

Suppose the population variance $\sigma^2$ is known or can be approximated reasonably. If the sample size $n$ is reasonably large, depending on the population distribution (as seen in the previous demonstrations), then $A_n$ is approximately $N(\mu, \sigma^2/n)$.

  1. Sample size given, probability to be determined.
    $$p = P(|A_n - \mu| \le a) = P\left( \left| \frac{A_n - \mu}{\sigma/\sqrt{n}} \right| \le \frac{a\sqrt{n}}{\sigma} \right) = 2\Phi(a\sqrt{n}/\sigma) - 1$$
  2. Sample size to be determined, probability specified.
    $$2\Phi(a\sqrt{n}/\sigma) - 1 \ge p \quad \text{iff} \quad \Phi(a\sqrt{n}/\sigma) \ge \frac{p + 1}{2}$$
    Find from a table or by use of the inverse normal function the value of $x = a\sqrt{n}/\sigma$ required to make $\Phi(x)$ at least $(p + 1)/2$. Then
    $$n \ge \sigma^2 (x/a)^2 = \left( \frac{\sigma}{a} \right)^2 x^2$$
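The first question can be evaluated numerically. The sketch below is an illustration rather than the author's procedure: it builds the standard normal distribution function from erf, which is available in base MATLAB, and the helper name Phi is an assumption. The values σ = 2, a = 0.2 and n = 400 are the ones used in the numerical case of this example.

```matlab
% Sketch: probability that the sample average lies within a of mu,
% using the normal approximation A_n ~ N(mu, sigma^2/n).
% Phi is a hypothetical helper: the standard normal distribution
% function expressed through the built-in error function erf.
Phi = @(x) 0.5 * (1 + erf(x / sqrt(2)));

sigma = 2;      % population standard deviation
a = 0.2;        % allowed distance from the population mean
n = 400;        % sample size

p = 2 * Phi(a * sqrt(n) / sigma) - 1;
fprintf('P(|A_n - mu| <= %g) is approximately %.4f\n', a, p);
% prints: P(|A_n - mu| <= 0.2) is approximately 0.9545
```

Here $a\sqrt{n}/\sigma = 2$, so the result is the familiar two-standard-deviation probability.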

We may use the MATLAB function norminv to calculate values of x for various p .

    p = [0.8 0.9 0.95 0.98 0.99];
    x = norminv(0,1,(1+p)/2);
    disp([p;x;x.^2]')
        0.8000    1.2816    1.6424
        0.9000    1.6449    2.7055
        0.9500    1.9600    3.8415
        0.9800    2.3263    5.4119
        0.9900    2.5758    6.6349

For $p = 0.95$, $\sigma = 2$, $a = 0.2$, we need $n \ge (2/0.2)^2 \cdot 3.8415 = 384.15$. Use at least 385, or perhaps 400 because of uncertainty about the actual $\sigma^2$.
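The required sample size can also be computed directly for each value of p. This is a sketch under stated assumptions, not the author's code: it obtains the inverse of the standard normal distribution function from the base MATLAB function erfinv instead of norminv, and the helper name PhiInv is an assumption. The values of σ and a are those of the case above.

```matlab
% Sketch: smallest integer n with 2*Phi(a*sqrt(n)/sigma) - 1 >= p.
% PhiInv is a hypothetical helper: the inverse of the standard normal
% distribution function, expressed through the built-in erfinv.
PhiInv = @(q) sqrt(2) * erfinv(2*q - 1);

sigma = 2;                        % population standard deviation
a = 0.2;                          % allowed distance from the population mean
p = [0.8 0.9 0.95 0.98 0.99];     % required probabilities

x = PhiInv((1 + p) / 2);          % value of a*sqrt(n)/sigma needed for each p
n = ceil((sigma / a)^2 * x.^2);   % round up to the next whole sample size

fprintf('p = %.2f  ->  n >= %d\n', [p; n]);
```

For p = 0.95 this gives n = 385, in agreement with the hand calculation.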


The idea of a statistic

As a function of the random variables in the sampling process, the sample average is an example of a statistic.

Definition. A statistic is a function of the class $\{X_i : 1 \le i \le n\}$ which uses explicitly no unknown parameters of the population.
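As a brief illustration of the definition, using only quantities introduced above: the sample average and the average squared deviation about the sample average can be computed from the sample alone, so they are statistics. The average squared deviation about the population mean is not a statistic when $\mu$ is unknown, since it uses that parameter explicitly.

```latex
\[
\underbrace{A_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\frac{1}{n}\sum_{i=1}^{n} (X_i - A_n)^2}_{\text{statistics}}
\qquad\qquad
\underbrace{\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2}_{\text{not a statistic if } \mu \text{ is unknown}}
\]
```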

Source:  OpenStax, Applied probability. OpenStax CNX. Aug 31, 2009 Download for free at http://cnx.org/content/col10708/1.6