# 2.6 Skewness and the mean, median, and mode


Consider the following data set.
4; 5; 6; 6; 6; 7; 7; 7; 7; 7; 7; 8; 8; 8; 9; 10

This data set can be represented by the following histogram. Each interval has width one, and each value is located in the middle of an interval.

The histogram displays a symmetrical distribution of data. A distribution is symmetrical if a vertical line can be drawn at some point in the histogram such that the shape to the left and the right of the vertical line are mirror images of each other. The mean, the median, and the mode are each seven for these data. In a perfectly symmetrical distribution, the mean and the median are the same. This example has one mode (unimodal), and the mode is the same as the mean and median. In a symmetrical distribution that has two modes (bimodal), the two modes would be different from the mean and median.
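The claims above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the helper `is_symmetric_about` is a hypothetical name introduced here for illustration and is not part of the text.

```python
from collections import Counter
from statistics import mean, median, multimode

# The data set from the start of this section.
data = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]


def is_symmetric_about(values, center):
    """Return True if each value center - d occurs as often as center + d.

    Hypothetical helper: it mirrors the "vertical line" test described
    in the text, applied to the value counts instead of a drawn histogram.
    """
    counts = Counter(values)
    return all(counts[v] == counts[2 * center - v] for v in counts)


data_mean = mean(data)
data_median = median(data)
data_modes = multimode(data)  # list of all most-frequent values
symmetric = is_symmetric_about(data, 7)

print("mean:", data_mean)
print("median:", data_median)
print("mode(s):", data_modes)
print("symmetric about 7:", symmetric)
```

`multimode` is used rather than `mode` so that a bimodal data set would report both modes.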

The histogram for the data: 4; 5; 6; 6; 6; 7; 7; 7; 7; 8 is not symmetrical. The right-hand side seems "chopped off" compared to the left side. A distribution of this type is called skewed to the left because it is pulled out to the left.

The mean is 6.3, the median is 6.5, and the mode is seven. Notice that the mean is less than the median, and they are both less than the mode. The mean and the median both reflect the skewing, but the mean reflects it more so.

The histogram for the data: 6; 7; 7; 7; 7; 8; 8; 8; 9; 10 is also not symmetrical. It is skewed to the right.

The mean is 7.7, the median is 7.5, and the mode is seven. Of the three statistics, the mean is the largest, while the mode is the smallest. Again, the mean reflects the skewing the most.

To summarize, generally if the distribution of data is skewed to the left, the mean is less than the median, which is often less than the mode. If the distribution of data is skewed to the right, the mode is often less than the median, which is less than the mean.
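As a quick check of this summary, the sketch below recomputes the three measures for the two skewed data sets used in this section. It relies only on Python's standard `statistics` module; the function name `summarize` is an illustrative choice, not something from the text.

```python
from statistics import mean, median, mode

# The two skewed data sets from this section.
left_skewed = [4, 5, 6, 6, 6, 7, 7, 7, 7, 8]
right_skewed = [6, 7, 7, 7, 7, 8, 8, 8, 9, 10]


def summarize(values):
    """Collect the three measures of center for one data set."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }


left = summarize(left_skewed)    # mean 6.3, median 6.5, mode 7
right = summarize(right_skewed)  # mean 7.7, median 7.5, mode 7

print("skewed left: ", left)
print("skewed right:", right)
```

For these two examples the ordering matches the rule of thumb exactly; for other data sets it is a tendency, not a guarantee.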

Skewness and symmetry become important when we discuss probability distributions in later chapters.

Statistics are used to compare and sometimes identify authors. The following lists show a simple random sample that compares the letter counts for three authors.

Terry: 7; 9; 3; 3; 3; 4; 1; 3; 2; 2

Davis: 3; 3; 3; 4; 1; 4; 3; 2; 3; 1

Maris: 2; 3; 4; 4; 4; 6; 6; 6; 8; 3

1. Make a dot plot for the three authors and compare the shapes.
2. Calculate the mean for each.
3. Calculate the median for each.
4. Describe any pattern you notice between the shape and the measures of center.

Answers to parts 2 through 4:

2. Terry’s mean is 3.7, Davis’ mean is 2.7, and Maris’ mean is 4.6.
3. Terry’s median is three, Davis’ median is three, and Maris’ median is four.
4. It appears that the median is always closest to the high point (the mode), while the mean tends to be farther out on the tail. In a symmetrical distribution, the mean and the median are both centrally located close to the high point of the distribution.
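The means and medians for the three authors can be reproduced with a short script. This is a sketch that assumes the letter counts are entered exactly as listed above; the dictionary names `authors` and `summary` are illustrative.

```python
from statistics import mean, median, multimode

# Letter counts for each author, as given in the exercise.
authors = {
    "Terry": [7, 9, 3, 3, 3, 4, 1, 3, 2, 2],
    "Davis": [3, 3, 3, 4, 1, 4, 3, 2, 3, 1],
    "Maris": [2, 3, 4, 4, 4, 6, 6, 6, 8, 3],
}

# Map each author to (mean, median, list of modes).
summary = {
    name: (mean(counts), median(counts), multimode(counts))
    for name, counts in authors.items()
}

for name, (avg, med, modes) in summary.items():
    print(f"{name}: mean={avg}, median={med}, mode(s)={modes}")
```

Terry's sample has two large values (7 and 9) that pull the mean above the median, which is the pattern described in the last answer.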

## Try it

Discuss the mean, median, and mode for each of the following problems. Is there a pattern between the shape and the measures of center?

a.

b.

Ages at Which Former U.S. Presidents Died

| Stem | Leaf |
|------|------|
| 4 | 6 9 |
| 5 | 3 6 7 7 7 8 |
| 6 | 0 0 3 3 4 4 5 6 7 7 7 8 |
| 7 | 0 1 1 2 3 4 7 8 8 9 |
| 8 | 0 1 3 5 8 |
| 9 | 0 0 3 3 |

Key: 8|0 means 80.

c.

1. mean = 4.25, median = 3.5, mode = 1; mean > median > mode, which indicates skewness to the right. (Data are 0, 1, 2, 3, 4, 5, 6, 9, 10, 14, and the respective frequencies are 2, 4, 3, 1, 2, 2, 2, 2, 1, 1.)
2. mean = 70.1, median = 68, mode = 57, 67 (bimodal); the mean and median are close, but there is a little skewness to the right, which is influenced by the data being bimodal. (Data are 46, 49, 53, 56, 57, 57, 57, 58, 60, 60, 63, 63, 64, 64, 65, 66, 67, 67, 67, 68, 70, 71, 71, 72, 73, 74, 77, 78, 78, 79, 80, 81, 83, 85, 88, 90, 90, 93, 93.)
3. These are estimates: mean = 16.095, median = 17.495, mode = 22.495 (there may be no mode); mean < median < mode, which indicates skewness to the left. (Data are the midpoints of the intervals: 2.495, 7.495, 12.495, 17.495, 22.495, and the respective frequencies are 2, 3, 4, 7, 9.)
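When data arrive as values with frequencies, as in the first and third answers, one simple approach is to expand the table back into a flat list and then compute the measures as usual. The sketch below does this; the helper `expand` is a hypothetical name, and the values and frequencies are copied from the answers above.

```python
from statistics import mean, median, multimode


def expand(values, frequencies):
    """Repeat each value according to its frequency to rebuild the raw list."""
    return [v for v, f in zip(values, frequencies) for _ in range(f)]


# First answer: values and their frequencies.
data_a = expand([0, 1, 2, 3, 4, 5, 6, 9, 10, 14],
                [2, 4, 3, 1, 2, 2, 2, 2, 1, 1])

# Third answer: interval midpoints and their frequencies (estimates only).
data_c = expand([2.495, 7.495, 12.495, 17.495, 22.495],
                [2, 3, 4, 7, 9])

stats_a = (mean(data_a), median(data_a), multimode(data_a))
stats_c = (mean(data_c), median(data_c), multimode(data_c))

print("first answer: ", stats_a)
print("third answer: ", stats_c)
```

Because the third data set uses interval midpoints rather than the original observations, its results are approximations, as the answer itself notes.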

## Chapter review

Looking at the distribution of data can reveal a lot about the relationship between the mean, the median, and the mode. There are three types of distributions. A right (or positive) skewed distribution has a shape like [link]. A left (or negative) skewed distribution has a shape like [link]. A symmetrical distribution looks like [link].

Use the following information to answer the next three exercises: State whether the data are symmetrical, skewed to the left, or skewed to the right.

• 1
• 1
• 1
• 2
• 2
• 2
• 2
• 3
• 3
• 3
• 3
• 3
• 3
• 3
• 3
• 4
• 4
• 4
• 5
• 5

The data are symmetrical. The median is 3 and the mean is 2.85. They are close, and the mode lies close to the middle of the data, so the data are symmetrical.

• 16
• 17
• 19
• 22
• 22
• 22
• 22
• 22
• 23

• 87
• 87
• 87
• 87
• 87
• 88
• 89
• 89
• 90
• 91

The data are skewed right. The median is 87.5 and the mean is 88.2. Even though they are close, the mode lies to the left of the middle of the data, and there are many more instances of 87 than any other number, so the data are skewed right.

When the data are skewed left, what is the typical relationship between the mean and median?

When the data are symmetrical, what is the typical relationship between the mean and median?

When the data are symmetrical, the mean and median are close or the same.

What word describes a distribution that has two modes?

Describe the shape of this distribution.

The distribution is skewed right because it looks pulled out to the right.

Describe the relationship between the mode and the median of this distribution.

Describe the relationship between the mean and the median of this distribution.

The mean is 4.1 and is slightly greater than the median, which is four.

Describe the shape of this distribution.

Describe the relationship between the mode and the median of this distribution.

The mode and the median are the same. In this case, they are both five.

Are the mean and the median the exact same in this distribution? Why or why not?

Describe the shape of this distribution.

The distribution is skewed left because it looks pulled out to the left.

Describe the relationship between the mode and the median of this distribution.

Describe the relationship between the mean and the median of this distribution.

The mean and the median are both six.

The mean and median for the data are the same.

• 3
• 4
• 5
• 5
• 6
• 6
• 6
• 6
• 7
• 7
• 7
• 7
• 7
• 7
• 7

Are the data perfectly symmetrical? Why or why not?

Which is the greatest, the mean, the mode, or the median of the data set?

• 11
• 11
• 12
• 12
• 12
• 12
• 13
• 15
• 17
• 22
• 22
• 22

The mode is 12, the median is 12.5 (the average of the sixth and seventh of the twelve values, 12 and 13), and the mean is 15.1. The mean is the largest.

Which is the least, the mean, the mode, or the median of the data set?

• 56
• 56
• 56
• 58
• 59
• 60
• 62
• 64
• 64
• 65
• 67

Of the three measures, which tends to reflect skewing the most, the mean, the mode, or the median? Why?

The mean tends to reflect skewing the most because it is affected the most by outliers.
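To illustrate, the sketch below takes the symmetrical data set from the start of this section and replaces its largest value with 100. The replacement value is a hypothetical choice made purely for illustration; it does not appear in the text.

```python
from statistics import mean, median, mode

# Symmetrical data set from the start of this section.
original = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]

# Hypothetical change for illustration: replace the largest value (10) with 100.
with_outlier = original[:-1] + [100]

before = (mean(original), median(original), mode(original))
after = (mean(with_outlier), median(with_outlier), mode(with_outlier))

print("before (mean, median, mode):", before)
print("after  (mean, median, mode):", after)
```

Only the mean moves (from 7 to 12.625); the median and the mode both stay at seven, because neither depends on how far the extreme value lies from the rest of the data.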

In a perfectly symmetrical distribution, when would the mode be different from the mean and median?
