
Epidemiology Biostatistics Descriptive Statistics
Author:
Dr. Janet Forrester, Professor
Tufts University School of Medicine
USA
Access: 
1.1 Definitions of statistics, probability, and key terms
1.2 Data, sampling, and variation in data and sampling
1.3 Frequency, frequency tables, and levels of measurement
1.4 Experimental design and ethics
By the end of this chapter, the student should be able to:
You are probably asking yourself the question, "When and where will I use statistics?" If you read any newspaper, watch television, or use the Internet, you will see statistical information. There are statistics about crime, sports, education, politics, and real estate. Typically, when you read a newspaper article or watch a television news program, you are given sample information. With this information, you may make a decision about the correctness of a statement, claim, or "fact." Statistical methods can help you make the "best educated guess."
Since you will undoubtedly be given statistical information at some point in your life, you need to know some techniques for analyzing the information thoughtfully. Think about buying a house or managing a budget. Think about your chosen profession. The fields of economics, business, psychology, education, biology, law, computer science, police science, and early childhood development require at least one course in statistics.
Included in this chapter are the basic ideas and words of probability and statistics. You will soon understand that statistics and probability work together. You will also learn how data are gathered and how "good" data can be distinguished from "bad."
We will teach you how to read and critique medical journal articles using examples from some of the most widely read medical journals. To critique the medical literature, you will need to understand the fundamentals of epidemiologic study design, the sources of bias, and the role of chance. Every discipline has its own jargon. We will cover the terminology used in clinical research, including the basic statistical jargon. The most important concepts are covered in the lectures, and the small groups give you an opportunity to apply what you have learned from the lecture material to actual medical journal articles.
As future physicians you have an obligation to remain current in your field of practice and to treat patients according to generally accepted standards of care.
Question: In that same study, systolic blood pressure is collected and categorized into one of the following groups: < 120, 120-140, > 140. What type of data are these?
Choices:
nominal
ordinal
discrete
continuous
block
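As a rough illustration of the grouping described in this question, the sketch below bins individual readings into the three groups. The function name, the treatment of the boundary values 120 and 140, and the sample readings are assumptions made for the example, not part of the study. The point it shows is that the groups have a natural low-to-high order but unequal, open-ended widths, which is the defining feature of ordinal data.

```python
# Groups from the question, listed from lowest to highest.
CATEGORIES = ["< 120", "120-140", "> 140"]

def sbp_category(sbp):
    """Hypothetical helper: map a systolic reading (mmHg) to its group.

    Assumes the middle group includes both 120 and 140.
    """
    if sbp < 120:
        return CATEGORIES[0]
    if sbp <= 140:
        return CATEGORIES[1]
    return CATEGORIES[2]

readings = [118, 120, 135, 140, 152]  # illustrative values, not study data
labels = [sbp_category(r) for r in readings]

# The groups can be ranked, but the distance between ranks is not meaningful.
ranks = [CATEGORIES.index(label) for label in labels]
print(list(zip(readings, labels, ranks)))
```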
Question: What measure would be the most appropriate to measure the central tendency of a distribution that has a lot of outliers?
Choices:
The median would be the ideal measure of central tendency, since it is robust to outliers.
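A small sketch can make the robustness claim concrete. The numbers below are invented purely for illustration: one extreme value is appended to a short list, and the shift in the mean is compared with the shift in the median.

```python
from statistics import mean, median

# Hypothetical data chosen only to illustrate the idea.
values = [4, 5, 5, 6, 7]
with_outlier = values + [100]  # add a single extreme observation

mean_shift = mean(with_outlier) - mean(values)
median_shift = median(with_outlier) - median(values)

print(f"mean moved by {mean_shift:.2f}, median moved by {median_shift:.2f}")
```

In this example the mean moves by more than 15 units while the median moves by half a unit, which is what "robust to outliers" means in practice.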
Question: If the mean number of doughnut holes consumed by each customer is 6.5, with a standard deviation of 2.5, and if we assume that the number of doughnut holes eaten follows a normal distribution, we would expect 68% of customers to have eaten how many doughnut holes?
Choices:
4 to 9
2.5 to 11.5
1 to 14
6.5 to 15
Cannot be determined from the information given. You need to know the sample size.
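The arithmetic behind this question can be sketched as follows, assuming (as the question does) a normal distribution, so that about 68% of values lie within one standard deviation of the mean. The variable names are illustrative. Note that only the mean and standard deviation are needed; the sample size does not enter the calculation.

```python
mean_holes = 6.5  # mean from the question
sd_holes = 2.5    # standard deviation from the question

# About 68% of a normal distribution lies within one SD of the mean.
low = mean_holes - sd_holes
high = mean_holes + sd_holes

print(f"About 68% of customers ate between {low} and {high} doughnut holes")
```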
Question: The following scenario is applied to questions 6 and 7: You are a pediatrician who only sees patients who are 6 years old and have red hair. The distribution of the height of 6-year-olds is approximately normal with a mean of 40 inches and a standard deviation of 3 inches. Of 100 such patients, about how many would be expected to have heights between 37 and 43 inches?
Choices:
37 and 43 inches are one standard deviation below and above the mean of 40. By the 68-95-99 rule described in the notes, we would expect about 68%, or 68 out of 100 patients, to fall within one standard deviation of the mean.
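The rule of thumb used in this answer can be checked against the normal distribution directly. The sketch below uses `NormalDist` from Python's standard library with the mean and standard deviation given in the scenario; the helper name `share_within` is an assumption for the example.

```python
from statistics import NormalDist

# Height distribution from the scenario: mean 40 inches, SD 3 inches.
heights = NormalDist(mu=40, sigma=3)

def share_within(k):
    """Share of the distribution within k standard deviations of the mean."""
    upper = heights.mean + k * heights.stdev
    lower = heights.mean - k * heights.stdev
    return heights.cdf(upper) - heights.cdf(lower)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD: {share:.1%}")
```

The one-standard-deviation share corresponds to heights between 37 and 43 inches and comes out at roughly 68%, in line with the rule.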
Question: Which of the following statements is CORRECT concerning measures of central tendency and variability?
Choices:
Medians are not robust to outliers, while means are robust to outliers.
Means and medians can never be negative.
It is common to see the variance of a continuous variable displayed in journal articles along with means, since the variance is a measure of dispersion in the original units measured.
The median of the following data: 3, 5, 10, 11; is 7.5.
The interquartile range is the difference between the 50th and 25th percentiles.
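Several of the choices above can be checked with a few lines of standard-library Python. This is a sketch only: quartile conventions differ between software packages, so the exact quartile values may not match other tools, but the definitions do not change. The interquartile range is the 75th percentile minus the 25th, and the variance is the square of the standard deviation, so it is in squared units rather than the original units.

```python
from statistics import median, quantiles, stdev, variance

data = [3, 5, 10, 11]  # data from the answer choice

# With an even number of values, the median is the average of the two middle values.
mid = median(data)  # (5 + 10) / 2

# Quartiles: q1 = 25th, q2 = 50th, q3 = 75th percentile.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1  # interquartile range: 75th minus 25th percentile

# The standard deviation is in the original units; the variance is its square.
sd = stdev(data)
var = variance(data)

print(f"median={mid}, IQR={iqr}, SD={sd:.2f}, variance={var:.2f}")
```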
Question: The following scenario is applied to questions 6 and 7: You are a pediatrician who only sees patients who are 6 years old and have red hair. The distribution of the height of 6-year-olds is approximately normal with a mean of 40 inches and a standard deviation of 3 inches. At what percentile is Billy's height compared with all other 6-year-olds?
Choices:
Billy is 2 standard deviations above the mean (40 + 2 × 3 = 46 inches). Therefore, according to the theoretical normal distribution, he is at approximately the 97.5th percentile. In other words, about 97.5% of all 6-year-olds would have heights lower than Billy's.
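The percentile can also be computed from the normal cumulative distribution function rather than the rule of thumb. In the sketch below, Billy's height is taken to be 2 standard deviations above the mean, as the answer states; the variable names are illustrative.

```python
from statistics import NormalDist

# Height distribution from the scenario: mean 40 inches, SD 3 inches.
heights = NormalDist(mu=40, sigma=3)

# Assumption taken from the answer: Billy is 2 SDs above the mean.
billy_height = heights.mean + 2 * heights.stdev

# The percentile is the share of the distribution below Billy's height.
percentile = heights.cdf(billy_height) * 100

print(f"Billy ({billy_height} in) is at about the {percentile:.1f}th percentile")
```

The exact value is slightly above 97.5 because the "95% within 2 standard deviations" rule is itself a rounding; the two figures agree to within a fraction of a percentage point.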
Question: What would be the best way to display the association between two continuous variables, BMI and age?
Choices:
boxplot
scatterplot
pie chart
histogram
bar chart
Question: In a study, racial and ethnic data are collected for each study participant. Each study participant is classified into one of the following categories: White, Black, Asian, Native American, Native Hawaiian, Pacific Islander, or other. What type of variable is race/ethnicity?
Choices:
nominal
ordinal
discrete
continuous
block