# 12.4 Testing the significance of the correlation coefficient


The correlation coefficient, r, tells us about the strength and direction of the linear relationship between x and y. However, the reliability of the linear model also depends on how many observed data points are in the sample. We need to look at both the value of the correlation coefficient r and the sample size n together.

We perform a hypothesis test of the "significance of the correlation coefficient" to decide whether the linear relationship in the sample data is strong enough to use to model the relationship in the population.

The sample data are used to compute r, the correlation coefficient for the sample. If we had data for the entire population, we could find the population correlation coefficient. But because we have only sample data, we cannot calculate the population correlation coefficient. The sample correlation coefficient, r, is our estimate of the unknown population correlation coefficient.

• The symbol for the population correlation coefficient is ρ, the Greek letter "rho."
• ρ = population correlation coefficient (unknown)
• r = sample correlation coefficient (known; calculated from sample data)
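
To make "calculated from sample data" concrete, here is a minimal sketch of how r could be computed from paired observations. The function name `sample_correlation` and the five data points are made up for illustration; they do not come from this section.

```python
import math


def sample_correlation(xs, ys):
    """Return the sample correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = sample_correlation(x, y)
print(round(r, 4))  # 0.7746
```

The value printed is only the sample statistic r. It says nothing yet about whether ρ differs from zero, which is what the hypothesis test addresses.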

The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is "close to zero" or "significantly different from zero." We decide this based on the sample correlation coefficient r and the sample size n.

## If the test concludes that the correlation coefficient is significantly different from zero, we say that the correlation coefficient is "significant."

• Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.
• What the conclusion means: There is a significant linear relationship between x and y. We can use the regression line to model the linear relationship between x and y in the population.

## If the test concludes that the correlation coefficient is not significantly different from zero (it is close to zero), we say that the correlation coefficient is "not significant."

• Conclusion: "There is insufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is not significantly different from zero."
• What the conclusion means: There is not a significant linear relationship between x and y. Therefore, we CANNOT use the regression line to model a linear relationship between x and y in the population.

## Note

• If r is significant and the scatter plot shows a linear trend, the line can be used to predict the value of y for values of x that are within the domain of observed x values.
• If r is not significant OR if the scatter plot does not show a linear trend, the line should not be used for prediction.
• Even if r is significant and the scatter plot shows a linear trend, the line may NOT be appropriate or reliable for prediction OUTSIDE the domain of observed x values in the data.
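
The domain restriction in these notes can be enforced mechanically. The sketch below assumes a least-squares line has been fitted to a hypothetical sample; the names `fit_line` and `predict_within_domain` are illustrative. It covers only the domain check, so the significance of r and the shape of the scatter plot still need to be judged separately.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_within_domain(x_new, xs, intercept, slope):
    """Predict y only when x_new lies inside the observed range of x.

    Returns None for values outside that range, where the line
    may not be appropriate or reliable.
    """
    if not (min(xs) <= x_new <= max(xs)):
        return None
    return intercept + slope * x_new


# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x, y)

inside = predict_within_domain(3, x, intercept, slope)    # within 1..5
outside = predict_within_domain(10, x, intercept, slope)  # beyond 5
print(inside, outside)
```

Returning `None` rather than a number for out-of-range inputs is one design choice among several; raising an error would work equally well.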

## Performing the hypothesis test

• Null Hypothesis: H₀: ρ = 0
• Alternate Hypothesis: Hₐ: ρ ≠ 0
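
One common way to carry out this test is with the statistic t = r·√(n − 2) / √(1 − r²), compared against a critical value from the t-distribution with n − 2 degrees of freedom. The sketch below assumes that method; this part of the section does not state which procedure it uses. The function names and the values r = 0.6, n = 10 and n = 30 are hypothetical, and the critical values 2.306 (df = 8) and 2.048 (df = 28) are the standard two-tailed table values at α = 0.05.

```python
import math


def correlation_t_statistic(r, n):
    """Return t = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("the test needs at least 3 data points (df = n - 2)")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def is_significant(r, n, t_critical):
    """Reject H0 when |t| exceeds the two-tailed critical value.

    t_critical must be looked up for df = n - 2 at the chosen alpha.
    """
    return abs(correlation_t_statistic(r, n)) > t_critical


# Same hypothetical r, two hypothetical sample sizes.
t_small = correlation_t_statistic(0.6, 10)   # df = 8
t_large = correlation_t_statistic(0.6, 30)   # df = 28
print(round(t_small, 3), is_significant(0.6, 10, 2.306))
print(round(t_large, 3), is_significant(0.6, 30, 2.048))
```

With these numbers the same r = 0.6 is not significant for n = 10 but is significant for n = 30, which shows why r and n have to be considered together.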

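## What the hypotheses mean in words:

• Null hypothesis H₀: The population correlation coefficient is not significantly different from zero. There is not a significant linear relationship between x and y in the population.
• Alternate hypothesis Hₐ: The population correlation coefficient is significantly different from zero. There is a significant linear relationship between x and y in the population.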
