What is Hypothesis Testing?
Statistics is all about data. A large amount of data is useful only if we analyze it and draw conclusions from it. Hypothesis testing is one of the main tools for reaching such conclusions.
Statistical hypothesis testing checks an assumption (a hypothesis) made about a population and draws a conclusion from it. We test a sample that represents the whole population and, based on the results, either reject the hypothesis or fail to reject it (informally, "accept" it).
Pre-requisites: DSL | E1 Statistics
Steps in Hypothesis Testing
The three major steps are:
- Making an initial assumption.
  - We take the initial assumption as the null hypothesis, H0.
  - Example: We want to know whether the defendant is guilty or innocent, so we take H0 as "the defendant is not guilty".
- Collecting the data.
  - Here the data is the evidence.
  - Example: Fingerprints, DNA, etc.
- Using the evidence to decide whether to reject the hypothesis.
  - If the evidence is consistent with H0, we do not reject the null hypothesis.
  - If the evidence contradicts H0, we reject it in favor of the alternate hypothesis (H1).
Broken down further, the process consists of seven steps, as shown in the figure:
[Figure: Steps in Hypothesis Testing]
Null & Alternate Hypothesis
The null and alternate hypotheses are represented by H0 and H1 respectively.
Hypothesis 0 (H0): The assumption made about the population that needs to be tested. It is considered true until evidence is found against it.
Hypothesis 1 (H1): The opposite of that assumption. It is accepted when H0 is rejected.
Confusion Matrix
| Decision | H0 is true | H1 is true |
| --- | --- | --- |
| Accept H0 | OK | Type II Error |
| Reject H0 | Type I Error | OK |
Scenario-1: Suppose H0 is actually true, but the sample we happened to collect points against it. We then reject H0 and accept H1 even though H0 is true. This is called a Type 1 Error.
Scenario-2: Let's say, H0 = Market will crash. H1 = Market will not crash. Suppose H1 is actually true, but we do not have enough evidence to reject H0. We then keep H0 even though it is false. This is called a Type 2 Error.
Thus, both types of error play a major role in Hypothesis Testing. Which of the two is more costly depends on the problem, including the Machine Learning task the test is used for.
Significance level, P-value & Confidence Level
The significance level is represented by the Greek letter alpha (α).
The common values used for alpha are 0.1%, 1%, 5%, and 10%. A smaller alpha, such as 1% or 0.1%, demands stronger evidence before the null hypothesis is rejected.
The hypothesis test returns a probability value known as the p-value. Comparing this value with α, we either reject the null hypothesis and accept the alternate hypothesis, or fail to reject the null hypothesis.
- p-value = Probability (data at least as extreme as observed | Null Hypothesis)
- p-value <= α: Reject the null hypothesis and accept the alternate hypothesis
- p-value > α: Fail to reject the null hypothesis
Let us try out the above rule by flipping a single coin five times, taking H0 = the coin is fair.
Experiment Performed:
- After flipping the coin five times, we got five heads in a row, so X = 5
- Considering alpha = 0.05
- p-value: P(X=5 | H0)
Result:
- No. of possible outcomes of five flips = 2^5 = 32, of which only 1 has all five heads
- So, P(X=5 | H0) = 1/32 ≈ 0.03
- 0.03 signifies that a fair coin has only about a 3% chance of giving five heads in a row, which is less than alpha.
- P(X=5 | H0) ≈ 0.03 < alpha (0.05)
- Since such a result would be very unlikely if H0 were true, the null hypothesis (H0) is rejected and the alternate hypothesis is accepted.
- Confidence level = 1 - significance level (α), here 1 - 0.05 = 0.95
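The coin-flip arithmetic can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `binomial_upper_tail` is made up for illustration. It computes the one-sided p-value used in the example (only the all-heads outcome counts as extreme). A two-sided version that also counts five tails would give 2/32.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


alpha = 0.05                          # significance level
p_value = binomial_upper_tail(5, 5)   # five heads in five flips, H0: fair coin
reject_h0 = p_value <= alpha          # decision rule: reject when p-value <= alpha
confidence_level = 1 - alpha

print(f"p-value = {p_value:.5f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
print(f"Confidence level = {confidence_level:.2f}")
```

With five heads out of five, the p-value is exactly 1/32, so the sketch reaches the same decision as the worked example.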
For more information on p-value refer to:
https://www.analyticsvidhya.com/blog/2019/09/everything-know-about-p-value-from-scratch-data-science/
Types of tests and when to use which?
For now, we will focus on the concepts behind the t-test, the chi-square test, and ANOVA, using the small sample below.
| Gender | Age Group | Weight (kg) | Height (m) |
| --- | --- | --- | --- |
| M | Elder | 70 | 1.4 |
| F | Adult | 65 | 1.2 |
| M | Adult | 65 | 1.4 |
| M | Child | 20 | 1 |
| F | Adult | 75 | 1.3 |
| M | Elder | 80 | 1.4 |
Scenario-1:
Consider the first column of the table, Gender, which is a categorical variable. Our question would be: Is there any difference between the proportions of males and females?
Note: H0 is always assumed to be true initially.
H0 = No difference.
H1 = There is a difference.
When we create a bar plot for M and F, we can see whether the sample shows a difference. However, that holds ONLY for the sample data and NOT for the whole population, which is why we start from H0.
To apply the test, we ask: assuming H0 is true, how likely is it to see a difference at least as large as the one in our sample? That likelihood is the p-value.
Since we have one categorical feature, we apply the 'One sample proportion test'. The significance level (here, α = 0.05) needs to be fixed before we perform the test. Suppose, for now, that the test returns a p-value <= 0.05. The result then falls outside the Confidence Interval, in the Rejection Region, so we reject our Null hypothesis H0 and accept H1.
Conclusion (under that assumption): Difference exists.
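To see what the test actually returns for the six-row table, here is a rough sketch of a one-sample proportion test done as an exact two-sided binomial test with the standard library. The choice of an exact test, the reference proportion of 0.5, and the helper name `exact_binomial_two_sided` are assumptions made for illustration; a z-test for proportions is the more common textbook form but is unreliable with only six rows.

```python
from math import comb

# Gender column of the sample table.
genders = ["M", "F", "M", "M", "F", "M"]


def exact_binomial_two_sided(k: int, n: int, p0: float = 0.5) -> float:
    """Two-sided exact binomial p-value: total probability of every
    outcome that is no more likely than the observed count k under H0."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float comparison
    return min(1.0, sum(q for q in pmf if q <= threshold))


alpha = 0.05
n_male = genders.count("M")
p_value = exact_binomial_two_sided(n_male, len(genders))  # H0: proportion of M = 0.5
reject_h0 = p_value <= alpha

print(f"{n_male} males out of {len(genders)}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

On this tiny sample the p-value is well above 0.05, so the data alone would not let us reject H0. A difference visible in a bar plot of six rows is not yet evidence about the population.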
Scenario-2:
We take two categorical features: Gender and Age group. And the question is: Does the proportion of males and females differ across Age groups?
H0 = No difference.
H1 = Difference exists.
Test = Chi-square test.
If p <= 0.05, H1 is accepted.
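The chi-square test of independence can be sketched by hand for the Gender and Age group columns. The code below is an illustration under two assumptions: it uses only the standard library, and it relies on the fact that for 2 degrees of freedom the chi-square tail probability is exp(-x/2), which is the case for this 2 x 3 table. With expected counts this small the chi-square approximation is not trustworthy, so treat the output as a demonstration of the mechanics only.

```python
from collections import Counter
from math import exp

# (Gender, Age group) pairs from the sample table.
rows = [("M", "Elder"), ("F", "Adult"), ("M", "Adult"),
        ("M", "Child"), ("F", "Adult"), ("M", "Elder")]

observed = Counter(rows)
gender_totals = Counter(g for g, _ in rows)
age_totals = Counter(a for _, a in rows)
n = len(rows)

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / n under independence (H0).
chi2 = 0.0
for g, g_total in gender_totals.items():
    for a, a_total in age_totals.items():
        expected = g_total * a_total / n
        chi2 += (observed[(g, a)] - expected) ** 2 / expected

dof = (len(gender_totals) - 1) * (len(age_totals) - 1)
if dof != 2:
    raise ValueError("the closed-form p-value below only holds for dof == 2")
p_value = exp(-chi2 / 2)  # chi-square survival function for 2 degrees of freedom

alpha = 0.05
reject_h0 = p_value <= alpha
print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```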
Scenario-3:
We take one numeric continuous variable: Height. And the question is: Based on the previous sample, does the mean height differ from a reference value (for example, a known population mean)?
H0 = No difference.
H1 = Difference exists.
Test = t-test.
If p <= 0.05, H1 is accepted.
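A one-sample t-test on the Height column could look like the sketch below. The reference mean `mu0 = 1.5` is a made-up value chosen purely for illustration, and instead of computing a p-value the sketch compares the t statistic with the tabulated two-sided critical value for 5 degrees of freedom at α = 0.05 (about 2.571), which is equivalent to checking p <= 0.05.

```python
from math import sqrt
from statistics import mean, stdev

# Height column (m) of the sample table.
heights = [1.4, 1.2, 1.4, 1.0, 1.3, 1.4]

mu0 = 1.5  # hypothetical reference mean, NOT taken from the article
n = len(heights)

# t = (sample mean - reference mean) / (sample standard deviation / sqrt(n))
standard_error = stdev(heights) / sqrt(n)
t_stat = (mean(heights) - mu0) / standard_error

t_critical = 2.571  # two-sided critical value from a t table, alpha = 0.05, df = n - 1 = 5
reject_h0 = abs(t_stat) > t_critical

print(f"mean = {mean(heights):.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Changing `mu0` to a value closer to the sample mean (about 1.28) makes the statistic shrink below the critical value, in which case H0 is not rejected.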
Scenario-4:
We take two numeric continuous variables: Height and Weight. And the question is: Based on the previous sample, is there a relationship between height and weight?
H0 = No relationship.
H1 = Relationship exists.
Test = Correlation.
If p <= 0.05, H1 is accepted.
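For the two numeric columns, a Pearson correlation test can be sketched as follows. The choice of Pearson's r (rather than, say, a rank correlation) is an assumption, as is the use of a tabulated critical value: the statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) is compared with the two-sided critical value for 4 degrees of freedom at α = 0.05 (about 2.776).

```python
from math import sqrt
from statistics import mean

# Weight (kg) and Height (m) columns of the sample table.
weights = [70, 65, 65, 20, 75, 80]
heights = [1.4, 1.2, 1.4, 1.0, 1.3, 1.4]

n = len(weights)
w_bar, h_bar = mean(weights), mean(heights)

# Pearson correlation coefficient r = Sxy / sqrt(Sxx * Syy)
s_wh = sum((w - w_bar) * (h - h_bar) for w, h in zip(weights, heights))
s_ww = sum((w - w_bar) ** 2 for w in weights)
s_hh = sum((h - h_bar) ** 2 for h in heights)
r = s_wh / sqrt(s_ww * s_hh)

# Under H0 (no correlation), this statistic follows a t distribution with n - 2 df.
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)
t_critical = 2.776  # two-sided critical value from a t table, alpha = 0.05, df = 4
reject_h0 = abs(t_stat) > t_critical

print(f"r = {r:.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```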
Scenario-5:
Consider any one numeric feature together with a categorical feature (or two or more of them). If a categorical feature has more than two sub-categories, we use the ANOVA test, which compares the mean of the numeric feature across those sub-categories.
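As an illustration, a one-way ANOVA of Weight across the three Age groups in the table can be computed by hand. Pairing Weight with Age group is an assumption made for this sketch. The F statistic is compared with the tabulated critical value for (2, 3) degrees of freedom at α = 0.05 (about 9.55). With one to three rows per group the result only demonstrates the mechanics.

```python
from statistics import mean

# Weight (kg) grouped by Age group, taken from the sample table.
groups = {
    "Elder": [70, 80],
    "Adult": [65, 65, 75],
    "Child": [20],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-group sum of squares: spread of the values around their own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_between = len(groups) - 1               # k - 1
df_within = len(all_values) - len(groups)  # n - k

f_stat = (ss_between / df_between) / (ss_within / df_within)
f_critical = 9.55  # from an F table, alpha = 0.05, df = (2, 3)
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.3f} with df = ({df_between}, {df_within})")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```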