P-values are used to assess the statistical significance of observed data. We also use p-values to judge whether an apparent relationship between two variables is likely to be real rather than a product of chance. For example, in a linear model, we can use p-values to assess whether each feature included in the model is related to the target variable, or whether two independent variables are related to each other.
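As a rough illustration of a p-value for the relationship between two variables, the sketch below uses a permutation approach: it shuffles one variable many times and counts how often the shuffled data shows a correlation at least as strong as the observed one. The data (`hours_studied`, `exam_score`) and the function names are made up for this example, and a permutation test is only one of several ways to obtain such a p-value.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_p_value(xs, ys, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for 'xs and ys are unrelated'."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling ys breaks any real link between the two variables.
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, so p is never 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data, invented purely for illustration.
hours_studied = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
exam_score = [52, 55, 61, 60, 68, 70, 75, 74, 82, 88]

r = pearson_r(hours_studied, exam_score)
p = correlation_p_value(hours_studied, exam_score)
print(f"r = {r:.3f}, p = {p:.4f}")
```

A small p-value here means that shuffled data rarely produces a correlation as strong as the observed one.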
In inferential statistics, the null hypothesis is a general statement that nothing out of the ordinary is happening. For example,
- there is no relationship between groups or variables, or
- there is no difference between the two measured phenomena.
Suppose we are conducting an experiment on a new drug that is claimed to increase IQ. We recruit a random sample of volunteers and randomly assign them to two groups, Group A and Group B.
Group A is the control group and takes a placebo (no active ingredients), while Group B takes the actual pill.
In this case, the null hypothesis would be: “There is no difference between the groups. That is, the drug has no benefit.”
We measure the IQs of all the volunteers and start giving the pills. After a certain period of time, we re-measure the participants’ IQs.
Suppose we do not observe any change in the Group A subjects (the placebo group), but there is an average increase of 5 points among the subjects in Group B.
Null Hypothesis: There is no difference between the groups.
Alternate Hypothesis: There is a difference between the groups, that is, the drug has an effect.
The question before us is: “Is the difference statistically significant, or is it just a fluke?”
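The p-value measures the strength of evidence against the null hypothesis. It is the probability of observing a result at least as extreme as the one we obtained, assuming the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis.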
To determine the p-value, statistical hypothesis testing methods are used.
Statistical significance tests help determine whether observed differences or patterns in data are likely due to a real effect or to random chance.
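A permutation test is one simple method of this kind, and the sketch below applies it to the drug experiment. The IQ changes are invented numbers chosen so that the placebo group averages no change and the drug group averages a 5-point gain. The function name `permutation_p_value` and its parameters are assumptions made for this example. The idea is that if the null hypothesis holds, the group labels are interchangeable, so we reshuffle them many times and see how often chance alone produces a difference as large as the one observed.

```python
import random
from statistics import mean


def permutation_p_value(control, treated, n_permutations=10_000, seed=0):
    """Return (observed difference in means, two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels carry no information,
        # so any random split of the pooled data is equally plausible.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            hits += 1
    # The +1 terms count the observed split itself, so p is never 0.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical IQ changes (after minus before), invented for illustration.
placebo_change = [0, -1, 2, 1, -2, 0, 1, -1, 0, 0]  # Group A, mean 0
drug_change = [5, 3, 7, 6, 4, 5, 8, 2, 6, 4]        # Group B, mean 5

observed_diff, p_value = permutation_p_value(placebo_change, drug_change)
print(f"observed difference = {observed_diff}, p-value = {p_value:.4f}")
```

With these made-up numbers, very few random relabelings reproduce a 5-point gap, so the p-value comes out small and the data counts as strong evidence against the null hypothesis. Real data with more overlap between the groups would give a larger p-value.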