Statistical hypothesis tests help us determine whether observed data differ significantly from what we would expect under a null hypothesis. These tests, such as Z-tests and T-tests, are essential tools in probability and statistics for making informed, data-driven decisions.
-
Z-test
- Used to determine if there is a significant difference between sample and population means when the population variance is known.
- Applicable for large sample sizes (n > 30) or when the population is normally distributed.
- Assumes that the data is continuous and follows a normal distribution.
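As a minimal sketch, a two-sided one-sample Z-test needs only the standard normal CDF; the numbers below are hypothetical and Python's standard library is assumed:

```python
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Two-sided one-sample Z-test; the population SD is assumed known."""
    z = (sample_mean - pop_mean) / (pop_sd / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical example: n = 36, sample mean 102, population mean 100, SD 6
z, p = one_sample_z_test(102, 100, 6, 36)   # z = 2.0, p ≈ 0.0455
```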
-
T-test (one-sample, two-sample, paired)
- One-sample T-test: Compares the mean of a single sample to a known population mean.
- Two-sample T-test: Compares the means of two independent samples to see if they are significantly different.
- Paired T-test: Compares means from the same group at different times (e.g., before and after treatment).
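Assuming SciPy is installed, each variant is a single call; the before/after measurements below are hypothetical:

```python
from scipy import stats

before = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
after  = [11.5, 11.4, 11.9, 11.6, 11.3, 11.8]

# One-sample: is the mean of `before` different from 12.0?
t1, p1 = stats.ttest_1samp(before, popmean=12.0)
# Two-sample: do the two groups, treated as independent, differ?
t2, p2 = stats.ttest_ind(before, after)
# Paired: the same subjects measured before and after treatment
t3, p3 = stats.ttest_rel(before, after)
```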
-
Chi-square test
- Tests the association between categorical variables by comparing observed frequencies to expected frequencies.
- Commonly used in contingency tables to assess independence.
- Requires an adequate sample size and expected frequencies (a common rule of thumb is an expected count of at least 5 per cell).
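Assuming SciPy, a test of independence on a hypothetical 2x2 contingency table looks like this:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = treatment/control, columns = improved/not
observed = [[30, 10],
            [20, 20]]
chi2, p, dof, expected = chi2_contingency(observed)
# dof = 1; expected frequencies under independence are [[25, 15], [25, 15]]
```

A small p-value suggests the row and column variables are not independent.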
-
F-test
- Used to compare the variances of two populations to determine if they are significantly different.
- Also underlies ANOVA, where an F-statistic compares between-group to within-group variability.
- Assumes that the data is normally distributed and independent.
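SciPy exposes the F distribution but no dedicated two-sample variance test, so a small helper (a sketch with hypothetical data, assuming SciPy) makes the computation explicit:

```python
from statistics import variance
from scipy.stats import f

def variance_f_test(x, y):
    """Two-sided F-test for equality of two population variances.

    Assumes both samples are independent draws from normal distributions.
    """
    F = variance(x) / variance(y)
    dfx, dfy = len(x) - 1, len(y) - 1
    # Two-sided p-value: double the smaller tail probability
    p = 2 * min(f.cdf(F, dfx, dfy), f.sf(F, dfx, dfy))
    return F, p

x = [1.0, 2.0, 3.0, 4.0, 5.0]        # sample variance 2.5
y = [2.0, 2.5, 3.0, 3.5, 4.0]        # sample variance 0.625
F_stat, p = variance_f_test(x, y)    # F = 4.0
```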
-
ANOVA (Analysis of Variance)
- Compares means across three or more groups to determine if at least one group mean is different.
- Can be one-way (one independent variable) or two-way (two independent variables).
- Assumes normality, independence, and homogeneity of variances.
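A one-way ANOVA on three hypothetical groups, assuming SciPy:

```python
from scipy.stats import f_oneway

# Hypothetical scores for three teaching methods
group_a = [85, 86, 88, 75, 78]
group_b = [80, 82, 84, 79, 81]
group_c = [60, 62, 65, 61, 63]
F, p = f_oneway(group_a, group_b, group_c)
# A small p suggests at least one group mean differs from the others
```

Note that a significant result does not say which groups differ; that requires a post-hoc comparison.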
-
Regression analysis
- Examines the relationship between a dependent variable and one or more independent variables.
- Can be simple (one independent variable) or multiple (more than one independent variable).
- Helps in predicting outcomes and understanding relationships between variables.
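Simple linear regression reduces to two closed-form formulas; a stdlib-only sketch with hypothetical data:

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]              # roughly y = 2x
intercept, slope = simple_linear_regression(x, y)
prediction = intercept + slope * 6          # predicted outcome at x = 6
```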
-
Wilcoxon rank-sum test
- A non-parametric test (equivalent to the Mann-Whitney U test) that compares the ranks of two independent samples to assess whether their population distributions differ.
- Used when the assumptions of the T-test are not met (e.g., non-normal data).
- Suitable for ordinal data or continuous data that do not meet normality assumptions.
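Assuming SciPy, the rank-sum test is one call; the two groups below are hypothetical scores:

```python
from scipy.stats import ranksums

# Hypothetical scores for two independent groups
group_low  = [3, 4, 2, 6, 2, 5]
group_high = [9, 7, 5, 10, 6, 8]
stat, p = ranksums(group_low, group_high)
# A negative statistic indicates the first group's ranks are lower
```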
-
Kruskal-Wallis test
- A non-parametric alternative to ANOVA for comparing three or more independent groups.
- Assesses whether the samples originate from the same distribution.
- Useful when the assumptions of ANOVA are violated, such as non-normality.
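With SciPy, the test takes the groups directly; the data below are hypothetical, with the third group clearly shifted:

```python
from scipy.stats import kruskal

a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
c = [10, 11, 12, 13, 14]
H, p = kruskal(a, b, c)
# H is compared against a chi-square distribution with k - 1 = 2 df
```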
-
Shapiro-Wilk test
- Tests the null hypothesis that a sample comes from a normally distributed population.
- Commonly used to assess normality before applying parametric tests.
- Sensitive to sample size: small samples have little power to detect non-normality, while very large samples can flag trivial deviations as significant.
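A typical normality check before choosing a parametric test, assuming SciPy and a seeded sample for reproducibility:

```python
import random
from scipy.stats import shapiro

random.seed(42)
sample = [random.gauss(0, 1) for _ in range(100)]  # draws from N(0, 1)
stat, p = shapiro(sample)
# A large p gives no evidence against normality; a small p would
# suggest falling back to a non-parametric test instead
```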
-
Kolmogorov-Smirnov test
- Compares the empirical distribution function of a sample with a reference probability distribution (e.g., normal distribution).
- Can be used for one-sample or two-sample tests to assess the goodness of fit.
- Non-parametric and does not assume a specific distribution for the data.
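A one-sample goodness-of-fit check against a reference distribution, assuming SciPy; the data are hypothetical values expected to be roughly uniform on [0, 1]:

```python
from scipy.stats import kstest

data = [0.1, 0.2, 0.25, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
stat, p = kstest(data, "uniform")   # one-sample test vs. Uniform(0, 1)
# `stat` is the largest gap between the empirical and reference CDFs
```

Passing a second sample instead of a distribution name performs the two-sample version of the test.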