@xtrmstep
Created December 28, 2024 14:42
for article "Data Series Normalization Techniques" at Medium
| Test Name | Null Hypothesis | Decision Criteria | Limitations | Use Cases | Description |
| --- | --- | --- | --- | --- | --- |
| Shapiro-Wilk Test | The data is normally distributed. | If p > 0.05, fail to reject H₀; the data is consistent with a normal distribution. | Sensitive to sample size; for large datasets (>5000 samples) the p-value may be unreliable. | General-purpose normality test for small to medium datasets. | A widely used test that checks how well the data aligns with a normal distribution. |
| Kolmogorov-Smirnov Test (K-S Test) | The data follows a specified distribution (e.g., normal). | If p > 0.05, fail to reject H₀; the data is consistent with the specified distribution. | Less sensitive to differences in the tails than near the center of the distribution; less effective for small sample sizes. | Testing goodness-of-fit to any distribution, including normality. | Compares the empirical distribution of the data to a reference distribution. |
| Anderson-Darling Test | The data is normally distributed. | Critical values are provided; if the test statistic < critical value at the chosen significance level, fail to reject H₀. | Requires predefined significance levels; critical values depend on the distribution being tested and on the sample size. | Assessing normality in specific use cases requiring stricter tests. | Enhances the K-S Test by giving more weight to the tails of the distribution. |
| D'Agostino and Pearson's Test | The data is normally distributed. | If p > 0.05, fail to reject H₀; the data is consistent with a normal distribution. | Sensitive to skewness and kurtosis; less effective for small datasets (<20 samples). | Situations requiring an assessment of skewness and kurtosis for normality. | Combines skewness and kurtosis to test for normality, focusing on distribution shape. |
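
The decision rules in the table can be sketched in Python with SciPy. This is a minimal illustration, not code from the article: the helper name `normality_report`, the default `alpha=0.05`, and the choice to fit the K-S reference normal from the sample mean and standard deviation are all assumptions made here. Estimating the parameters from the same data makes the K-S p-value conservative, so treat that row as indicative only.

```python
import numpy as np
from scipy import stats


def normality_report(sample, alpha=0.05):
    """Run the four normality tests and apply each decision rule (hypothetical helper)."""
    x = np.asarray(sample, dtype=float)
    report = {}

    # Shapiro-Wilk: reject H0 (normality) when p <= alpha.
    stat, p = stats.shapiro(x)
    report["shapiro_wilk"] = {"statistic": float(stat), "p_value": float(p),
                              "reject_h0": bool(p <= alpha)}

    # Kolmogorov-Smirnov against a normal distribution whose mean and std
    # are estimated from the sample (assumption: makes the p-value conservative).
    stat, p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))
    report["kolmogorov_smirnov"] = {"statistic": float(stat), "p_value": float(p),
                                    "reject_h0": bool(p <= alpha)}

    # Anderson-Darling: compare the statistic with the critical value whose
    # significance level (given in percent) is closest to alpha.
    ad = stats.anderson(x, dist="norm")
    levels = np.asarray(ad.significance_level, dtype=float)
    idx = int(np.argmin(np.abs(levels - alpha * 100)))
    critical = float(ad.critical_values[idx])
    report["anderson_darling"] = {"statistic": float(ad.statistic),
                                  "critical_value": critical,
                                  "reject_h0": bool(ad.statistic >= critical)}

    # D'Agostino and Pearson: omnibus test built from skewness and kurtosis.
    stat, p = stats.normaltest(x)
    report["dagostino_pearson"] = {"statistic": float(stat), "p_value": float(p),
                                   "reject_h0": bool(p <= alpha)}

    return report


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    for name, result in normality_report(rng.normal(size=200)).items():
        print(name, result)
```

Running several tests side by side is a design choice for comparison only; in practice one test is usually picked up front based on the sample size and on which departures from normality matter.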