library(dplyr)
library(ggplot2)
library(ggh4x)
library(geomtextpath)

set.seed(12345)
N <- 150
# NOTE: the original data.frame() call was truncated here; a second
# normal sample and a grouping variable are assumed to close it:
data <- data.frame(response = c(rnorm(N, mean = 1, sd = 1),
                                rnorm(N, mean = 2, sd = 1)),
                   group = rep(c("A", "B"), each = N))
You have probably heard of Anscombe's quartet. It is almost a textbook justification for looking at the data first and not trusting descriptive statistics alone!

I decided to make my own: Olszewski's quartet! It shows 4 faces in different moods. The mean and variance of the Y coordinate are exactly (NOT approximately!) the same for all 4 faces, and Pearson's correlation is almost 0.
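How can the moments be exactly equal rather than merely approximately so? One way (a minimal sketch of the general trick, not necessarily how the quartet was actually built) is to linearly rescale each face's Y coordinates to a common target mean and SD:

```r
# Rescale y so that its sample mean and SD hit the targets EXACTLY
# (up to floating-point precision), whatever the shape of the data
force_moments <- function(y, target_mean = 0, target_sd = 1) {
  (y - mean(y)) / sd(y) * target_sd + target_mean
}

set.seed(1)
y1 <- force_moments(rnorm(100))   # two very different shapes...
y2 <- force_moments(rexp(100))
c(mean(y1), mean(y2))             # ...identical means
c(sd(y1), sd(y2))                 # ...identical SDs
```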
Despite the widespread and nonsensical claim that "logistic regression is not a regression", it is one of the key regression and hypothesis-testing tools in experimental research (e.g., clinical trials).

Let me show you how logistic regression (with a few extensions) can be used to test hypotheses about fractions (%) of successes, replacing the classic "test for proportions". Namely, it can replicate the results of:
- the Wald (normal approximation) z test for 2 proportions with non-pooled standard errors (common in clinical trials), via LS-means on the response (prediction) scale or the AME (average marginal effect); see the first sketch after this list
- the Rao score (normal approximation) z test for 2 proportions with pooled standard errors (just what `prop.test()` does in R); see the second sketch below
- the z test for multiple (2+) proportions
- an ANOVA-like (joint) test for multiple categorical predictors (n-way ANOVA), and (n-way) ANCOVA if you employ numerical covariates; see the third sketch below
- the **Cochran-Mantel-Haenszel** test
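First, the unpooled Wald z test. Below is a minimal sketch with made-up counts (35/100 vs 50/100 successes); the `marginaleffects` package is assumed here, though `emmeans` with `regrid = "response"` would work too. For a saturated one-factor model, the delta-method SE of the difference in predicted proportions equals the non-pooled SE, so the z statistic matches the classic test:

```r
library(marginaleffects)

# Made-up counts: 35/100 successes in arm A, 50/100 in arm B
tab <- data.frame(success = c(35, 50), total = c(100, 100),
                  arm = factor(c("A", "B")))
m <- glm(cbind(success, total - success) ~ arm,
         family = binomial(), data = tab)

# AME: difference in predicted proportions with a delta-method SE,
# which here equals the non-pooled SE of the classic Wald z test
avg_comparisons(m, variables = "arm")

# Hand-rolled unpooled z test for comparison
p1 <- 35/100; p2 <- 50/100
z  <- (p2 - p1) / sqrt(p1 * (1 - p1) / 100 + p2 * (1 - p2) / 100)
c(z = z, p = 2 * pnorm(-abs(z)))
```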
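Next, the pooled-SE test. The Rao score test of the treatment effect in a logistic model is the Pearson chi-square, i.e. the square of the pooled z statistic, so it reproduces `prop.test()` without continuity correction. A sketch with the same made-up counts:

```r
tab <- data.frame(success = c(35, 50), total = c(100, 100),
                  arm = factor(c("A", "B")))
m <- glm(cbind(success, total - success) ~ arm,
         family = binomial(), data = tab)

# Rao score test: its chi-square equals z^2 from the pooled z test...
anova(m, test = "Rao")

# ...which is exactly what prop.test() reports (correct = FALSE
# disables the continuity correction)
prop.test(x = c(35, 50), n = c(100, 100), correct = FALSE)
```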
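Finally, the ANOVA-like joint tests. A sketch assuming the `car` package, on hypothetical data with two categorical predictors; adding a numerical covariate to the formula turns this into an ANCOVA-like analysis:

```r
library(car)

# Hypothetical 2-way layout: treatment arm and sex, binary response
set.seed(1)
d <- data.frame(arm  = factor(rep(c("A", "B"), each = 100)),
                sex  = factor(rep(c("F", "M"), times = 100)),
                resp = rbinom(200, 1, prob = 0.4))
m2 <- glm(resp ~ arm * sex, family = binomial(), data = d)

# Joint (Type II) chi-square test for each term, as in an n-way ANOVA
Anova(m2, type = 2)
```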
# Let's make some data to play with
set.seed(1234)
v1 <- rexp(500)
v2 <- rnorm(500) + log(2)
v3 <- -rgamma(500, 2.5, 3)
v4 <- runif(500, -2, 4)

# Look at the data, one panel per variable
layout(matrix(1:4, nrow = 2))
# (assumption: the truncated original plotted each variable; histograms used here)
hist(v1); hist(v2); hist(v3); hist(v4)