library(dplyr)
library(ggplot2)
library(ggh4x)
library(geomtextpath)

set.seed(12345)
N <- 150
# NOTE: the original data.frame() call was truncated here; a second
# normal sample and a grouping variable are assumed to close it:
data <- data.frame(response = c(rnorm(N, mean = 1, sd = 1),
                                rnorm(N, mean = 2, sd = 1)),
                   group = rep(c("A", "B"), each = N))
You have probably heard of Anscombe's quartet. It is almost a textbook justification for looking at the data first and not trusting descriptive statistics alone!

I decided to make my own: Olszewski's quartet! It shows 4 faces in different moods. The mean and variance of the Y coordinate are exactly (NOT approximately!) the same for all 4 faces, and Pearson's correlation is almost 0.
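How can the moments be exactly equal rather than merely approximately so? One way (a minimal sketch of the general trick, not necessarily how the quartet was actually built) is to linearly rescale each face's Y coordinates to a common target mean and SD:

```r
# Rescale y so that its sample mean and SD hit the targets EXACTLY
# (up to floating-point precision), whatever the shape of the data
force_moments <- function(y, target_mean = 0, target_sd = 1) {
  (y - mean(y)) / sd(y) * target_sd + target_mean
}

set.seed(1)
y1 <- force_moments(rnorm(100))   # two very different shapes...
y2 <- force_moments(rexp(100))
c(mean(y1), mean(y2))             # ...identical means
c(sd(y1), sd(y2))                 # ...identical SDs
```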
Despite the widespread and nonsensical claim that "logistic regression is not a regression", it is one of the key regression and hypothesis-testing tools in experimental research (e.g., clinical trials).

Let me show you how logistic regression (with a few extensions) can be used to test hypotheses about fractions (%) of successes, replacing the classic "test for proportions". Namely, it can replicate the results of:
- the Wald (normal approximation) z test for 2 proportions with non-pooled standard errors (common in clinical trials), via LS-means on the response (prediction) scale or the AME (average marginal effect); see the first sketch after this list
- the Rao score (normal approximation) z test for 2 proportions with pooled standard errors (just what `prop.test()` does in R); see the second sketch below
- the z test for multiple (2+) proportions
- an ANOVA-like (joint) test for multiple categorical predictors (n-way ANOVA), and (n-way) ANCOVA if you employ numerical covariates; see the third sketch below
- the **Cochran-Mantel-Haenszel** test
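First, the unpooled Wald z test. Below is a minimal sketch with made-up counts (35/100 vs 50/100 successes); the `marginaleffects` package is assumed here, though `emmeans` with `regrid = "response"` would work too. For a saturated one-factor model, the delta-method SE of the difference in predicted proportions equals the non-pooled SE, so the z statistic matches the classic test:

```r
library(marginaleffects)

# Made-up counts: 35/100 successes in arm A, 50/100 in arm B
tab <- data.frame(success = c(35, 50), total = c(100, 100),
                  arm = factor(c("A", "B")))
m <- glm(cbind(success, total - success) ~ arm,
         family = binomial(), data = tab)

# AME: difference in predicted proportions with a delta-method SE,
# which here equals the non-pooled SE of the classic Wald z test
avg_comparisons(m, variables = "arm")

# Hand-rolled unpooled z test for comparison
p1 <- 35/100; p2 <- 50/100
z  <- (p2 - p1) / sqrt(p1 * (1 - p1) / 100 + p2 * (1 - p2) / 100)
c(z = z, p = 2 * pnorm(-abs(z)))
```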
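Next, the pooled-SE test. The Rao score test of the treatment effect in a logistic model is the Pearson chi-square, i.e. the square of the pooled z statistic, so it reproduces `prop.test()` without continuity correction. A sketch with the same made-up counts:

```r
tab <- data.frame(success = c(35, 50), total = c(100, 100),
                  arm = factor(c("A", "B")))
m <- glm(cbind(success, total - success) ~ arm,
         family = binomial(), data = tab)

# Rao score test: its chi-square equals z^2 from the pooled z test...
anova(m, test = "Rao")

# ...which is exactly what prop.test() reports (correct = FALSE
# disables the continuity correction)
prop.test(x = c(35, 50), n = c(100, 100), correct = FALSE)
```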
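Finally, the ANOVA-like joint tests. A sketch assuming the `car` package, on hypothetical data with two categorical predictors; adding a numerical covariate to the formula turns this into an ANCOVA-like analysis:

```r
library(car)

# Hypothetical 2-way layout: treatment arm and sex, binary response
set.seed(1)
d <- data.frame(arm  = factor(rep(c("A", "B"), each = 100)),
                sex  = factor(rep(c("F", "M"), times = 100)),
                resp = rbinom(200, 1, prob = 0.4))
m2 <- glm(resp ~ arm * sex, family = binomial(), data = d)

# Joint (Type II) chi-square test for each term, as in an n-way ANOVA
Anova(m2, type = 2)
```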
# Let's make some data to play with
set.seed(1234)
v1 <- rexp(500)
v2 <- rnorm(500) + log(2)
v3 <- -rgamma(500, 2.5, 3)
v4 <- runif(500, -2, 4)

# Look at the data, one panel per variable
layout(matrix(1:4, nrow = 2))
# (assumption: the truncated original plotted each variable; histograms used here)
hist(v1); hist(v2); hist(v3); hist(v4)