Skip to content

Instantly share code, notes, and snippets.

@yjunechoe
Last active February 10, 2025 02:32
Show Gist options
  • Save yjunechoe/3090c41a10d7cd9b5b4026d39ee7d5de to your computer and use it in GitHub Desktop.
Save yjunechoe/3090c41a10d7cd9b5b4026d39ee7d5de to your computer and use it in GitHub Desktop.
If you z-score the response variable by group, will by-group intercept always be zero? (no - depends on meaning of intercept - ex: in dummy coding)
library(tidyverse)
set.seed(123)
df <- crossing(
subject = 1:10,
prime = c("yes", "no")
) %>%
mutate(
intercept = 500,
beta_prime = -100,
) %>%
mutate(
subject_intercept = rep(rnorm(n = 1, sd = 200), 2),
subject_slope = rep(rnorm(n = 1, sd = 100), 2),
.by = subject
) %>%
reframe(
item = 1:30,
.by = everything()
) %>%
mutate(
noise = rnorm(n(), sd = 30),
RT = intercept + (beta_prime * (prime == "yes")) +
subject_intercept + (subject_slope * (prime == "yes")) +
noise
) %>%
mutate(
RT_z = scale(RT)[,1],
.by = subject
)
df %>%
summarize(mean_RT_z = mean(RT_z), .by = c(subject, prime)) %>%
pivot_wider(names_from = prime, values_from = mean_RT_z)
# raw RT plot by subject
df %>%
ggplot(aes(RT, fill = prime)) +
geom_density(alpha = .5) +
# intercept (prime=no) line
geom_vline(
aes(xintercept = mean_primeno),
color = "black", linetype = "dashed",
data = . %>%
filter(prime == "no") %>%
summarize(mean_primeno = mean(RT))
) +
# individual intercept (prime=no) line
geom_vline(
aes(xintercept = mean_primeno),
color = "red",
data = . %>%
filter(prime == "no") %>%
summarize(mean_primeno = mean(RT), .by = subject)
) +
facet_wrap(~ subject, axes = "all_x")
# z-scored RT plot by subject
df %>%
ggplot(aes(RT_z, fill = prime)) +
geom_density(alpha = .5) +
# intercept (prime=no) line
geom_vline(
aes(xintercept = mean_primeno),
color = "black", linetype = "dashed",
data = . %>%
filter(prime == "no") %>%
summarize(mean_primeno = mean(RT_z))
) +
# individual intercept (prime=no) line
geom_vline(
aes(xintercept = mean_primeno),
color = "red",
data = . %>%
filter(prime == "no") %>%
summarize(mean_primeno = mean(RT_z), .by = subject)
) +
facet_wrap(~ subject, axes = "all_x")
# by-subject intercept == variation in mean z-scored RT when prime=="no" (assuming dummy coding)
# Subject intercept ~= .6
df %>%
filter(prime == "no") %>%
summarize(mean_noprime_RT_z = mean(RT_z), .by = subject) %>%
summarize(sd(mean_noprime_RT_z))
library(lme4)
mod <- lmer(
RT_z ~ 1 + prime_yes + (1 + prime_yes || subject),
df %>%
# Explicit numeric coding to help remove the corr term
mutate(prime_yes = as.integer(prime == "yes"))
)
# Subject intercept ~= .577
VarCorr(mod)
@yjunechoe
Copy link
Author

Raw RT subject intercept (prime=no) variation:

image

z-scored RT subject intercept (prime=no) variation (still variation!):

image

@yjunechoe
Copy link
Author

Note: for the exact same data, singular fit in by-subject intercept if the predictor is sum-coded (= intercept represents grand mean = zero from z-scoring):

lmer(
  RT_z ~ 1 + prime_sumcoded + (1 + prime_sumcoded || subject),
  df %>% 
    mutate(prime_sumcoded = ifelse(prime == "yes", .5, -.5))
) |>
  VarCorr()
#> boundary (singular) fit: see help('isSingular')
#>  Groups    Name           Std.Dev.
#>  subject   (Intercept)    0.00000 
#>  subject.1 prime_sumcoded 1.19100 
#>  Residual                 0.64684

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment