# Estimating Marginal Means and Pairwise Tests Manually in Julia

Worked examples of estimating marginal means and conducting pairwise tests for mixed effects models (including random effects and unbalanced data) using matrix multiplication in Julia.
Julia
statistics
emmeans
pairwise tests
mixed effects models
Author

Glenn Williams

Published

July 4, 2024

This is the third in a series of blog posts working through how to generate standard errors for estimated marginal means and pooled standard errors for pairwise tests of these estimates from mixed effects models. The first shows how this can be done in R by hand for models fitted with lme4, largely replicating results provided by the excellent emmeans package. The second shows how this can be done in Python by hand for models fitted with statsmodels. This post aims to show how this can be done in Julia, while the next post will show how this can be done in Julia. This was motivated primarily by exploring methods for fitting mixed effects models in Python and Julia and finding no suitable alternative to emmeans in R for estimating marginal effects and conducting pairwise tests.

While in Julia we can use RCall to interface with R to perform model fitting with lme4 and follow-up tests using emmeans at the time of inception there were no Julia packages that recreate the functionality of emmeans for mixed effects models fitted in Julia. Since then, the Effects.jl package has been released which allows you to calculate estimated marginal means and pairwise tests by defining a reference grid. However, this doesn’t fully recreate the functionality of emmeans and understanding how to do so ourselves in Julia may be informative. So, we’ll do it again by hand.

Being able to calculate estimated marginal means in Julia directly is important as the MixedModels package for fitting multilevel models is a lot faster than lme4 and suffers from fewer issues with convergence. This might be especially important if one were to perform simulation-based power analysis where model complexity and the number of models might be high. Here, any improvement in model fitting speed is nice to have.

This post will therefore work through fitting a mixed effects model from a factorial design in Julia using the MixedModels.jl package before calculating estimated marginal means and conducting pairwise tests by hand.

# Getting Started

Here’s a quick breakdown of the packages used in each program, and why they are used:

• lme4: used for fitting mixed effects models in R.
• afex: used to obtain the data set with categorical predictors.
• emmeans: used for generating estimated marginal means and pairwise tests between group estimates in R.
• dplyr (as always!): used for general data cleaning.
library(lme4)
library(afex)
library(emmeans)
library(dplyr)
library(JuliaCall)

Since this blog post is created using knitr as the engine from R, I need to set up JuliaCall so I can pass data from R to Julia. Let’s get this set up.

j <- julia_setup()

From Julia we’ll need a number of packages, with their breakdown as follows:

• RCall: used for interfacing between Julia and R.
• DataFrames: unsurprisingly used to work with data frames.
• CSV: used for loading .csv files.
• Query: used to filter, join, sort and group data.
• MixedModels: used for fitting linear mixed effects models.
• StatsModels: needed to use the formula notation in defining a mixed effects model.
• LinearAlgebra: used for matrix operations, like getting the diagnonal of a variance-covariance matrix.
• Combinatorics: for generating combinations of pairwise contrasts.
• Distributions: used for working with distributions such as the Normal distribution for generating confidence intervals.
using DataFrames, CSV, MixedModels, Query, StatsModels, LinearAlgebra, Combinatorics, Distributions

We’ll first clean the data in R to show that the data set is consistent across both R and Julia methods.

fhch <- get(data("fhch2010")) |>   # load data from afex
filter(correct == TRUE) |>  # subset the data to correct answers
select(id, task, stimulus, item, rt, log_rt) # keep only used cols

We’ll then make the data accessible in Julia.

julia_assign("dat", fhch) # grab data from r

Then we can view the data in Julia to see what it looks like.

dat
## 12960×6 DataFrame
##    Row │ id    task    stimulus  item    rt       log_rt
##        │ Cat…  Cat…    Cat…      Cat…    Float64  Float64
## ───────┼─────────────────────────────────────────────────────
##      1 │ N1    naming  word      potted    1.091   0.0870947
##      2 │ N1    naming  word      engine    0.876  -0.132389
##      3 │ N1    naming  word      ideal     0.71   -0.34249
##      4 │ N1    naming  nonword   uares     1.21    0.19062
##      5 │ N1    naming  nonword   xazz      0.843  -0.170788
##      6 │ N1    naming  word      fill      0.785  -0.242072
##      7 │ N1    naming  nonword   bounge    0.662  -0.41249
##      8 │ N1    naming  nonword   psems     0.713  -0.338274
##    ⋮   │  ⋮      ⋮        ⋮        ⋮        ⋮         ⋮
##  12954 │ L28   lexdec  word      isle      1.515   0.415415
##  12955 │ L28   lexdec  word      reins     1.844   0.611937
##  12956 │ L28   lexdec  word      idiom     1.688   0.523544
##  12957 │ L28   lexdec  word      lawns     1.563   0.446607
##  12958 │ L28   lexdec  word      spiced    1.641   0.495306
##  12959 │ L28   lexdec  word      weighs    1.797   0.586119
##  12960 │ L28   lexdec  word      darted    1.531   0.425921
##                                            12945 rows omitted

# Fitting the Models

Before we fit our model, we first have to define a dictionary of contrasts. For consistency with the R and Python walkthroughs, we’ll use dummy coding throughout. As noted in the StatsModels.jl documentation on contrasts, while “the default in R and SPSS is to use the last level as the base. Here we use the first level as the base.” For further consistency with the results from the model fitting in the previous posts, we’ll manually set the base as the last level in each factor.

Typically, for factorial designs where ANOVA-like interpretations are desired, EffectsCoding() [AKA sum coding], or deviation coding should be used to ensure you obtain main effects and their interactions. This relies on using the EffectsCoding() function from StatsModels.

const contrasts = Dict(
:task => EffectsCoding(base = "naming", levels = ["naming", "lexdec"]),
:stimulus => EffectsCoding(base = "word", levels = ["word", "nonword"]),
:id => EffectsCoding(),
:item => EffectsCoding()
);

However, for consistency with the previous posts we’ll use dummy coding like so.

const contrasts = Dict(
:task => DummyCoding(base = "naming", levels = ["naming", "lexdec"]),
:stimulus => DummyCoding(base = "word", levels = ["word", "nonword"]),
:id => MixedModels.Grouping(),
:item => MixedModels.Grouping()
);

We can then fit our model in MixedModels. This has fixed effects of task, stimulus, and their interaction along with random intercepts and slopes of stimulus by participant (id) and random intercepts and slopes of task by item. For R users, the fitting process is very similar to that of lme4 but with the minor difference of needing to formally define the formula using the @formula() notation rather than as a bare formula and having to wrap it all up in a fit() function to actually fit the model you’ve defined.

mod = fit(
MixedModel,
@formula(
log_rt ~ 1 + task * stimulus +
(1 + stimulus | id) +
(1 + task | item)
),
dat,
contrasts = contrasts
)
## Linear mixed model fit by maximum likelihood
##  log_rt ~ 1 + task + stimulus + task & stimulus + (1 + stimulus | id) + (1 + task | item)
##    logLik   -2 logLik     AIC       AICc        BIC
##  -3438.9542  6877.9084  6899.9084  6899.9287  6982.0742
##
## Variance components:
##                Column       Variance  Std.Dev.   Corr.
## item     (Intercept)        0.0085461 0.0924449
##          task: lexdec       0.0267998 0.1637064 -0.78
## id       (Intercept)        0.0343237 0.1852666
##          stimulus: nonword  0.0075800 0.0870631 -0.02
## Residual                    0.0905882 0.3009787
##  Number of obs: 12960; levels of grouping factors: 600, 45
##
##   Fixed-effects parameters:
## ────────────────────────────────────────────────────────────────────────
##                                       Coef.  Std. Error      z  Pr(>|z|)
## ────────────────────────────────────────────────────────────────────────
## (Intercept)                       -0.352442   0.0421441  -8.36    <1e-16
## task: lexdec                       0.343417   0.0568873   6.04    <1e-08
## stimulus: nonword                  0.312304   0.0223443  13.98    <1e-43
## task: lexdec & stimulus: nonword  -0.255031   0.0312596  -8.16    <1e-15
## ────────────────────────────────────────────────────────────────────────

How does the model look in R?

m <- lme4::lmer(
log_rt ~ 1 + task * stimulus +
(1 + stimulus | id) +
(1 + task | item),
data = fhch
)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: log_rt ~ 1 + task * stimulus + (1 + stimulus | id) + (1 + task |
##     item)
##    Data: fhch
##
## REML criterion at convergence: 6899
##
## Scaled residuals:
##    Min     1Q Median     3Q    Max
## -6.087 -0.593 -0.105  0.471  5.573
##
## Random effects:
##  Groups   Name            Variance Std.Dev. Corr
##  item     (Intercept)     0.00855  0.0925
##           tasklexdec      0.02683  0.1638   -0.78
##  id       (Intercept)     0.03592  0.1895
##           stimulusnonword 0.00793  0.0891   -0.02
##  Residual                 0.09059  0.3010
## Number of obs: 12960, groups:  item, 600; id, 45
##
## Fixed effects:
##                            Estimate Std. Error t value
## (Intercept)                 -0.3524     0.0431   -8.18
## tasklexdec                   0.3434     0.0581    5.91
## stimulusnonword              0.3123     0.0227   13.73
## tasklexdec:stimulusnonword  -0.2550     0.0318   -8.03
##
## Correlation of Fixed Effects:
##             (Intr) tsklxd stmlsn
## tasklexdec  -0.745
## stimlsnnwrd -0.078  0.066
## tsklxdc:stm  0.063 -0.095 -0.745

Those results look pretty similar with some minor differences in the estimates and standard errors likely due to different algorithms being used for the model fitting process.

# Estimating Marginal Means

First, we need to set up a grid allowing us to see the unique combinations for the levels of each factor. We’ll to this by extracting the unique values in each column of our data set and then combining them by taking their product. This is then defined as a data frame we’ve named grid. Note here that we’ve had to define an empty column named y which we use as a placeholder for the outcome. This is just assigned with lots of zeroes. The reason for doing so is that we’re next going to define a new model formula we’ll use to define a model matrix that we can use to estimate the marginal means and their standard errors. Every formula in Julia needs an outcome, so we must insert this y column to allow for these downstream calculations.

begin
task = unique(dat[:, :task])
stimulus = unique(dat[:, :stimulus])
grid = DataFrame(Iterators.product(task, stimulus))
rename!(grid,[:task,:stimulus])
insertcols!(grid, 3, :y => zeros(4))
end
## 4×3 DataFrame
##  Row │ task    stimulus  y
##      │ String  String    Float64
## ─────┼───────────────────────────
##    1 │ naming  word          0.0
##    2 │ lexdec  word          0.0
##    3 │ naming  nonword       0.0
##    4 │ lexdec  nonword       0.0

Next, we’ll define a new formula. In this case since we want the estimated marginal means for the fixed effects we only need to define the fixed effects as the predictors. We can use y here as our placeholder for an outcome.

f = @formula(y ~ 1 + task * stimulus)
## FormulaTerm
## Response:
##   y(unknown)
## Predictors:
##   1
##   task(unknown)
##   stimulus(unknown)
##   task(unknown) & stimulus(unknown)

Next, we’ll define a model frame using the ModelFrame() function. We pass to this function our new formula (of just fixed effects), the grid of unique values, and the contrasts we used when fitting the model. This last argument is important as the deign matrix will be different if we use a different contrast coding system.

mf = ModelFrame(f, grid, contrasts = contrasts)
## ModelFrame{NamedTuple{(:y, :task, :stimulus), Tuple{Vector{Float64}, Vector{String}, Vector{String}}}, StatisticalModel}(y ~ 1 + task + stimulus + task & stimulus, StatsModels.Schema with 3 entries:
##   stimulus => stimulus
##   task => task
##   y => y
## , (y = [0.0, 0.0, 0.0, 0.0], task = ["naming", "lexdec", "naming", "lexdec"], stimulus = ["word", "word", "nonword", "nonword"]), StatisticalModel)

Now that we have our model frame we can finally define the model matrix by passing this input to the ModelMatrix() function.

mm = ModelMatrix(mf)
## ModelMatrix{Matrix{Float64}}([1.0 0.0 0.0 0.0; 1.0 1.0 0.0 0.0; 1.0 0.0 1.0 0.0; 1.0 1.0 1.0 1.0], [1, 2, 3, 4])

Now we can start with the process of calculating our estimated marginal means. To do this, we need the fixed effect parameter estimates, or betas.

betas = fixef(mod)
## 4-element Vector{Float64}:
##  -0.352441738457094
##   0.34341675691748685
##   0.312304497311043
##  -0.25503093565175144

Then using some matrix multiplication from our grid and fixed effect parameter estimates we can calculate our estimated marginal means. Note here that we use mm.m to access the underlying matrix of the model matrix object mm. Matrix multiplication in Julia is simple, requiring just the * operator. While we’re at it we can drop the y placeholder column for the outcome as it’s no longer needed.

grid[:, :emmean] = mm.m * betas
## 4-element Vector{Float64}:
##  -0.352441738457094
##  -0.009024981539607124
##  -0.040137241146050995
##   0.048248580119684414
grid = select!(grid, Not([:y]))
## 4×3 DataFrame
##  Row │ task    stimulus  emmean
##      │ String  String    Float64
## ─────┼───────────────────────────────
##    1 │ naming  word      -0.352442
##    2 │ lexdec  word      -0.00902498
##    3 │ naming  nonword   -0.0401372
##    4 │ lexdec  nonword    0.0482486

Then we will generate the standard errors by pulling out the variance covariance matrix.

v = vcov(mod)
## 4×4 Matrix{Float64}:
##   0.00177612  -0.00178675   -7.51705e-5    8.57959e-5
##  -0.00178675   0.00323617    8.57959e-5   -0.000174455
##  -7.51705e-5   8.57959e-5    0.000499266  -0.0005205
##   8.57959e-5  -0.000174455  -0.0005205     0.000977165

We will use this to calculate the standard errors by getting the square root of the diagnonal of the model matrix multiplied by the variance covariance matrix multiplied by the transpose of the model matrix and taking its diagonal.

grid[:, :SE] = sqrt(Diagonal(mm.m * v * transpose(mm.m))).diag
## 4-element Vector{Float64}:
##  0.04214405136906626
##  0.03793142983653827
##  0.04609822326134779
##  0.04145067163863225

Let’s see how our new grid object has changed.

grid
## 4×4 DataFrame
##  Row │ task    stimulus  emmean       SE
##      │ String  String    Float64      Float64
## ─────┼──────────────────────────────────────────
##    1 │ naming  word      -0.352442    0.0421441
##    2 │ lexdec  word      -0.00902498  0.0379314
##    3 │ naming  nonword   -0.0401372   0.0460982
##    4 │ lexdec  nonword    0.0482486   0.0414507

How does this compare to emmeans in R?

emm <- emmeans(m, ~task*stimulus, lmer.df = "a")
emm
##  task   stimulus emmean    SE  df asymp.LCL asymp.UCL
##  naming word      -0.35 0.043 Inf     -0.44    -0.268
##  lexdec word      -0.01 0.039 Inf     -0.08     0.067
##  naming nonword   -0.04 0.047 Inf     -0.13     0.052
##  lexdec nonword    0.05 0.042 Inf     -0.03     0.131
##
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95

That’s pretty close! Minor deviations here are likely due to the different algorithms used in the underlying model.

# Pairwise tests between marginal means

## Making pairwise mean differences

We’ll skip right to our best, automated method for computing the pairwise tests between the estimated marginal means for each combination of our levels that we used in the previous posts.

We’ll first get the names for the unique combinations for each contrast from our grid. We’ll then join the values in each column together so we have the pairs of conditions.

combo_names = [join(join(i, "_")) for i=zip(grid[:, :task], grid[:, :stimulus])]
## 4-element Vector{String}:
##  "naming_word"
##  "lexdec_word"
##  "naming_nonword"
##  "lexdec_nonword"

Then we’ll list the get the unique combinations for each pair of conditions to get each unique pairwise contrast.

begin
contrast_names = collect(combinations(combo_names, 2))
contrast_names_final = [join(join(i, " - ")) for i = contrast_names]
end
## 6-element Vector{String}:
##  "naming_word - lexdec_word"
##  "naming_word - naming_nonword"
##  "naming_word - lexdec_nonword"
##  "lexdec_word - naming_nonword"
##  "lexdec_word - lexdec_nonword"
##  "naming_nonword - lexdec_nonword"

Once we have these unique pairwise contrast names we’ll get their indices so we can use these values to get the estimated marginal means of the pairwise contrasts. This just requires us to get every unique combination of the numbers 1 through to 4 (the maximum row length in our estimated marginal means given Julia is 1-indexed). We create a vector that holds vectors of two values to represent the pairwise contrasts.

column_combinations = collect(combinations(1:length(combo_names), 2))
## 6-element Vector{Vector{Int64}}:
##  [1, 2]
##  [1, 3]
##  [1, 4]
##  [2, 3]
##  [2, 4]
##  [3, 4]

We can then put these into a data frame with some placeholder names for the pairs of contrasts (i1 and i2). We now know which rows in our contrast matrix we have to subtract from one another.

begin
combo_cols = DataFrame(i1 = [], i2 = [])
for i in 1:length(column_combinations)
push!(combo_cols.i1, column_combinations[i][1])
push!(combo_cols.i2, column_combinations[i][2])
end
end

print(combo_cols)
## 6×2 DataFrame
##  Row │ i1   i2
##      │ Any  Any
## ─────┼──────────
##    1 │ 1    2
##    2 │ 1    3
##    3 │ 1    4
##    4 │ 2    3
##    5 │ 2    4
##    6 │ 3    4

To figure out which terms need to be switched on and off in our parameter estimates we can perform vectorised subtraction on the indices. We transpose this to turn the values back into a matrix which we can use for matrix multiplication with our parameter estimates.

model_pairwise_contrasts = Matrix(transpose(
mm.m[combo_cols[:, :i1], :] - mm.m[combo_cols[:, :i2], :]
))
## 4×6 Matrix{Float64}:
##   0.0   0.0   0.0   0.0   0.0   0.0
##  -1.0   0.0  -1.0   1.0   0.0  -1.0
##   0.0  -1.0  -1.0  -1.0  -1.0   0.0
##   0.0   0.0  -1.0   0.0  -1.0  -1.0

To generate the estimated marginal means for each pairwise contrast we will multiply each column of the individual pairwise contrasts from the matrix above by the betas.

model_pairwise_contrasts .* betas
## 4×6 Matrix{Float64}:
##  -0.0       -0.0       -0.0       -0.0       -0.0       -0.0
##  -0.343417   0.0       -0.343417   0.343417   0.0       -0.343417
##   0.0       -0.312304  -0.312304  -0.312304  -0.312304   0.0
##  -0.0       -0.0        0.255031  -0.0        0.255031   0.255031

Notice that for some contrasts this just means pulling out a single fixed effect estimate, while for others there are multiple fixed effects “switched on”. So our next step is to sum these together to get a single score for each contrast. We will put this together as a single vector.

pairwise_means = vec(sum(model_pairwise_contrasts .* betas, dims = 1))
## 6-element Vector{Float64}:
##  -0.34341675691748685
##  -0.312304497311043
##  -0.4006903185767784
##   0.03111225960644387
##  -0.05727356165929154
##  -0.08838582126573541

Now we’re ready to create a data frame of pairwise contrast names and estimated marginal means for the contrasts.

pairwise_tests = DataFrame(
:contrast => contrast_names_final,
:est => pairwise_means
)
## 6×2 DataFrame
##  Row │ contrast                         est
##      │ String                           Float64
## ─────┼─────────────────────────────────────────────
##    1 │ naming_word - lexdec_word        -0.343417
##    2 │ naming_word - naming_nonword     -0.312304
##    3 │ naming_word - lexdec_nonword     -0.40069
##    4 │ lexdec_word - naming_nonword      0.0311123
##    5 │ lexdec_word - lexdec_nonword     -0.0572736
##    6 │ naming_nonword - lexdec_nonword  -0.0883858

For getting the standard errors we’ll again use our best method for doing so in the previous posts. Below we’ll append the standard errors to our table under the heading “SE”. We do this by stepping through each contrast and calculating the standard errors for that contrast using matrix multiplication.

begin
pairwise_SE = Float64[]
for i in 1:nrow(pairwise_tests)
grad_g = model_pairwise_contrasts[:, i]
push!(pairwise_SE, sqrt(sum(grad_g .* v .*transpose(grad_g))))
end
pairwise_tests.SE = pairwise_SE
end;

show(pairwise_tests, allcols = true)
## 6×3 DataFrame
##  Row │ contrast                         est         SE
##      │ String                           Float64     Float64
## ─────┼────────────────────────────────────────────────────────
##    1 │ naming_word - lexdec_word        -0.343417   0.0568873
##    2 │ naming_word - naming_nonword     -0.312304   0.0223443
##    3 │ naming_word - lexdec_nonword     -0.40069    0.0591124
##    4 │ lexdec_word - naming_nonword      0.0311123  0.0596979
##    5 │ lexdec_word - lexdec_nonword     -0.0572736  0.020867
##    6 │ naming_nonword - lexdec_nonword  -0.0883858  0.0621645

Now let’s take a look at our updated data frame.

Next we can start calculating our test statistics and 95% confidence intervals. We make the z-scores by dividing each mean score by the standard error. Using these scores, we can calculate the p-value using the normal approximation, which for a two-tailed test is 2 $$\times$$ the probability of our z-scores on the negative scale. We then calculate the lower and upper confidence interval limits using the normal approximation. This involves taking our parameter estimate ± the standard error multiplied by the z-value that corresponds to the upper/lower tails for the normal distribution for the 2.5th and 97.5th percentiles.

begin
pairwise_tests.z = pairwise_tests.est./pairwise_tests.SE
pairwise_tests.p = 2*cdf.(Normal(), -abs.(pairwise_tests.z))
pairwise_tests.lci = pairwise_tests.est + pairwise_tests.SE * quantile(Normal(0.0, 1.0), 1-(1+0.95)/2)
pairwise_tests.uci = pairwise_tests.est - pairwise_tests.SE * quantile(Normal(0.0, 1.0), 1-(1+0.95)/2)
end
## 6-element Vector{Float64}:
##  -0.23191969030494863
##  -0.2685105454245266
##  -0.28483208704136254
##   0.1481179999457679
##  -0.0163750169166273
##   0.033454298695036314

We can then print out our final contrasts with parameter estimates, standard errors, confidence intervals, and test statistics.

select!(pairwise_tests, [:contrast, :est, :SE, :lci, :uci, :z, :p]);
show(pairwise_tests, allcols = true)
## 6×7 DataFrame
##  Row │ contrast                         est         SE         lci         uci         z           p
##      │ String                           Float64     Float64    Float64     Float64     Float64     Float64
## ─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────
##    1 │ naming_word - lexdec_word        -0.343417   0.0568873  -0.454914   -0.23192     -6.03679   1.57209e-9
##    2 │ naming_word - naming_nonword     -0.312304   0.0223443  -0.356098   -0.268511   -13.9769    2.1555e-44
##    3 │ naming_word - lexdec_nonword     -0.40069    0.0591124  -0.516549   -0.284832    -6.77844   1.21477e-11
##    4 │ lexdec_word - naming_nonword      0.0311123  0.0596979  -0.0858935   0.148118     0.521162  0.602254
##    5 │ lexdec_word - lexdec_nonword     -0.0572736  0.020867   -0.0981721  -0.016375    -2.7447    0.00605668
##    6 │ naming_nonword - lexdec_nonword  -0.0883858  0.0621645  -0.210226    0.0334543   -1.42181   0.155083

We can even round the numeric values in the columns by defining a function to do this for us.

function round_df(df::DataFrame, decimals::Int)
for col in eachcol(df)
if eltype(col) <: Number
col .= round.(col, digits=decimals)
end
end
return df
end
## round_df (generic function with 1 method)

df_rounded = round_df(pairwise_tests, 2);
show(df_rounded, allcols = true)
## 6×7 DataFrame
##  Row │ contrast                         est      SE       lci      uci      z        p
##      │ String                           Float64  Float64  Float64  Float64  Float64  Float64
## ─────┼───────────────────────────────────────────────────────────────────────────────────────
##    1 │ naming_word - lexdec_word          -0.34     0.06    -0.45    -0.23    -6.04     0.0
##    2 │ naming_word - naming_nonword       -0.31     0.02    -0.36    -0.27   -13.98     0.0
##    3 │ naming_word - lexdec_nonword       -0.4      0.06    -0.52    -0.28    -6.78     0.0
##    4 │ lexdec_word - naming_nonword        0.03     0.06    -0.09     0.15     0.52     0.6
##    5 │ lexdec_word - lexdec_nonword       -0.06     0.02    -0.1     -0.02    -2.74     0.01
##    6 │ naming_nonword - lexdec_nonword    -0.09     0.06    -0.21     0.03    -1.42     0.16

How does this match up with emmeans?

pairs(emm, adjust = "none")
##  contrast                        estimate    SE  df z.ratio p.value
##  naming word - lexdec word          -0.34 0.058 Inf  -5.900  <.0001
##  naming word - naming nonword       -0.31 0.023 Inf -13.700  <.0001
##  naming word - lexdec nonword       -0.40 0.060 Inf  -6.600  <.0001
##  lexdec word - naming nonword        0.03 0.061 Inf   0.500  0.6100
##  lexdec word - lexdec nonword       -0.06 0.021 Inf  -2.700  0.0100
##  naming nonword - lexdec nonword    -0.09 0.064 Inf  -1.400  0.1600
##
## Degrees-of-freedom method: asymptotic

That looks pretty good!

But this requires the user to make a grid of unique values in the variables they want to collapse across, respecify the formula for the fixed effects, use these to generate a model frame with the original model contrasts, and then define the model matrix. This is a lot of leg work and requires a lot of new inputs if we want to make a function.

As noted earlier, at the time when I initially wrote this code there wasn’t an alternative to doing all this nicely in a package in Julia. However, in the time between getting started on a package and this blog post, the Effects.jl package was released. This requires the user to simply define a design matrix and pass the model object to calculate the estimated marginal means. It can also allow you to calculate pairwise tests on these estimated marginal means. That said, it still lacks some of simplicity of the interface and functionality found in emmeans. Some time soon I’ll release a package that tries to recreate the interface emmeans (if not the full functionality) of emmeans by asking users to simply pass the model object and a formula, doing the grid construction for you, to calculate estimated marginal means and pairwise tests.