Reading notes on Stroup, Walter W., “Rethinking the Analysis of Non-Normal Data in Plant and Soil Science”, Agronomy Journal 107, 2 (2015), p. 811.

Some history: Fisher and Mackenzie (1923) published the first ANOVA results. Nelder and Wedderburn (1972) introduced generalized linear models, a major departure in approaching non-normal data. Breslow and Clayton (1993) and Wolfinger and O’Connell (1993) integrated mixed model and generalized linear model theory and methods. The following two decades saw intense development of GLMM theory and methods.

ANOVA rests on *three* assumptions: independent observations (vs. correlated observations), normally distributed data (vs. non-normal data), and homogeneous variance (vs. heterogeneous variance). However, non-normal data are common, e.g. counts (*Poisson* or *negative binomial*), time to flowering (*exponential* or *gamma*), continuous proportions such as leaf area affected (*beta*), and the number of quadrats observed out of *n* quadrats (*binomial*). For all non-normal distributions, the variance depends on the mean; thus, if data are non-normal, chances are their variance is not homogeneous either. Traditionally, the Central Limit Theorem assures that the sampling distribution of means will be approximately normal if the sample size is large enough, and standard *variance-stabilizing* transformations are used to deal with heterogeneous variances, e.g. `log(count + 1)`, `sqrt(small_count + 3/8)`, `count^(2/3)`, and `asin(sqrt(proportion))`. GLMMs extended linear model theory to accommodate data that may be non-normal, have heterogeneous variance, and be correlated. From the GLMM point of view, ANOVA is antiquated or even obsolete.
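As a quick illustration (not from the paper; the helper names below are mine), the textbook variance-stabilizing transformations just listed can be applied directly:

```python
import math

# Textbook variance-stabilizing transformations listed above.
# Function names are illustrative, not from Stroup (2015).

def log_count(count):
    """log(count + 1), for Poisson-like counts."""
    return math.log(count + 1)

def sqrt_count(count):
    """sqrt(count + 3/8), often recommended for small counts."""
    return math.sqrt(count + 3 / 8)

def two_thirds_count(count):
    """count^(2/3), an alternative power transform for counts."""
    return count ** (2 / 3)

def arcsine_proportion(p):
    """asin(sqrt(p)), the classic transform for proportions in [0, 1]."""
    return math.asin(math.sqrt(p))

print([round(log_count(c), 3) for c in (0, 1, 4, 12)])
print([round(sqrt_count(c), 3) for c in (0, 1, 4, 12)])
print([round(arcsine_proportion(p), 3) for p in (0.05, 0.5, 0.95)])
```

Each transform spreads out the small values and compresses the large ones, which is what makes the variance roughly constant across the range.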

Stroup (2015) showed that for count data, ANOVA on untransformed and log-/sqrt-transformed data and the GLMM all control Type I error adequately, but GLMMs have more power to detect treatment differences. For discrete proportion data, untransformed ANOVA yields estimates of the marginal `\(p_i\)` but not the correct standard errors; the GLMM yields estimates of the conditional `\(p_i\)` and correct standard errors; the arcsine-transformed ANOVA provides estimates of neither.

Take a binomial example: the *i*th treatment in the *j*th block, with `\(N_{ij}\)` yes-no observations and probability `\(p_{ij}\)` of a yes response on any given *ij*th observation unit. Three distributions are relevant to the analysis of these experimental data.

- The distribution of block effects (random effects). Blocking is a design strategy to ensure that units within blocks are as similar as possible. Variability among blocks is expected, and we assume the blocks are representative of the blocks we could have used. Thus the variation among blocks is assumed to follow a normal distribution: `\(b_j\sim NI(0,\sigma_{B}^{2})\)` (normal and independent).
- The distribution at the unit level: the observations in the *ij*th unit follow `\(Binomial(N_{ij},p_{ij})\)`. This distribution is conditional on the random effects, `\(y_{ij}|b_j\sim Binomial(N_{ij},p_{ij})\)`: the distribution of the observations, conditional on being in the *j*th block, is binomial (with `\(N_{ij}\)` and `\(p_{ij}\)`).
- The distribution we actually observe: the marginal distribution. When we say we have binomial data, we are referring to the distribution of the observations conditional on the *ij*th unit. The distribution of the observed data, the marginal distribution, is most likely *not* binomial.
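Putting the three pieces together, the hierarchical (conditional) model can be written compactly. The logit linear predictor in the last equation is the standard form for this randomized-block binomial GLMM, spelled out here for clarity (`\(\eta\)` is the intercept and `\(\tau_i\)` the *i*th treatment effect; the notes themselves do not write it out):

```latex
b_j \sim NI(0,\, \sigma_B^2), \qquad
y_{ij} \mid b_j \sim \mathrm{Binomial}(N_{ij},\, p_{ij}), \qquad
\mathrm{logit}(p_{ij}) = \eta + \tau_i + b_j
```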

*We cannot observe the first two distributions directly. The only distribution we observe is the third one.* This is not an issue if the first two are normal, because the third will then also be normal. For all other non-normal data, the marginal distribution of the observed data is quite different, and our usual intuitions can betray and mislead us. The **fundamental** problem of analyzing non-normal data is that what we want to estimate or test (in this example, the treatment effects on the `\(p_{ij}\)` of binomial data) involves parameters of distributions that we *cannot* directly observe. In other words, the information we want is camouflaged in a complex observed marginal distribution. GLMMs can extract the information we want from the observations we have; ANOVA and regression cannot.

The GLMM conditional estimate asks: “if I take a typical member of the population, i.e. one whose block effect is `\(b_j=0\)`, what is the estimated binomial probability?” (think of a median). The marginal estimate asks: “if I average across all the members of the population, what is the mean proportion?” (think of a mean). Which one to use depends on your question.
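The gap between the two targets is easy to see in a small simulation. The values of the linear predictor and block standard deviation below are hypothetical, chosen only for illustration: averaging the inverse-logit over simulated block effects gives a marginal proportion pulled toward 0.5 relative to the conditional probability at `\(b_j=0\)`.

```python
import math
import random

random.seed(1)

def inv_logit(x):
    """Map a linear-predictor value to a probability."""
    return 1 / (1 + math.exp(-x))

eta = 1.0       # linear predictor for one treatment (hypothetical value)
sigma_b = 1.5   # block standard deviation (hypothetical value)

# Conditional estimate: probability for a "typical" block, b_j = 0.
p_conditional = inv_logit(eta)

# Marginal estimate: average probability over many simulated block effects.
draws = [inv_logit(eta + random.gauss(0, sigma_b)) for _ in range(200_000)]
p_marginal = sum(draws) / len(draws)

print(round(p_conditional, 3))  # ~0.731
print(round(p_marginal, 3))     # smaller: pulled toward 0.5 by block variance
```

The larger the block variance, the farther the marginal proportion drifts from the conditional one, which is why the two estimands must not be conflated.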

Stroup (2015) argues that for binomial data, ANOVA with or without transformation should be considered unacceptable for publication. If the *marginal mean* best addresses the research objectives, the correct approach requires an alternative formulation of the GLMM, namely generalized estimating equations (GEEs; Zeger et al., 1988). A GEE replaces the random effects in the linear predictor with a working variance and correlation, and replaces the distribution with a quasi-likelihood. Assuming equal N for all experimental units, the beta GLMM is the preferred method when the marginal mean is the appropriate target; for unequal N, use the GEE.

In sum, Stroup’s (2015) main take-home message: for non-normal data, ANOVA, with or without transformation, does not work; the loss of accuracy and power is too great. GLMMs and, in some cases, GEEs are the methods of choice.