# Dealing with ‘the spectre of "spurious" correlations’: hazards in comparing ratios and other derived variables with a randomization test to determine if a biological interpretation is justified

Williams, M.R., Lamont, B.B. and He, T. (2021). Oikos. Early View.

## Abstract

We note the continuing widespread use of regressions of mathematically dependent (derived or confounded) variables [e.g. comparisons of standardized ratios: X/Y versus Z/Y, or the part versus the whole: X versus (X + Y)] in all disciplines of biology and ecology. Such regressions may lead to ‘spurious’ correlations, because even random numbers would produce similarly statistically significant results. We developed a randomization test to determine the probability of obtaining the observed correlation coefficient by chance alone. A biological interpretation is justified only if the regression remains statistically significant after the results of the randomization test are taken into account (i.e. the random coefficient is subtracted from the observed coefficient). We demonstrate that the often compared expressions, ln[(Y + X)/X] (e.g. relative growth rate) versus ln X (e.g. original mass), are negatively and significantly correlated whatever values of X and Y are used; thus, conclusions from such comparisons that seedlings from smaller seeds grow faster than those from larger seeds are spurious. Derived variables are only likely to be meaningfully correlated if X and Y are correlated from the outset; the researcher can then decide whether the ‘actual’ derived relationship is worth reporting.
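
The idea of a randomization test for derived variables can be illustrated with a minimal Python sketch. It is not the authors' published procedure: the shuffling scheme (permuting Y relative to X), the one-sided p-value, the function names `derived_correlation` and `randomization_test`, and the simulated uniform data are all assumptions made for illustration. The sketch draws X and Y independently, so any correlation between ln[(Y + X)/X] and ln X arises purely from X appearing on both sides.

```python
import numpy as np


def derived_correlation(x, y):
    """Pearson r between ln[(Y + X)/X] and ln X."""
    return np.corrcoef(np.log((y + x) / x), np.log(x))[0, 1]


def randomization_test(x, y, n_perm=2000, seed=0):
    """Compare the observed r with r from shuffled pairings of X and Y.

    Assumed scheme (illustrative only): Y is permuted relative to X, which
    breaks any real X-Y association but keeps the built-in dependence
    created by X appearing in both derived variables.
    """
    rng = np.random.default_rng(seed)
    r_obs = derived_correlation(x, y)
    r_null = np.array(
        [derived_correlation(x, rng.permutation(y)) for _ in range(n_perm)]
    )
    # One-sided p-value: share of shuffled r at least as negative as observed.
    p_value = (np.sum(r_null <= r_obs) + 1) / (n_perm + 1)
    return r_obs, r_null.mean(), p_value


# Simulated data: X and Y are independent random numbers (no real relationship).
data_rng = np.random.default_rng(42)
x = data_rng.uniform(1.0, 100.0, size=200)
y = data_rng.uniform(1.0, 100.0, size=200)

r_obs, r_null_mean, p_value = randomization_test(x, y)
print(f"observed r:             {r_obs:.3f}")
print(f"mean r under shuffling: {r_null_mean:.3f}")
print(f"observed minus random:  {r_obs - r_null_mean:.3f}")
print(f"randomization p-value:  {p_value:.3f}")
```

Under these assumptions, both the observed and the shuffled coefficients are strongly negative, so the observed value is unremarkable relative to chance even though a naive regression would report it as significant.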

