As a mathematical modeller in evolutionary biology, my seminar bingo card has four prime boxes. Watching a talk about evolution, I count down the minutes to the first appearance of Dobzhansky’s “nothing in biology” quote (or some variant thereof) or a picture of Darwin’s “I think” sketch. For mathematical modelling, it’ll be either Albert Einstein or George Box:

“All models are wrong but some are useful” – George Box

“Everything should be made as simple as possible, but not simpler” – probably not Albert Einstein

Of course, such quotes are popular for good reason, and I’m not criticising those who use them to good effect, but all the same it can be fun to try to find a new way of presenting familiar material. That’s why in spring 2015 I came up with and tweeted a visual summary of the latter two aphorisms, which I named the Box-Einstein surface of mathematical models:

The grey region in the plot ensures that all possible models have some degree of “wrongness”, but the contours in the remaining region tell us that some models are useful all the same. To find the most useful description of a particular phenomenon, we must reduce complexity without overly increasing wrongness.

A key thing to understand about this diagram is that although the boundary of the grey region is invariant, the surface is changeable. If our empirical knowledge of the system becomes richer, or if we change the scope of our enquiry, the most useful model may be more or less complex than before.

Einstein’s quote can be seen as simply paraphrasing Occam’s razor, but I think it has additional meaning with regard to what Artem Kaznatcheev calls heuristic and abstract mathematical models, the kinds generally used in biology. In statistics, a simple model has few degrees of freedom, which is desirable because it reduces the risk of overfitting. However, statisticians should also beware of what JP Simmons and colleagues termed “researcher degrees of freedom”:

“In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both?

“It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.” The problem, of course, is that the likelihood of at least one (of many) analyses producing a falsely positive finding at the 5% level is necessarily greater than 5%.”
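To put rough numbers on that last point, here is a minimal Python sketch. It assumes the analyses are statistically independent, which is my simplification rather than anything Simmons and colleagues claim, and the function name `familywise_false_positive_rate` is just an illustrative label. Under that assumption, the chance of at least one false positive among k analyses at level α is 1 − (1 − α)^k.

```python
def familywise_false_positive_rate(num_analyses: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_analyses` tests is a false positive.

    Assumes every null hypothesis is true and the tests are independent
    (an illustrative simplification, not a claim from the quoted paper).
    """
    if num_analyses < 0:
        raise ValueError("num_analyses must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    # P(no false positive in any test) = (1 - alpha) ** num_analyses
    return 1.0 - (1.0 - alpha) ** num_analyses


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = familywise_false_positive_rate(k)
        print(f"{k:2d} analyses: {rate:.1%} chance of at least one false positive")
```

With a single analysis the rate is the nominal 5%, but it climbs to roughly 23% for five analyses, 40% for ten and 64% for twenty. Real analytic alternatives are usually correlated, so the true inflation will tend to be smaller than this, but it stays above 5%.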

Likewise, when a researcher makes a mathematical model of a dynamical system – be it a set of differential equations or a stochastic agent-based model – he or she makes numerous decisions, usually with more or less full knowledge of the empirical data against which the model will be judged.

But there’s an important difference between the process of collecting data and that of creating a mathematical model. Ideally, the experimentalist can minimise researcher degrees of freedom by following a suitable experimental design and running controls that enable him or her to test a hypothesis against a null according to a predetermined statistical model. For most mathematical models there is no such template, and a process of trial and improvement is unavoidable, forgivable, and even desirable (inasmuch as it strengthens understanding of *why* the model works). The role of the mathematical modeller lies somewhere between those of the experimentalist and the pure mathematician. By making our models as simple as possible, we shift ourselves further toward the latter role, and our experimentation becomes less about exploiting our freedom and more about honing our argument.

For further reading, check out Artem Kaznatcheev’s insightful post about what “wrong” might mean, and why Box’s quote doesn’t necessarily apply to all types of model.