The goal of the analysis was to determine, for each clade, which model of character change best described the evolution of species-specific body size. To do this we calculated the goodness of fit of each model to the body sizes of the taxa at the tips of the tree. Goodness of fit was measured as the logarithm of the likelihood of each model given the data (body sizes) under the Brownian motion process of character change. Better fits between the data and a model for body-size evolution have higher log(likelihood) scores.
The fits of alternative models were compared using the differences in their corresponding log(likelihoods). The first three models tested have the same number of estimated parameters. For the gradual model, there is only one rate. For the speciational and pitchfork models, we assume that the traits evolve at different rates on different branches (e.g., at rate = 0 for the zero-length branches of the pitchfork model and at different rates for the unit-length branches). However, because we set the branch lengths a priori (ignoring the actual inferred time between nodes), only one rate parameter is estimated. The other rates are proportional to it: we could estimate the actual rate for any branch on the tree by dividing the single estimated rate by the branch length taken from the gradual tree. This rate is uninteresting, however: for the speciational model, it would be the average of the fast rate at speciation and zero for the duration of that lineage’s existence. Given that these three models are equally parameterized, any two are deemed significantly different when the difference in their log(likelihoods) is at least 2.0, corresponding to approximately a sevenfold difference in their likelihoods (Edwards, 1992: 180ff). Risch (1992) would prefer a difference of 3.0 before discriminating among models (corresponding to a 20-fold difference in likelihoods).
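The correspondence between a difference in log(likelihoods) and a fold difference in likelihoods can be checked with a few lines of arithmetic. The sketch below assumes natural logarithms, which is consistent with the sevenfold and 20-fold figures quoted above; the function names are illustrative and are not part of CONTML.

```python
import math


def likelihood_ratio(delta_loglik: float) -> float:
    """Convert a difference in natural log(likelihoods) into a likelihood ratio."""
    return math.exp(delta_loglik)


def differ_significantly(loglik_a: float, loglik_b: float, threshold: float = 2.0) -> bool:
    """Apply the support threshold to two equally parameterized models.

    threshold=2.0 follows the criterion attributed to Edwards (1992);
    pass threshold=3.0 for the stricter criterion attributed to Risch (1992).
    """
    return abs(loglik_a - loglik_b) >= threshold


print(round(likelihood_ratio(2.0), 2))  # 7.39, i.e. roughly sevenfold
print(round(likelihood_ratio(3.0), 2))  # 20.09, i.e. roughly 20-fold
```

This comparison is only appropriate when the two models have the same number of estimated parameters, as is the case for the gradual, speciational and pitchfork models.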
The fourth model is qualitatively different. The species’ body sizes are used to optimize the branch lengths, which are unconstrained by a priori hypotheses, and so the model has many more parameters than the first three. The number of extra parameters is 2N − 4 for a tree having N species (the number of branches in a fully bifurcating unrooted tree, 2N − 3, minus 1). This allows us to judge whether the free model is significantly better than any of the alternatives by comparing twice the improvement in the log(likelihood) to a chi-square distribution with 2N − 4 degrees of freedom (see Goldman, 1993).
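The test of the free model against a constrained alternative can be sketched as follows. Because 2N − 4 is always even, the upper-tail probability of the chi-square distribution has a closed form, so no statistical library is needed. This is a minimal illustration under that observation, not the procedure implemented in CONTML; all function names are hypothetical.

```python
import math


def extra_parameters(n_species: int) -> int:
    """Extra parameters of the free model for N species: 2N - 4."""
    return 2 * n_species - 4


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with even df.

    For df = 2k: Q(x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def free_model_p_value(loglik_free: float, loglik_constrained: float, n_species: int) -> float:
    """P-value for twice the log(likelihood) improvement of the free model."""
    statistic = 2.0 * (loglik_free - loglik_constrained)
    return chi2_sf_even_df(statistic, extra_parameters(n_species))
```

For a clade of 10 species, for example, the reference distribution would have 16 degrees of freedom, so a modest improvement in log(likelihood) is unlikely to justify the additional parameters.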
Given that the actual topology and number of tips affect the fit of any given data set to a tree, we cannot compare the values of the same model for different clades; for example, we cannot state that the gradual model for one group of birds fits n times better than it does for some other group (Goldman, 1993). However, individual trees are independent and the log(likelihoods) for a given model may be summed across the full data set of twenty-one trees to yield an overall measure of the goodness of fit of each model. Because the differences between the fits of different models are relevant, but not their absolute value, we scaled the log(likelihoods) so that the worst-fit model for any data set was designated zero.
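The rescaling and summation can be illustrated with a short sketch. It assumes that the rescaling is applied to each tree separately before summing; the log(likelihood) values shown are hypothetical and serve only to demonstrate the arithmetic.

```python
from typing import Dict, List


def rescale_to_worst(logliks: Dict[str, float]) -> Dict[str, float]:
    """Shift one tree's log(likelihoods) so the worst-fit model scores zero."""
    worst = min(logliks.values())
    return {model: value - worst for model, value in logliks.items()}


def sum_across_trees(per_tree: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum the rescaled log(likelihoods) of each model over independent trees."""
    totals: Dict[str, float] = {}
    for logliks in per_tree:
        for model, value in rescale_to_worst(logliks).items():
            totals[model] = totals.get(model, 0.0) + value
    return totals


# Hypothetical log(likelihoods) for two trees (not values from the study).
example = [
    {"gradual": -12.0, "speciational": -10.5, "pitchfork": -14.0},
    {"gradual": -8.0, "speciational": -9.0, "pitchfork": -7.5},
]
totals = sum_across_trees(example)
print(totals)  # {'gradual': 3.0, 'speciational': 3.5, 'pitchfork': 1.5}
```

Within a tree, the rescaling subtracts the same constant from every model, so differences between models are unchanged both for that tree and in the summed totals, provided every model is scored on every tree.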
All analyses were performed on a modified version of the CONTML program of PHYLIP (version 3.0, Felsenstein, 1993). The modification allowed CONTML to return a log(likelihood) for a data set that was independent of the scale of measurement of body size and total tree length. The modification is available from the authors upon request. The Brownian motion model does not require directionality (i.e., rootedness) in the tree, and in order to use CONTML, the rooted trees in Fig. 1 were made unrooted before analysis. This does not bias the results in favour of any specific model.