Contributions to Zoology, 86 (2) – 2017Roberto Guidetti; Sandra J. McInnes; Michele Cesari; Lorena Rebecchi; Omar Rota-Stabelli: Evolutionary scenarios for the origin of an Antarctic tardigrade species based on molecular clock analyses and biogeographic data

To refer to this article use this url: http://contributionstozoology.nl/vol86/nr02/a01

Material and methods

Literature data were used to map the known distribution of Mopsechiniscus species on current and past landmass maps to define their distribution patterns (Richters, 1907, 1920; Du Bois-Reymond Marcus, 1944; Ramazzotti, 1962; Mihelčič, 1967, 1971/72; Grigarick et al., 1983; Binda and Kristensen, 1986; Kristensen, 1987; Rossi and Claps, 1989; Ottesen and Meier, 1990; Rossi et al., 2009; Dastych and Moscal, 1992; Dastych, 1999a, b c, 2000, 2001; Kaczmarek et al., 2014; Guidetti et al., 2014; Roszkowska et al., 2016).

For molecular phylogenetics and dating analyses, we assembled a concatenated dataset of 18S (SSU, 686 bp corresponding to positions 1213-1899 of the Echiniscus canadensis 18S complete sequence, Genbank Acc. No. FJ435714) and 28S (LSU, 968 bp corresponding to positions 1194-2192 of the E. canadensis 28S complete sequence, Genbank Acc. No. FJ435784) rRNA, as they are the only gene fragments available for Mopsechiniscus species. Sequences from three M. franciscae specimens, one Mopsechiniscus granulosus Mihelčič, 1967 specimen, 46 other Heterotardigrada (belonging to 32 recognised species), and three Eutardigrada used as outgroups, were retrieved and downloaded from GenBank (accession numbers are provided in Fig. 2). Sequences of 18S and 28S were aligned individually with MUSCLE 3.8.31 (Edgar, 2004) using a default run followed by two refinement runs (using option –refine), and concatenated using an in house PERL script, then they were checked by visual inspection. This resulted in a dataset (main dataset) of 53 sequences belonging to 36 taxa and 1655 positions, and 51% of missing data. To account for alignment quality and the effect of missing data, we generated a second dataset with the same taxon sampling but customised to exclude various positional gaps, which are present in all Heterotardigrada (insertions in the distantly related Eutardigrada) or poorly represented along the alignment. This dataset (reduced-dataset) comprises 53 sequences belonging to 36 taxa and 824 positions, and 20% of missing data. Analyses were conducted in BEAST v1.8 (Drummond and Rambaut, 2007) using a homogenous GTR model of nucleotide replacement coupled with a gamma distribution with four discrete categories. Best fitting model evaluations were performed taking into account the Akaike information criterion and Bayes information criterion (jModelTest 0.0.1; Posada, 2008). All BEAST analyses were run twice for 10 million generations each, sampling every 1000 generations, and using a starting random tree. Convergence of the most relevant parameters was checked with Tracer v1.6 and a consensus divergence tree was calculated with TreeAnnotator v1.8.2 using a burning-in of 1000 sampled trees.

FIG2

Fig 2. a Molecular clock analyses of the Heterotardigrada. Antarctic and South American Mopsechiniscus species (in bold) diverged approximately 32.1 My ago in a period highly compatible with the origin of the Drake Passage. Nodes are positioned at the mean divergence date, while bars indicate the 95% credibility intervals. The three coloured bars in the split between M. franciscae and M. granulosus indicate the 95% credibility interval from three independent analysis using different sets of evolutionary and palaeobiological priors. Red/grey-dark bars correspond to a clock analysis using the rate deduced from Rota-Stabelli et al. (2013). Green/grey-bar corresponds to a clock analyses using the rate deduced from Rota-Stabelli et al. (2013) and a Log-normal distribution clock analysis (for full tree of this analysis see Fig. 3a). Yellow/grey-light bar corresponds to the analysis using secondary node constrains from Regier et al. (2004) (for full tree of this analysis see Fig. 3b). Values close to branches indicate posterior probability values (> 0.9) of BI analysis. b-e Distribution of the extant Mopsechiniscus species (spots) mapped on the distribution of lands ca. 160 Mya (b), ca. 120 Mya (c), ca. 60 Mya (d), ca. 20 Mya (e). Figures of ancient land distributions (b-e) modified from Lawver et al. (1992). na = not available.

Since time estimates are sensitive to tree and clock priors settings, we performed a model selection using Bayes factors based on the log marginal likelihoods, which were estimated using the smoothed harmonic mean method (Suchard et al., 2001) on 100 bootstrap replicates. Various combinations of priors were tested by running the main dataset with calibration set 1 (see below) in BEAST and varying one tree/clock prior at the time (Table 1). The strict, the relaxed random, the relaxed lognormal, and the relaxed exponential clocks were all tested as clock priors; the coalescent, the Birth Death, the incomplete Birth Death, and the Yule process were tested as tree priors. According to the test, positive values of 2lnBF difference from different runs indicated the better fit of a model combination over a previously ranking model combination. Significance of the Bayes factor was assessed in accordance with Kass and Raftery’s (1995) table.

FIG2

Table 1. Tree and clock prior selection using Calibration set 1. For each of the combinations of priors (Model 1 to 7), the mean estimates in millions of years for four nodes of interest are provided; the heights of the 95% Highest Posterior Density are in parenthesis. In bold the most fitting model using Calibration set 1.

With no fossil or sub-fossil records to constrain the Heterotardigrada, the clock was calibrated using two distinct calibration-sets based on replacement rates and posterior estimates for the split of the Eutardigrada derived from previous studies (Regier et al., 2004; Rota-Stabelli et al., 2013). For calibration-set-1, the clock was calibrated using the replacement rate of 0.001564 mutations per site per million years (mut/site/My), with standard deviation (SD) of 0.001012 mut/site/My, based on the analysis of 18S and 28S rDNA genes from a wide range of ecdysozoans (including the tardigrade clade), together with a permissive root prior of 579 Mya, SD 70 My (allowing sampling 95% of quantiles from 463 to 716 Mya) (Rota-Stabelli et al., 2013). An alternative calibration strategy (calibration-set 2) was employed using, as a root prior, a more restrictive distribution (compared to the first) centred on 659 Mya (SD 20 My; to allow 95% of sampling between 626 and 691 Mya) and by constraining two nodes within the eutardigrade outgroup (Macrobiotidae 144 Mya, SD 15 My; Eutardigrada 453 Mya, SD 12 My); this substantially different approach is based on the results of Regier et al. (2004), who build a posterior time estimate for some tardigrade clades. To check for the effect of the dataset on divergence times, we repeated the molecular clock analysis on the reduced-dataset using both calibration-set 1 and calibration-set 2.