Contact Samantha Price (saprice ‘at’ ucdavis ‘dot’ edu).

**Data & R code**

Download the data and R code: State dependent trait data and scripts

Download R code for simulating BM & OU models: BM&OUsimulations.R

Download powerpoint presentation: StateDependentDiversification2014

**Introduction**

This tutorial will combine various aspects of the phylogenetic comparative methods covered earlier in the week to show you how to test hypotheses of state-dependent diversification in continuous traits using flexible Ornstein-Uhlenbeck (OU) models. This method can be used to answer a wide variety of questions about how changes in ecology, morphology, behaviour or area can influence the evolution of continuous traits. For example, it has been used to answer such questions as “Does changing from terrestrial to arboreal foraging lead to changes in hind limb and tail morphology in pigeons and doves?” (Lapiedra et al. 2013), “Do changes in habitat use lead to changes in body size and shape in monitor lizards?” (Collar et al. 2013) and “Does the loss of heterostyly lead to changes in floral morphology in primroses?” (de Vos et al. 2014).

There are two main ways a change in ecology/morphology/behaviour/area can promote an evolutionary change in continuous traits:

1) Create the opportunity for further diversification through removal of previous constraints or allowing new adaptation.

2) Pull the population towards a new optimal phenotype.

Flexible OU models can distinguish between these two scenarios by estimating 3 different parameters: ‘theta’ (the optimal trait value), ‘alpha’ (the pull towards the optima) and ‘sigma’ (sigma^2 is the Brownian motion rate parameter). A change in trait X for an increment in time t = ‘alpha'[‘theta’-X(t)]*dt* + ‘sigma’*dB*(t). Under this model variance in trait X is determined by a combination of ‘alpha’, ‘sigma’ and time. Note when the pull towards the peak is 0 the model collapses to a Brownian motion model, under this circumstance the variance in trait X is determined by time and ‘sigma’, with the variance proportional to time (this creates phylogenetic patterning as closely related species will have more similar trait values). We can compare the fit of various models that allow the trait of interest to influence the rate, the optima and/or the strength of the pull towards the optima or to have no impact on the continuous trait (single rate or optima across all tips). Under scenario two, we would expect models that fit different optima to the different states to be the best-fit while under scenario one we would expect the variance in X to change between states caused by either changes in ‘alpha’ or ‘sigma’.

**Installation and Getting Started**

This tutorial mainly focuses on methods available in the R package OUwie (Beaulieu et al. 2012) but will also use functions in geiger and phytools.

**library(OUwie)**

**library(phytools)**

**Dataset and Rationale**

We will be testing two hypotheses using a dataset of marsupial body mass (from Jones et al. 2008) and diet (from Price et al. 2012) and the tree comes from Bininda-Emonds et al. 2007 (updated by Fitz et al. 2011). 1) American marsupials exhibit less variable body sizes than Australian mammals and 2) Diet (Herbivory, Carnivory, Omnivory) influences body size evolution – highest optimal mass in herbivores.

For the purposes of this tutorial I am going to assume that you have a folder called Bodega_2014 on your desktop and in it you have stored the dataset folder and tutorial you downloaded from the website, if you have stored it elsewhere/under a different name you will need to change the path.

## 1) Test hypothesis that American marsupials exhibit less variable body sizes than Australian mammals

Read in the tree and the trait data.

**mtr<-read.nexus(“~/Desktop/Bodega_2014/Datasets/marsupialtr1.nex”)**

**marsupialc<-read.table(“~/Desktop/Bodega_2014/Datasets/marsupialmass_continent.txt”, header=T) **# first column contains species names, second the discrete states and third the continuous trait

It is always a good idea to look at the data first and do some basic statistics to check everything is read in correctly and so you know what to expect from the more complex models. So we are going to plot histograms of the body mass data using different colours for the Australian (state ‘1’) and American species (state ‘2’)

**hist(marsupialc[,3][marsupialc[,2]==1], col=”blue”, add=T)** # Australia

**hist(marsupialc[,3][marsupialc[,2]==2], col=”red”, add=T)** # America

We can also estimate the variance

**austvar<-var(marsupialc[,3][marsupialc[,2]==1])**

**amvar<-var(marsupialc[,3][marsupialc[,2]==2])**

The distribution of the American species is much narrower than the Australian species, which would support our hypothesis that the American species have less variable body masses but what do the models suggest? Because our dataset is missing the South American microbiotheres the Australasian and South American species form monophyletic lineages, the clade we are identifying clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”) is the South American group. OUwie will automatically assign internal node labels and update the data matrix according to this clade designation.

Fitting models using clades

The simplest model is one where there is no difference between the continents and body mass just evolves according to a Brownian motion process (model=”BM1″ in OUwie).

**bm1sizecontinent<-OUwie(mtr, marsupialc, model=”BM1″, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))** # for completeness I have added the clade, root state and simmap tree but they will be ignored as this is just fitting a Brownian motion model across the whole tree.

The next most simple model is one where there is no difference between the continents and body mass is evolving to the same optimum across marsupials (model=”OU1″ in OUwie)

**ou1sizecontinent<-OUwie(mtr, marsupialc, model=”OU1″, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”)) **# for completeness I have added the clade and simmap tree but they will be ignored as this is just fitting a Brownian motion model across the whole tree.

What are we doing when we set root.station=TRUE? We are stating whether the starting state, theta_0, should be estimated and the default is TRUE, which means theta_0 is dropped from the model so the starting value is distributed according to the stationary distribution of the OU process. Dropping theta_0 from the model can stabilize estimates of the primary optima, especially if you find the estimates of theta in the full model do not make sense biologically. However, this assumption does not fit a biological scenario involving moving away from an ancestral state. A good idea is to run preliminary analyses using both TRUE and FALSE to see which works for your dataset, for this one TRUE is best.

Now we are going to start fitting models where continent influences body mass evolution. The first allows body size to evolve at different rates on the different continents (model=”BMS” in OUwie)

**bmssizecontinent<-OUwie(mtr, marsupialc, model=”BMS”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))**

The second allows body mass to evolve towards different peaks but the rate and pull towards the peak are the same (model=”OUM” in OUwie)

**oumsizecontinent<-OUwie(mtr, marsupialc, model=”OUM”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))**

The third allows body mass to evolve towards different peaks with different strengths of pull towards the peak but the rate is the same (model=”OUMA” in OUwie)

**oumasizecontinent<-OUwie(mtr, marsupialc, model=”OUMA”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))**

The fourth allows body mass to evolve towards different peaks with rates but the pull towards the peak is the same (model=”OUMV” in OUwie)

**oumvsizecontinent<-OUwie(mtr, marsupialc, model=”OUMV”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))**

The final model is the most complex and allows continent to influence all parameters (model=”OUMVA”)

**oumvasizecontinent<-OUwie(mtr, marsupialc, model=”OUMVA”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”))**

Analyzing results

First thing to do is to look at how OUwie outputs the data. To get a summary of the parameters and model-fit just type the name of the object you stored the results in e.g.

**bm1sizecontinent**

To extract information to calculate the delta AICc etc. we can look at the structure of the object

**str(bm1sizecontinent)**

We can see it is a list of 18 separate things, it includes the tree and the datset as well as the results. If we want to see how well each model fits the data we can calculate the delta AICc value by extracting the AICc. (We could of course use a likelihood ratio test to also test this etc.).

**bestfit<-c(bm1sizecontinent$AICc, ou1sizecontinent$AICc, bmssizecontinent$AICc, oumsizecontinent$AICc, oumasizecontinent$AICc, oumvsizecontinent$AICc, oumvasizecontinent$AICc)**

**names(bestfit)<-c(“bm1”, “ou1”, “bms”, “oum”, “ouma”, “oumv”,”oumva”)**

**bestfit-min(bestfit)**

Now look at the results for the best fitting model – are the rates faster in Australia (1) or America (2)?

**bmssizecontinent**

Checking the model has run properly

These models we are fitting are very complex and sometimes the information contained within the dataset are not sufficient and as a result one or more parameters may be estimated incorrectly. If you include the diagnostics in the OUwie run (diagn=T) it will include the eigendecomposition of the Hessian matrix, which can indicate whether the maximum likelihood estimate has been found. If the eigenvalues of the Hessian are positive then parameter estimates are considered reliable and the ML estimate has been found (i.e. the Hessian is positive definite).

Run the best-fitting bms model but this time include diagnostics and look at the eigenvalues

**bmssizecontinentdiag<-OUwie(mtr, marsupialc, model=”BMS”, simmap.tree=FALSE, root.station=TRUE, clade=c(“Caluromys_derbianus”, “Monodelphis_domestica”), diagn=T)**

**bmssizecontinentdiag$eigval**

The eigenvalues are positive. Now lets look at the standard errors of the rate estimate, if they are large it indicates that if you change the parameter value in any direction it will have little effect on the log-likelihood and the parameter estimate is considered unstable.

**bmssizecontinentdiag$solution.se**

## 2) Test hypothesis that diet influences body size evolution.

**marsupiald<-read.table(“~/Desktop/Bodega_2014/Datasets/marsupialmass_diet.txt”, header=T)**

Again we want to check the data before starting our analyses. First plot the distribution of diet on tips of the tree.

**colr<-rep(“green”, length(marsupiald[,1]))** # set up a vector of “green” for the length of the tip labels this will act as the diet =1

**colr[marsupiald[,2]==2]<-“purple”** # where diet = 2 replace “green” with purple

**colr[marsupiald[,2]==3]<-“blue”** # where diet = 3 replace “green” with blue

**names(colr)<-marsupiald[,1]** # name the colour vector with the name of each species

**colr<-colr[mtr$tip.label]** # sort the colours so they match the same order as the tip labels in the tree

Now plot the tree and replace the species names with coloured dots to represent diet

**plot(mtr, show.tip.label=F)**

**tiplabels(pch=16, col=colr)**

We now need to assign nodes or branches to a dietary category. There are various methods to reconstruct the history of a discrete character which were explained by Rich Glor earlier in the week. We will start with a simple maximum likelihood reconstruction allowing different transition rates between every state. We are going to use the function rerootingMethod in phytools which gives the marginal likelihood of the states at the node. *Remember the warning that ace in ape only gives the conditional rather than the marginal likelihoods for the character states at the node*.

**diet<-marsupiald[,2]**

**names(diet)<-marsupiald[,1]**

**dietrecon<-rerootingMethod(mtr, diet, model=”ARD”)**

The marginal likelihoods of each state are given in $marginal.anc, each row is a node of the tree. We can assign the state with the highest likelihood to each node, to do this we are going to create a new tree with node labels

**dietmtr<-mtr** # going to create a new tree that includes node labels

**dietmtr$node.label<-vector()** # set up an empty vector for the node labels

**for (i in 1:mtr$Nnode) dietmtr$node.label[i]<-names(dietrecon$marginal.anc[i,])[dietrecon$marginal.anc[i,]==max(dietrecon$marginal.anc[i,])]** # this for loop goes through every node (i) and fills the empty vector dietmtr$node.label with the ML state for each node. We are assigning the state with the highest likelihood to each node which is identified by max(dietrecon$lik.anc[i,]) we then find which trait it belongs to by finding the column name which corresponds to the maximum estimate.

Now lets plot the the nodes on the tree to make sure they make sense given the tips

**nodecolr<-rep(“green”, length=mtr$Nnode)**

**nodecolr[dietmtr$node.label==2]<-“purple”**

**nodecolr[dietmtr$node.label==3]<-“blue”**

**plot(mtr, show.tip.label=F)**

**tiplabels(pch=16, col=colr)**

**nodelabels(pch=16, col=nodecolr**

Fitting models using nodes assigned to states

Now run OUwie – we could do what we did last time or save some typing by writing a simple for loop. You will notice this time we are not going to run the OUMVA as this model is generally very difficult to fit to this data and it may not run at all.

**sizedietresults<-list() **# setting up an empty list for the results

**mods<-c(“BM1”, “OU1”, “BMS”, “OUM”, “OUMA”, “OUMV”)** # We set up a vector containing the names of the models we want to run and use this in a simple loop.

**for(i in 1:length(mods)){
sizedietresults[[i]]<-OUwie(dietmtr, marsupiald, model=mods[i], simmap.tree=FALSE, root.station=TRUE)
}**

To look at the AICc, the results are in a list so you need to extract the AICc from all 6 list elements (one for each analysis), to do this you use [[]]

**dietbestfit<-c(sizedietresults[[1]]$AICc, sizedietresults[[2]]$AICc, sizedietresults[[3]]$AICc, sizedietresults[[4]]$AICc, sizedietresults[[5]]$AICc, sizedietresults[[6]]$AICc)**

**names(dietbestfit)<-mods **# results are in the same order in which we ran the models

**dietbestfit-min(dietbestfit)**

All three models that fit separate optima for each diet category fit equally well, lets look at the estimated optima, to see which diet has the largest mass – 1, 2, or 3.

**bestfitthetas<-cbind(sizedietresults[[4]]$theta, sizedietresults[[5]]$theta, sizedietresults[[6]]$theta)**

**colnames(bestfitthetas)<-c(“OUMtheta”, “OUMthetaSE”, “OUMAtheta”, “OUMAthetaSE”, “OUMVtheta”, “OUMVthetaSE”) **# $theta contains both the estimated theta and the Standard Error so we are going to have 6 columns when we extract the result from the 3 multi-optima OU models.

Fitting models using stochastically mapped trees

The previous analyses re dependent on the tree topology, branch lengths and ancestral state reconstruction, which we know are not estimated without error. Instead we could use a Bayesian method such as stochastic character mapping to generate a set of dietary histories on a sample of phylogenies from the posterior distribution generated in BEAST or MrBayes. We can then run the analyses across this set of maps and summarize the results – although this may be a conservative estimate of the variability (Revell 2013).

Read in 11 stochastically mapped trees that I generated in SIMMAP v1.5 (with a uninformative prior on the transition matrix and using a branch length prior). Obviously you would want to use many more trees than this in a real analysis but we don’t have the time! We can tell OUwie we have stochastically mapped trees by using simmap.tree=T

**filenam<-list.files(pattern=”simmapmarsupialtrees”, include.dirs=FALSE)** # I have each tree stored in an individual file labeled as simmapmarsupialtrees and a unique number, this just finds all the files with the given pattern in the name and puts them in a list.

**marsupialtrs<-lapply(filenam, read.simmap, format=”phylip”)** # this takes the list of file names and reads each of them in using the read.simmap function.

Look at some of the trees to make sure everything looks as we would expect

**par(mfrow=c(2,2))**

**plotSimmap(marsupialtrs[[1]], fsize=0)**

**plotSimmap(marsupialtrs[[3]], fsize=0)**

**plotSimmap(marsupialtrs[[6]], fsize=0)**

**plotSimmap(marsupialtrs[[8]], fsize=0)**

We now want to run all six models over the set of trees. We create an output table (resmat) of all the results we are interested in for each tree, run the set of six models and fill out the table, save the table to the results list (marsupialresults) and then move to the next tree. So the outer loop is the one that iterates over the trees and the inner loop goes over the models. Unfortunately, it is not quite as simple as that as we actually have to loop over the Brownian and OU models separately due to difficult thetas.

It is very important to be careful when looping over trees as the structure of the output may change depending on the tree. For example, OUwie and Brownie.lite do not automatically sort their results to give the output for each state 1, 2, 3 instead it is determined by the stochastic character maps, which is variable. This is why we sort the results using the order function.

**mods<-c(“BM1”, “BMS”,”OU1″, “OUM”, “OUMA”, “OUMV”) **

**marsupialresults<-list()**

**for(i in 1:length(marsupialtrs)){**

** print(i) **#helps to know which tree it is working on

** resmat<-matrix(nrow=21, ncol=6) # set up an empty results matrix for all the parameter estimates you are interested in, for each tree a new matrix will be created**

** colnames(resmat)<-c(“BM1”, “BMS”, “OU1”, “OUM”, “OUMA”, “OUMV”)**

** row.names(resmat)<-c(“loglik”, “AICc”, “Herb_alph”, “Herb_alphaSE”,”Herb_sigma”, “Herb_sigmaSE”, “Omn_alph”, “Omn_alphaSE”,”Omn_sigma”, “Omn_sigmaSE”, “Carn_alph”, “Carn_alphaSE”,”Carn_sigma”, “Carn_sigmaSE”, “Herb_theta”, “Herb_thetaSE”, “Omn_theta”, “Omn_thetaSE”, “Carn_theta”, “Carn_thetaSE”, “MLStatus”)**

** for(j in 3:6){ **# running OU models first

** **

** tmpres<-OUwie(marsupialtrs[[i]], marsupiald, model=mods[j], simmap.tree=TRUE, root.station=TRUE, diagn=T**) # we want the diagnostics on, as you want to be able to evaluate the ML estimate. We also want to indicate that this time we are using stochastically mapped trees with simmap.tree=T

**ev<-vector() # **

** for(k in 1:length(tmpres$eigval)) if (tmpres$eigval[k]>0) ev<-c(ev,0) else ev<-c(ev,1)** # this is a looped if statement, it evaluates simple logical statements to see if any of the eigenvalues are negative, if not a 0 is added to the vector ev (if (tmpres$eigval[k]>0) ev<-c(ev,0))but if it is negative a 1 is added (else ev<-c(ev,1)).

** eval<-vector()**

** if(sum(ev)==0) eval<-“All eigenvalues positive” else eval<-“Some eigenvalues negative”** # if there are any eigenvalues that are negative the sum of the ev vector will be greater than 1.

#We need to sort the results as mapped.edge determines the order, this may be problematic when we are summarizing across different maps as the the results are not always given in the order 1,2,3 but the sorting fixes this issue.

** row.names(tmpres$theta)<-colnames(tmpres$solution)**

** tmpres$theta<-tmpres$theta[order(row.names(tmpres$theta)),]**

** tmpres$solution<-tmpres$solution[,order(colnames(tmpres$solution))] **

** tmpres$solution.se<-tmpres$solution.se[,order(colnames(tmpres$solution.se))]**

** **

** resmat[,j]<-c(tmpres$loglik, tmpres$AICc, tmpres$solution[1], tmpres$solution.se[1], tmpres$solution[2], tmpres$solution.se[2], tmpres$solution[3], tmpres$solution.se[3], tmpres$solution[4], tmpres$solution.se[4], tmpres$solution[5], tmpres$solution.se[5], tmpres$solution[6], tmpres$solution.se[6], tmpres$theta[1,], tmpres$theta[2,], tmpres$theta[3,], eval)**

** **

** }**

**for(l in 1:2){** # running BM models second – we don’t want any thetas in our results, OUwie gives output in the theta column which are the phylogenetic means but as there is not pull towards the mean they cannot really be considered ‘optima’

** tmpres<-OUwie(marsupialtrs[[i]], marsupiald, model=mods[l], simmap.tree=TRUE, root.station=TRUE, diagn=T)** # we want the diagnostics on, as you want to be able to evaluate the ML estimate. We also want to indicate that this time we are using stochastically mapped trees with simmap.tree=T

**ev<-vector()**

** for(k in 1:length(tmpres$eigval)) if (tmpres$eigval[k]>0) ev<-c(ev,0) else ev<-c(ev,1)** # this goes through and evaluates whether any of the eigenvalues are negative, if is not a 0 is added to the vector ev but if it is negative a 1 is added

**eval<-vector()**

** if(sum(ev)==0) eval<-“All eigenvalues positive” else eval<-“Some eigenvalues negative”** # if there are any eigenvalues that are negative the sum of the ev vector will be greater than 1.

#We need to sort the results as mapped.edge determines the order, this may be problematic when we are summarizing across different maps as the the results are not always given in the order 1,2,3 but the sorting fixes this issue.

** tmpres$solution<-tmpres$solution[,order(colnames(tmpres$solution))] **

** tmpres$solution.se<-tmpres$solution.se[,order(colnames(tmpres$solution.se))]**

** **

** resmat[,l]<-c(tmpres$loglik, tmpres$AICc, tmpres$solution[1], tmpres$solution.se[1], tmpres$solution[2], tmpres$solution.se[2], tmpres$solution[3], tmpres$solution.se[3], tmpres$solution[4], tmpres$solution.se[4], tmpres$solution[5], tmpres$solution.se[5], tmpres$solution[6], tmpres$solution.se[6],0,0,0,0,0,0, eval)**# 0’s are for the thetas

** **

** }**

** **

**marsupialresults[[i]]<-resmat**

**}**

**save(marsupialresults, file=”marsupialresults.Rdata”)**

Ok that was a pretty ugly set of loops but hopefully fairly intuitive, you could make this code prettier by using the various apply statements available in R.

First make sure all the ML estimates are okay according to the eigenvalues – you want to remove the results of any that say “Some eigenvalues negative” – we shouldn’t have any of these.

**errors<-matrix(nrow=length(marsupialtrs), ncol=ncol(marsupialresults[[1]]))**

**colnames(errors)<-colnames(marsupialresults[[1]])**

**for(i in 1:length(marsupialtrs)) errors[i,]<-marsupialresults[[1]][21,]**

Now calculate the delta AICc for the different models across all trees

**AICctable<-matrix(nrow=length(marsupialtrs), ncol=ncol(marsupialresults[[1]]))**

**colnames(AICctable)<-colnames(marsupialresults[[1]])**

**for(i in 1:length(marsupialtrs)) AICctable[i,]<-as.numeric(marsupialresults[[i]][2,])-as.numeric(min(marsupialresults[[i]][2,]))**

**colMeans(AICctable)** # you can see the best-fitting model is not the one found with the maximum likelihood reconstructions at the node

As the best-fitting model is OUM (which fits different optima to each dietary category but with a single rate and alpha) for simplicity lets look at the theta estimates for the OUM. Although a better option might be to calculate the model-averaged thetas, whereby you sum the thetas across all models weighted by the Akaike Weight for each model.

**herbivoreoptima<-vector()**

**omnivoreoptima<-vector()**

**carnivoreoptima<-vector()**

**for(i in 1:length(marsupialresults)){** # creating vectors that include the estimates for all 11 trees

** herbivoreoptima<-c(herbivoreoptima,as.numeric(marsupialresults[[i]][15,4]))** # Herbivore optima are in row 15 and OUM is column 4 and we also need to tell it is a number (as.numeric)

**omnivoreoptima<-c(omnivoreoptima,as.numeric(marsupialresults[[i]][17,4]))** # Omnivore optima are in row 17 and OUM is column 4 and we also need to tell it is a number (as.numeric)

** carnivoreoptima<-c(carnivoreoptima,as.numeric(marsupialresults[[i]][19,4]))** # Carnivore optima are in row 19 and OUM is column 4 and we also need to tell it is a number (as.numeric)

**}**

Now plot the optima – which one is heaviest? Does it support our hypothesis?

**hist(herbivoreoptima, xlim=c(0,12), ylim=c(0, 8), col=”green”, main=”Optimal Mass from Best-fitting model”, xlab=”Body mass optima (ln mass (g))”)**

**hist(carnivoreoptima, xlim=c(0,12), col=”blue”, add=T)**

**hist(omnivoreoptima, xlim=c(0,12), col=”purple”, add=T, breaks=3)**

## Caveats and things to think about

**1) Do you have the power? **You need to be sure the information within your dataset is sufficient to distinguish between the models and estimate parameters. Do you have enough species in your tree and within each state? We have seen how we can use the diagnostics to do some of this but it is not perfect. So you should consider using the estimated parameters to run simulations to test the power – can you estimate the correct parameter estimates, can you distinguish it from a more simple or complex model? (forward simulation/parametricboostrapping see Boettiger et al. 2012 or posterior prediction see Jeremy Brown’s presentation earlier this week). OUwie has an inbuilt simulation function that allows you do this relatively simply.

**2) High performance computing.** Note that is analysis may take a while if you have a 1000 trees with lots of taxa so you may want to consider running it on a machine that is not your laptop. If you have access to multiple CPU’s you could consider splitting the analysis into several portions to run on different CPU’s (batch processing) and then concatenating the output. To convert this script to submit to a cluster/server you need to make sure all the important data is written to a file and that you plot to a pdf or other file type (don’t forget to close it too – the default is a pdf called Rplots.pdf which is what you will generate if you run this script as is – you will notice you would probably want to make the size of the plotting area larger etc.). You also should ensure that you have the the right path to your files. Finally, you need to add quit() at the end of your script. To run R non-interactively (on a cluster) you need to use the command R CMD BATCH input-filename (i.e this script) output-filename(optional).

Try running this script, available to download at the top of the page as Morph_rateinR2012.R non-interactively. On a mac you should be able to open the Terminal window and navigate to the folder where you have saved the script (using cd) and then type:

**R CMD BATCH OUwietutorial.R**

You should generate a default output file name OUwietutorial.Rout this will show you exactly the commands that have been executed and can be used to figure out where in the analysis it currently is.

## Identifying shifts in rate or peaks without *a priori* hypotheses

Over the few years several different methods have been published that allow the identification of shifts in the rate of phenotypic evolution without *a priori* hypotheses: Revell et al., 2012 (phytools package in R); Eastman et al. 2011 (geiger package in R), Venditti et al. 2011 and Thomas & Freckleton 2011 (MotMot package in R) or peaks: Ingram & Mahler 2013 (surface package in R).

## References

Bininida-Emonds, O.R.P., Cardillo, M., Jones, K.E., MacPhee, R.D.E., Beck, R.M.D., Grenyer, R., Price, S.A., Vos, R., Gittleman, J.L., Purvis, A. The delayed rise of present-day mammals. *Nature* 446, 507-512.

Beaulieu, J.M., Jhwueng, D-C, Boettiger, C. & O’Meara, B.C. (2012) Modelling stabilizing selection: expanding the Ornstein-Uhlenbeck model of adaptive evolution. Evolution 66(8), 2369-83.

Collar, D. C., Schulte, J. A. II & Losos, J. B. (2011) Evolution of extreme body size disparity in monitor lizards (Varanus). *Evolution* 65(9), 2664-2680.

de Vos, J., Wuest, R. O & Conti, E.(2014) Small and ugly? Phylogenetic analyses of the “Selfing syndrome” reveal complex evolutionary fates of monomorphic primrose flowers. *Evolution* online early.

Fritz, Susanne A, Bininda-Emonds, Olaf R P & Purvis, Andy (2009) Geographical variation in predictors of mammalian extinction risk: big is bad, but only in the tropics. *Ecology Letters*, 12, 538-549.

Ingram, T. & Mahler, D. L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise Akaike Information Criterion. *Methods in Ecology and Evolution *4(5), 416-425.

Jones, K.E., Bielby, J., Cardillo, M., Fritz, S.A., O’Dell, J., Orme, C.D.L., Safi, K., Sechrest, W., Boakes, E.H., Carbone, C., Conolly, C., Cutts, M.J., Foster, J.K., Grenyer, R., Habib, M., Plaster, C.A., Price, S. A., Rigby, E.A., Rist, J., Teacher, A., Bininda-Emonds, O.R.P., Gittleman, J.L., Mace, G.M. & Purvis, A. (2009) “PanTHERIA: A species-level database of life-history, ecology and geography of extant and recently extinct mammals”. *Ecology* 90(9), 2648.

Lapiedra, O., Sol, D., Carranza, S. & Beaulieu, J. M. (2013) Behavioural changes and the adaptive diversification of pigeons and doves. *Proceedings of the Royal Society of London B* *280*(1755), 20122893..

Price, S. A., Hopkins, S. C., Smith, K. K & Roth, V. L. (2012) Tempo of trophic evolution and its impact on mammalian diversification. *Proceedings of the National Academy of Sciences 109(18), 7008-7012*

Revell, L. J., Mahler, D. L., Peres-Neto,P. R. & Redelings, B. D. (2012) A new phylogenetic method for identifying exceptional phenotypic diversification. *Evolution* 66 (1), 135-146.

Revell. L. J. (2013) . A Comment on the Use of Stochastic Character Maps to Estimate Evolutionary Rate Variation in a Continuously Valued Trait. *Systematic Biology*, 62(2), 339-345.

Thomas, G. & Freckleton, R. P. (2012) MOTMOT: models of trait macroevolution on trees. *Methods in Ecology and Evolution* 3(1), 145-151

Venditti,C., Meade, A. & Pagel, M. (2011) Multiple routes to mammalian diversity. *Nature* 479 (393),