# ===================How to simulate Brownian motion=================== # What is Brownian motion? At each instant in time the state of a character can increase or decrease and the direction and magnitude of these changes are independent of current or past states (no trends etc.). The total displacement of a character after time t is drawn from a normal distribtion with a mean of zero (i.e. net change in trait is 0) and a variance proportional to t. # First generate the displacements of the trait using the rnorm function to generate a normal distribution with a mean of zero for 100 units of time. displace<-rnorm(100) plot(displace, type="l") ## The sum of these displacements represents the trajectory of the trait through time x<-cumsum(displace) plot(x, type="l", xlab="Time", ylab="Trait Value") ## Now create a plot with 5 separate runs of Brownian motion. You can imagine these as 5 independent species (sharing no common ancestry). plot(cumsum(rnorm(100)), type="l", ylim=c(-30,30)) lines(cumsum(rnorm(100)), col="red") lines(cumsum(rnorm(100)), col="blue") lines(cumsum(rnorm(100)), col="green") lines(cumsum(rnorm(100)), col="purple") # Here we can see variance in the trait is proportional to time # The standard deviation of the normal distribution is related to the Brownian motion rate parameter. Thus, if your rate is faster you generate more variance aka disparity. Examine this by running a simulation that first runs with sd=1 (the default) for 500 generations and then sd=5 for 100 generations and look at the variance plot(cumsum(rnorm(100)), type="l", ylim=c(-80,80), ylab="Trait") lines(cumsum(rnorm(100))) lines(cumsum(rnorm(100))) lines(cumsum(rnorm(100))) lines(cumsum(rnorm(100))) lines(cumsum(rnorm(100, sd=5)), col="green") lines(cumsum(rnorm(100, sd=5)), col="green") lines(cumsum(rnorm(100, sd=5)), col="green") lines(cumsum(rnorm(100, sd=5)), col="green") lines(cumsum(rnorm(100, sd=5)), col="green") # Obviously the above simulations were assuming the species were independent but we know that they are not, they share evolutionary history and this affects the evolution of their traits. Imagine two species which shared a common ancestor 1 million years ago and the common ancestor with their next closest relative was 5 million years ago. The 2 species share 4 million years of common evolutionary history following exactly the same BM trajectory but 1 million years ago they split and followed independent BM trajectories (with the same mean and rate), this means they have only had 1 million years to accumulate any trait differences. Compare this scenario with two species that last shared a common ancestor 20 million years ago, they have 20 million years following independent BM trajectories, so the trait of interest is likely to take very different values within these two species as more time has elapsed allowing for more changes to accrue. The variance is equal to sigma^2*dt which means the SD of the normal distribution is the square-root of the rate*time (time is given by the branch length). Fortunately, you don't need to write this simulation yourself as there are many existing functions in R e.g. fastBM in phytools # ===================How to simulate Ornstein-Uhlenbeck=================== # For an OU model a change in trait X over an increment in time t (dXi(t))= alpha[theta-Xi(t)]dt + sigma*dBi(t). So to simulate this we need 4 parameters, theta (the optimal trait value), alpha (the pull towards the optima), sigma (sigma^2 the stochastic (Brownian) motion parameter) and the starting value for the trait(x0). To keep things simple we are assuming that the species share no common ancestry so dt is ignored (it would also appear in the BM portion where variance of the normal distribution is given as sigma^2*dt). # So if we start the trait at the optima which is 1 alpha<-0.4 theta<-1 sigma<-0.05 x0<-1 OUalpha0.1_sample1<- x0 + alpha*(theta-x0)+sigma*(rnorm(1,mean=0)) # the next sample would be OUalpha0.1_sample2<- OU_sample1 + alpha*(theta-OU_sample1)+sigma*(rnorm(1,mean=0)) OUalpha0.1_sample3<- OU_sample2 + alpha*(theta-OU_sample2)+sigma*(rnorm(1,mean=0)) plot(x=0:3, y=c(1,OUalpha0.1_sample1, OUalpha0.1_sample2, OUalpha0.1_sample3), type="l", main="Alpha=0.4, Theta=1, Sigma=0.05, x0=1") # It will take a while to keep writing these out so we can see the actual pattern, so lets create a simple function to do it for us, it will run the simulation for n time units ornstein_uhlenbecksim <- function(n,theta,alpha,sigma,x0){# n is the number of time units to simulate for, theta is the optima, alpha is the pull towards the optima, sigma is the rate and X0 is the starting trait value dw <- rnorm(n, 0) x <- c(x0) for (i in 2:(n+1)) { x[i] <- x[i-1] + alpha*(theta-x[i-1]) + sigma*dw[i-1] } return(x); } # we can now use this function to explore the effect that changing various parameters will have on the distribution of the trait value alpha1sigma0.01theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 1, 0.01, 0)) # replicate just runs it 10 times for us to illustrate the variance between runs colr<-topo.colors(9) plot(alpha1sigma0.01theta1[,1], type="l", main="alpha1 sigma=0.01", ylim=c(-2, 2)) for(i in 2:ncol(alpha1sigma0.01theta1)) lines(alpha1sigma0.01theta1[,i], type="l",col=colr[i-1] ) # play with different parameters. alpha0.5sigma0.01theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0.5, 0.01, 0)) alpha0.1sigma0.01theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0.1, 0.01, 0)) alpha0sigma0.01theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0, 0.01, 0)) alpha1sigma0.1theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 1, 0.1, 0)) alpha0.5sigma0.1theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0.5, 0.1, 0)) alpha0.1sigma0.1theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0.1, 0.1, 0)) alpha0sigma0.1theta1<-replicate(10, ornstein_uhlenbecksim(100, 1, 0, 0.1, 0)) par(mfrow=c(4,2)) plot(alpha1sigma0.01theta1[,1], type="l", main="alpha1 sigma=0.01", ylim=c(-2, 2)) for(i in 2:ncol(alpha1sigma0.01theta1)) lines(alpha1sigma0.01theta1[,i], type="l",col=colr[i-1] ) plot(alpha1sigma0.1theta1[,1], type="l", main="alpha1 sigma=0.1", ylim=c(-2, 2)) for(i in 2:ncol(alpha1sigma0.1theta1)) lines(alpha1sigma0.1theta1[,i], type="l",col=colr[i-1] ) plot(alpha0.5sigma0.01theta1[,1], type="l", main="alpha0.5 sigma=0.01", ylim=c(-2, 2)) for(i in 2:ncol(alpha0.5sigma0.01theta1)) lines(alpha0.5sigma0.01theta1[,i], type="l",col=colr[i-1] ) plot(alpha0.5sigma0.1theta1[,1], type="l", main="alpha0.5 sigma=0.1", ylim=c(-2, 2)) for(i in 2:ncol(alpha0.5sigma0.1theta1)) lines(alpha0.5sigma0.1theta1[,i], type="l", col=colr[i-1] ) plot(alpha0.1sigma0.01theta1[,1], type="l", main="alpha0.1 sigma=0.01", ylim=c(-2, 2)) for(i in 2:ncol(alpha0.1sigma0.01theta1)) lines(alpha0.1sigma0.01theta1[,i], type="l", col=colr[i-1] ) plot(alpha0.1sigma0.1theta1[,1], type="l", main="alpha0.1 sigma=0.1", ylim=c(-2, 2)) for(i in 2:ncol(alpha0.1sigma0.1theta1)) lines(alpha0.1sigma0.1theta1[,i], type="l", col=colr[i-1] ) plot(alpha0sigma0.01theta1[,1], type="l", main="alpha0 sigma=0.01", ylim=c(-2, 2)) for(i in 2:ncol(alpha0sigma0.01theta1)) lines(alpha0sigma0.01theta1[,i], type="l", col=colr[i-1] ) plot(alpha0sigma0.1theta1[,1], type="l", main="alpha0 sigma=0.1", ylim=c(-2, 2)) for(i in 2:ncol(alpha0sigma0.1theta1)) lines(alpha0sigma0.1theta1[,i], type="l", col=colr[i-1] ) # Once again the above simulations were assuming the species were independent but we know that they are not, they share evolutionary history and this affects the evolution of their traits. So when simulating along a phylogeny time is also an important component, to be able to account for this you need to include time and dt back into the equations. Fortunately, you don't need to be able to do this yourself as the R package OUwie has a simulation function OUwie.sim.