Evolution of Discrete Binary Traits in BayesTraits


Adoxa moschetaliana (image from http://www.minnesotawildflowers.info/flower/moschatel)

Image from http://countrystoreplants.com/

Vibernum sargentii (image from http://countrystoreplants.com/)

The goal of this tutorial is to learn how to use Pagel’s BayesTraits program to conduct basic analyses of character evolution. This tutorial requires three files, which contain the trees we’re going to analyze (Adox.trees (remove .txt extension after downloading), the trait data for taxa on this tree (Adox.trait1), and a set of commands that will facilitate analyses in BayesTraits (Adox_trait1_commands). For this tutorial, I use the Mac OSX version of Bayes Traits, but the process is very similar on a PC. In OSX, BayesTraits is run through the Terminal application, which is installed along with the operating system.

1. Downloading and installing BayesTraits: Download BayesTraits and uncompress the resulting file. You should now have a folder called BayesTraits on your computer. As with any Terminal-based application, it is critical to know where the BayesTraits folder and associated application are located on our computer. To keep things as simple as possible, we’re going to move the BayesTraits folder from wherever it was initially downloaded to our operating system’s Applications directory. Because your mac may have multiple Applications directories, you should locate the one found in your root directory (the simplest way to get to this directory is to click the Applications link in the Favorites menu that appears in the Finder window (see Fig. 1)). Once you have identified the correct Applications directory, move the BayesTraits folder to this directory.


Fig. 1: Locating the Applications directory in OSX.

After ensuring that BayesTraits is readily accesible from our Applications directory, we’re going to take one more simple step that will greatly simplify our subsequent work with this program. By putting the BayesTraits folder in our operating system’s path, we will tell our computer where to look for BayesTraits when we attempt to run this program from the Terminal window. To put BayesTraits in your path, open the Terminal application and type vi .profile. This opens a hidden file called “.profile” in the vi editor. Once .profile is open, type i to begin editing (you should see “—Insert—” at the bottom of the Terminal window (see Fig. 2). Now simply type export PATH="$PATH:/Applications/BayesTraits/" at the top of this file to add BayesTraits to your path. When finished, press esc to stop editing and then save and close the .profile file by typing :wq.


Fig. 2: Adding BayesTraits to our path.

2. Obtain tree and character data for analyses in BayesTraits: With BayesTraits now included in our operating system’s path, we’re almost ready to run some analyses. Before doing so, however, we get to get some data. BayesTraits requires rooted trees with branch length data. We will therefore obtain ultrametric trees for Adoxaceae from the Bayesian relaxed clock algorithms implemented in BEAST. The file Adox.trees contains hundreds of trees sampled from the posterior distribution of this BEAST analyses. By conducting our analyses in BayesTraits across this set of trees, our analyses will incorporate topological and branch length uncertainty. In addition to trees, we also need  some character data. BayesTraits accepts simple comma delimited text files generated with a spreadsheet editing program like Microsoft Excel or a simple text editor. Our data on growth form in the Adoxaceae are stored in a comma delimited text file called Adox_trait1.txt, where woody and herbaceous growth forms are coded with 0s and 1s, respectively. Create a new folder on your Desktop called Adox_Character_Evolution and place the trees and character data for the Adoxaceae in this folder.

3. Upload Data and Begin BayesTraits Analyses: Open the Terminal Window and navigate to the Adox_Character_Evolution folder you just created on your desktop. To do this, first type cd at the Terminal prompt. Now, using your mouse, click on the BayesTraits folder in a Finder window and hold down the left mouse button as you drag this folder next to the cd you just typed into the terminal before releasing the mouse button; this should automatically fill in the path to the folder containing your data, resulting in a line that reads cd Desktop/Adox_Character_Evolution/. Hit return to move to this folder. Because we previously put BayesTraits in our path, we now just need to call the program from the prompt in the Terminal with a simple one word command and tell it which tree and trait files we want it to analyze, in that order:

BayesTraits Adox.trees Adox.trait1.txt

If your data have loaded successfully you should see the following text:

Rand Seed    1299555524 
Please Select the model of evolution to use.
1)      MultiState.
2)      Discrete: Independent
3)      Discrete: Depend
4)      Continuous: Random Walk (Model A)
5)      Continuous: Directional (Model B)

This is BayesTraits‘ way of prompting us to select the type of analysis we’d like to do. Although we’re dealing with a binary trait, we’re going to select the Multistate option because the Discrete option in BayesTraits is designed to investigate character correlations and requires two characters with binary coding (we have only one character with binary coding in our dataset). Type 1 and hit return to choose the Multistate option.

We’re now prompted to choose whether we want to do Maximum Likelihood or Bayesian MCMC:

Please Select the analsis method to use. 
1)      Maximum Likelihood.
2)      MCMC

We’re going to start with ML analyses, even though our ultimate goal is to use Bayesian inference. We’re doing this because ML can provide us with a reasonable first approximation of the rate coefficient(s) that might be useful for specifying (hyer)priors in the subsequent Bayesian analysis. Type 1 and hit enter to initiate your Maximum Likelihood analyses.

4. Investigate preliminary output from BayesTraits: BayesTraits should now provide us with some basic information about our dataset and model settings:

Model:                           Multistates
Tree File Name:                  Adox.trees
Data File Name:                  Adox_trait1.txt
Log File Name:                   Adox_trait1.txt.log.txt
Summary:                         False
Analsis Type:                    Maximum Likelihood
ML attempt per tree:             10
No of Rates:                     2
Base frequency (PI's)            None
Character Symbols                0,1
Using a covarion model:          False
    q01                          None
    q10                          None
Tree Information
     Trees:                      501
     Taxa:                       63
     Sites:                      1
     States:                     2

Most of the information we’re seeing here has a pretty intuitive interpretation. Take a moment to read and understand this output. The first few lines tell us the names of the files we’ve input, as well as the name of the log file that’s going to store our results. This output also tells us the type of analyses we’re using, the number of times that analyses will be attempted on each tree (10), the number of rates being investigated (2), the symbols used for alternative character states (0,1), any rate restrictions we’ve imposed, and some basic information about our trees and data. Notice that we have 501 trees with 63 taxa in each tree and 1 character (or site) with two states (i.e., a binary character). Because we have a two state character, we’re going to begin our investigation by considering a two rate model that includes each of the two possible types of transitions that might occur between our two character states; one for transitions from woody to herbaceous (q01) and the other for transitions from herbaceous to woody (q10). We’re going to do a basic ML search under this two-rate model using the default settings for other parameters. When doing Maximum Likelihood analyses on a set of more than 1 tree, BayesTraits is going to provide results calculated independently for each of the 501 trees in our dataset.

To run your analysis, simply type run.

Output for the first 10 trees in our sample should look something like this:

Tree No      Lh      q01     q10     Root P(0)       Root P(1) 
1       -9.727674       0.000000        0.017002        0.000000        1.000000
2       -10.138425      0.000000        0.018218        0.000000        1.000000
3       -10.311666      0.000000        0.017754        0.000000        1.000000
4       -10.437176      0.000000        0.020893        0.000000        1.000000
5       -9.824290       0.000000        0.017578        0.000000        1.000000
6       -9.924709       0.000000        0.019461        0.000000        1.000000
7       -10.249299      0.000000        0.020936        0.000000        1.000000
8       -10.254006      0.000000        0.022155        0.000000        1.000000
9       -10.257945      0.000000        0.020684        0.000000        1.000000
10      -10.494423      0.000000        0.020988        0.000000        1.000000

Let’s focus on several aspects of these results. First, we should see that all of our trees produce similar log likelihoods. This suggests that variation among the trees in our posterior distribution is not having a strong impact on the inference of evolution in this trait. Second, the rate for transitions from the woody to herbaceous (q01 = 0.000) is much lower than that of herbaceous to woody (q10 = 0.018-0.022). Finally, the probability of an herbaceous state at the root is substantially higher than the probability of a woody state at the root (0.000 v. 1.000).

5. Test Alternative Models of Character Evolution: Before we leave likelihood land, we could consider whether the difference in the two estimated rate parameters has implications for model selection. To do this we’re going to re-run our analysis with the two parameters constrained to be equal to one another. To do this, we’re going to use the command restrict q01 q10 (just before entering run), remember you have to specify the matrix, data and parameters for each run. We can confirm that this operation was successful by typing info. We should now see that q01 is restricted to be the same as q10 while q10 has no restrictions. Let’s now re-run the analysis by typing run. The first ML scores for the first 10 trees look something like this:

Tree No      Lh      q01     q10     Root P(0)       Root P(1) 
1       -13.946363      0.005398        0.005398        0.454504        0.545496
2       -14.400731      0.005738        0.005738        0.457616        0.542384
3       -14.588476      0.005608        0.005608        0.459535        0.540465
4       -14.873274      0.006408        0.006408        0.445273        0.554727
5       -14.165872      0.005521        0.005521        0.439266        0.560734
6       -14.396200      0.005913        0.005913        0.438098        0.561902
7       -14.479001      0.006861        0.006861        0.439984        0.560016
8       -14.641807      0.006948        0.006948        0.443282        0.556718
9       -14.595005      0.006512        0.006512        0.445936        0.554064
10      -14.833338      0.006706        0.006706        0.427272        0.572728

The fact that the resulting substantial decrease in likelihoods suggests that the two-rate parameter model is superior to the single-rate parameter model. We could formally test this hypothesis using AIC or the likelihood ratio test (with one degree of freedom).

6. Conduct Preliminary Bayesian Analyses: Now that we have a rough idea of what our data look like, let’s move to MCMC world. Let’s get started with this analysis./BayesTraits Adox.trees Adox_trait1.txt. As before, we’re going to select Multistate. However, when we get the next prompt, we’re going to choose option 2 (MCMC) rather than option 1 (Maximum Likelihood). When you do this, you get different information on your starting parameters that you do with Maximum Likelihood:

Model:                           Multistates
Tree File Name:                  Adox.trees
Data File Name:                  Adox_trait1.txt
Log File Name:                   Adox_trait1.txt.log.txt
Summary:                         False
Analysis Type:                   MCMC
Sample Period:                   100
Iterations:                      5050000
Burn in:                         50000
Rate Dev:                        2.000000
No of Rates:                     2
Base frequency (PI's)            None
Character Symbols                0,1
Using a covarion model:          False
    q01                          None
    q10                          None
Prior Information:
    Prior Categories:            100
    q01                          uniform 0.00 100.00
    q10                          uniform 0.00 100.00
Tree Information
     Trees:                      501
     Taxa:                       63
     Sites:                      1
     States:                     2

7. Modify Basic Parameters for Bayesian Analyses: We’re also going to reduce the number of iterations so that things run a bit faster than they would otherwise. Let’s drop down to 100,000 iterations by typing Iterations 100000. We’re going to begin by leaving the prior parameters at their default settings for now and simply run the analysis with uniform rate priors by typing run. The first few lines of the resulting output should look something like this:

Iteration    Lh      Harmonic Mean   Tree No q01     q10     Root P(0)       Root P(1)       Acceptance 
50000   -21.583288      -21.583288      95      10.550017       62.602856       0.500000        0.500000        0.940000
50100   -21.368018      -21.516551      103     5.856247        55.458663       0.500000        0.500000        0.960000
50200   -21.340524      -21.454796      313     8.247111        57.741515       0.500000        0.500000        0.950000
50300   -21.803621      -21.498049      22      4.482239        55.257891       0.500000        0.500000        1.000000
50400   -21.660432      -21.555659      176     10.657151       61.080610       0.500000        0.500000        0.910000
50500   -21.453740      -21.556891      390     10.324308       65.802350       0.500000        0.500000        0.980000
50600   -21.591820      -21.552075      383     11.692942       69.103778       0.500000        0.500000        0.920000
50700   -21.558930      -21.555231      260     12.985093       77.978155       0.500000        0.500000        0.990000
50800   -21.781473      -21.570270      455     5.967007        72.842356       0.500000        0.500000        0.890000
50900   -21.499887      -21.579031      106     6.850427        72.032596       0.500000        0.500000        0.920000
51000   -22.844072      -21.690110      1       18.280377       76.235973       0.500000        0.500000        0.910000
51100   -23.511776      -21.967507      232     19.162965       71.179811       0.500000        0.500000        0.870000

Note that this output includes the iteration in the left column as well as the Harmonic Mean estimate of the marginal log likelihood that we’ll need later to calculate Bayes Factor scores, and indicates which tree is being sampled at each iteration. Note that the likelihood scores are considerably worse than they were previously, that’s not surprising because we are no longer maximizing the likelihood. The most disturbing aspect of the results is that the transition rates are crazy high. This is not too surprising, given that the prior (uniform [0,100]) places the vast majority of the prior mass is on very high rates. Accordingly, we’ll modify the prior to reflect more reasonable; specifically, we’ll change the prior to a uniform [0,1]. This prior is more reasonable in light of the range of MLE values for these parameters (this approach is an example of an ’empirical prior’—or use of a prior that is informed by a ML ‘peek’ at the data). Run the analysis under this new prior and check out your results.

8. Optimize Mixing of Bayesian MCMC: Ideally, the acceptance rates to for proposed changes to the rate parameters should fall in the range between ~20-40%. We can control the acceptance rates (and, thus, the mixing of the MCMC) by changing the magnitude of proposed changes to the transition rates (i.e., RateDev parameter). All else being equal, we expect higher Rate Dev values to result in lower acceptance rates because they will propose larger changes to the rate parameters. Accordingly, if acceptance rates are too high, specify a larger value for the RateDev parameter (and vice versa). Experiment with different values until you get the chain to mix well.

9. Test Alternative Models of Trait Evolution Using Bayesian Inference: Now we’re going to repeat the above Bayesian analyses under a one-rate model. To do this, we’re going to use the commandrestrict q01 q10 (just before entering run). We can confirm that this operation was successful by typing info. We should now see that q01 is restricted to be the same as q10 while q10 has no restrictions. Let’s now re-run the analysis by typing run. As before, iteratively adjust the value of the RateDev parameter until the acceptance rates are in the target range (2-–40%).

We have estimates of the marginal likelihoods (based on the harmonic mean estimator) for each of the preceding Bayesian analyses (under the two-rate and one-rate models). We can therefore assess the support in the data for the two models (by taking the difference in the estimated marginal log likelihoods). Which model is preferred?

10. Test Alternative Models of Trait Evolution Using Reversible Jump MCMC: The two previous Bayesian analyses conditioned on a model—the first on a two-rate model, the second on a one-rate model. If we want to get really Bayesian, we can treat the model itself as a random variable. That is, we can use a special type of MCMC, reversible-jump MCMC to average over the set of possible trait models (for binary traits, there are a total of 5 possible models). This procedure accommodates uncertainty in model choice so the the inference of ancestral states and transition rates does not assume any one model of trait evolution. Let’ specify a model-averaged inference of trait evolution by specifying the prior and hyperpriorm, rjhp exp 0 30, for the rjMCMC—this specifies an exponential prior for the transition rates, and the rate parameter for the exponential is itself sampled from a uniform hyperprior on the interval [0,30] . After running this analysis, we should see some lines that look like this:

Iteration    Lh      Harmonic Mean   Tree No No Off Paramiters       Model string    q01     q10     Root P(0)       Root P(1)       RJ Prior Mean   Acceptance 
50000   -10.345161      -10.345161      207     1       'Z0     0.000000        0.010113        0.000000        1.000000        3.301091        0.090000
50100   -9.419777       -10.120508      260     1       'Z0     0.000000        0.025599        0.000000        1.000000        0.703027        0.120000
50200   -9.617302       -9.922255       455     1       'Z0     0.000000        0.021933        0.000000        1.000000        0.907404        0.080000
50300   -10.040932      -9.902513       124     1       'Z0     0.000000        0.021933        0.000000        1.000000        0.998135        0.080000
50400   -10.507461      -10.005833      199     1       'Z0     0.000000        0.031622        0.000000        1.000000        0.210905        0.160000
50500   -10.099262      -10.071735      116     1       'Z0     0.000000        0.012638        0.000000        1.000000        0.303869        0.090000
50600   -9.741163       -10.052036      117     1       'Z0     0.000000        0.014276        0.000000        1.000000        0.261923        0.090000
50700   -10.413453      -10.063185      306     1       'Z0     0.000000        0.032846        0.000000        1.000000        0.287645        0.120000
50800   -11.500539      -10.256669      349     1       'Z0     0.000000        0.039525        0.000000        1.000000        2.166462        0.100000
50900   -10.339216      -10.382841      94      1       'Z0     0.000000        0.014670        0.000000        1.000000        3.414595        0.120000
51000   -10.450154      -10.384123      87      1       'Z0     0.000000        0.008764        0.000000        1.000000        4.106023        0.090000
51100   -11.107678      -10.432085      249     1       'Z0     0.000000        0.040083        0.000000        1.000000        9.399662        0.150000

Again, lets iteratively tweak the RateDev parameter to ensure that the rjMCMC is mixing well—the current acceptance rates are a bit low (<20%). Let’s try to improve mixing by reducing the RateDev parameter, setting it to 0.1 RateDev 0.1. If we do this, we find that we have good mixing by the end of the analysis, but that we have yet to reach stationarity by the previously set burn-in. In this case, therefore, we’re going to need to increase the burn-in point to ensure adequate mixing.

Iteration    Lh      Harmonic Mean   Tree No No Off Paramiters       Model string    q01     q10     Root P(0)       Root P(1)       RJ Prior Mean   Acceptance 
50000   -21.746725      -21.746725      377     2       '01     0.865084        4.784077        0.500000        0.500000        3.008918        0.870000
50100   -21.549190      -21.685116      33      2       '01     0.778098        4.692790        0.500000        0.500000        3.066121        0.820000
50200   -21.350093      -21.599172      336     2       '01     0.639613        4.431222        0.500000        0.500000        4.419139        0.820000
50300   -21.303282      -21.528656      203     2       '01     0.599025        4.426538        0.500000        0.500000        1.618403        0.830000
50400   -21.286093      -21.481195      20      2       '01     0.594383        4.614545        0.500000        0.500000        2.442129        0.860000
50500   -21.487544      -21.465539      38      2       '01     0.746834        4.661760        0.500000        0.500000        2.790109        0.810000
99400   -9.979530       -23.099905      303     1       'Z0     0.000000        0.013126        0.000000        1.000000        0.116057        0.480000
99500   -11.423252      -23.097885      334     1       'Z0     0.000000        0.053344        0.000000        1.000000        1.404885        0.340000
99600   -11.333112      -23.095868      247     1       'Z0     0.000000        0.052508        0.000000        1.000000        0.057220        0.390000
99700   -10.792404      -23.093856      449     1       'Z0     0.000000        0.044798        0.000000        1.000000        0.243009        0.360000
99800   -10.967647      -23.091848      372     1       'Z0     0.000000        0.035442        0.000000        1.000000        2.761527        0.360000
99900   -10.462960      -23.089844      312     1       'Z0     0.000000        0.010897        0.000000        1.000000        2.425204        0.530000
100000  -12.391982      -23.087844      158     1       'Z0     0.000000        0.003631        0.000000        1.000000        0.085852        0.360000

We can increase the burnin of subsequent runs using the burnin command. Try increasing the burnin so that it is beyond the point of burnin in the analysis you just ran.

 The rjMCMC simulation provides an estimate of the marginal probabilities of the various trait models. What are the marginal probabilities for the one- and two rate models? How does this result jive with the results of the Bayes factor test that you performed previously?

13. Assess Impact of Root State On Character Evolution: Suppose we are particularly interested in knowing what the growth form was at the root of our tree. As we’ve already seen, our analyses strongly favor a herbaceous growth form at the root. We can assess how well supported this outcome is by using the fossil command. By fixing the state at each possible alternative and running the analysis, we can compare the resulting Bayes Factor scores to generate an estimate for the relative support of each alternative hypothesis. Let’s fix the root to be a woody form Fossil root 0 1 12, ie set the root to state 0 (woody) between taxa number 1 and 12. The syntax here requires that fossil [node name] [state to fix] [taxon numbers]. Once this operation is complete, we can re-run a second time after fixing a root state of 1. Compare the resulting Harmonic mean likelihoods to see how well supported each alternative state is.

Leave a Reply