Discussion About Object-Oriented Design At Carl Boettiger’s Blog, Plus Notes On “Broken” Tutorials

A blessing and a curse of the R programming interface is that methods development occurs at a community scale. Progress tends to occur in a piecemeal fashion as developers make small advances within a massive, highly integrated, but often chaotically organized community. The great thing about R is that so many people are doing so many things in a modular framework that nearly any sort of analysis is possible. The drawback is that anytime there’s a flaw or a fundamental change in a given module (e.g., an R package, or a particular function), it can have widespread and unanticipated effects on any package that depends on it.

Over at his website/blog, Carl Boettiger points out that this problem is exacerbated by the fact that object-oriented design is rare in R, and as a result, changes in the underlying mechanics of functions tend to result in critical changes in the nature of their output, which spells trouble for dependencies. Carl does a nice job explaining how the use of object-oriented design can prevent such problems.
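To make the idea concrete, here is a minimal sketch of how accessor methods can shield downstream code from changes in a function's output. It is written in Python rather than R for brevity, and every name in it (FitResultV1, FitResultV2, log_likelihood, n_params, aic) is hypothetical. None of it reflects how “geiger” or “pmc” are actually organized.

```python
class FitResultV1:
    """Hypothetical 'old' model-fit object: values live in one flat dict."""

    def __init__(self, lnl, n_params):
        self._data = {"lnl": lnl, "k": n_params}

    def log_likelihood(self):
        return self._data["lnl"]

    def n_params(self):
        return self._data["k"]


class FitResultV2:
    """Hypothetical 'new' model-fit object: internals reorganized, same methods."""

    def __init__(self, lnl, n_params):
        self._opt = {"logLik": lnl}
        self._model = {"n_par": n_params}

    def log_likelihood(self):
        return self._opt["logLik"]

    def n_params(self):
        return self._model["n_par"]


def aic(fit):
    """Downstream code: relies only on the methods, never on the internal layout."""
    return 2 * fit.n_params() - 2 * fit.log_likelihood()


old_aic = aic(FitResultV1(-10.0, 2))
new_aic = aic(FitResultV2(-10.0, 2))
print(old_aic, new_aic)
```

Had aic() reached directly into the stored dictionary (say, fit._data["lnl"]), the reorganization in the second class would have broken it. That is roughly the failure mode described above.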

The context of Carl’s post is that the authors of the popular “geiger” R package have just released a major update that changes many functions, function names, and outputs. According to Carl, most of these changes are great, but the rub is that several cause failures in packages, scripts, or tutorials that depend on “geiger”. These include Carl’s excellent “pmc” package, Travis Ingram’s “surface” package, and perhaps most relevant to the Bodega crowd… several of our R phylogenetics tutorials! I’ve gotten several emails over the past week or so mentioning that these tutorials no longer work due to the “geiger” change.

If you’re interested in this sort of thing, check out Carl’s post, as well as the lengthy discussion it’s generated in his comments section. As for our tutorials, I’ll try to post new versions of the ones I wrote for continuous traits shortly. Most of the updates will be straightforward (thanks to Jeremy Brown for highlighting the necessary changes when he first wrote me about it), but those involving “pmc” will likely have to wait until Carl releases an update that’s compatible with the new “geiger”. Of course in the meantime, it should be possible to run the tutorials using older versions of “geiger”.

Evolution Deadline

A quick note that the early registration deadline for this year’s joint annual meeting of SSE, SSB, and ASN is fast approaching. Early registration ends on Friday, April 19th. I’m really looking forward to this one. It’s in a beautiful location and should nicely avoid that convention center wasteland feel that many meetings unfortunately have these days. I’m sure that many people associated with the Bodega workshop will be there. Who’s going?


Updated BEAST Tutorial

I really enjoy teaching and participating in phylogenetics workshops. Currently, I’m preparing my teaching materials for the Wellcome Trust-EMBL-EBI Advanced Course on Computational Molecular Evolution, where I have the awesome opportunity to teach a section on divergence time estimation with Jeff Thorne. Since I’ve made some minor updates to the BEAST tutorial that I’ve given at recent workshops, I wanted to create a more permanent page to host the document and data files. So, for those interested, you can find the updated tutorial here. I will try to keep this tutorial as up-to-date as possible.

About TreeThinkers

TreeThinkers is a blog devoted to phylogenetic and phylogeny-based inference. We aim to use it as a place to discuss recent research and methods; to ask and answer questions; and serve as a general resource for news and trivia in phylogenetics. We have already had several posts since the 2013 Bodega Workshop ended and we plan to keep things going strong. This post is meant to give a (belated) introduction to the blog and provide some general information about how things work around here.

The group of us that organize the Bodega workshop have been talking about developing a blog associated with the course for a few years. Following the switch from our increasingly clunky wiki to this shiny new site, we’ve decided that the time was ripe… and here we are! Several of us Bodega instructors have signed on as contributors, and we welcome guest posts or regular contributions from the rest of the community. Get in touch with me if you’re interested.

As a blog associated with a course, one of our central focuses will always be on teaching. Brian Moore has a running series of posts called MCMC Corner that discusses various aspects of Bayesian inference. Looking at his drafts of upcoming posts, this looks to be an informative and useful set of articles. We will also be posting on general topics of interest, best practices or common sources of confusion encountered in phylogenetic analysis. Rich Glor’s tutorial explaining the various parameterizations of the Gamma distribution is a great example. Finally, we’ll post about recent findings, news, and announcements that are relevant to phylogenetics. One of our major goals here is to lower the learning curve associated with phylogenetic inference. With that goal in mind, if you have a question, ask it! Leave a comment, tweet @treethinkers, email me, or email one of the contributors and we’ll do our best to answer it in a post or a tutorial.

Updated TreeSetViz

Between the Bodega Bay workshop and my spring semester class on computational phylogenetics at LSU, I’ve been talking a lot about phylogenetic analyses recently. In my opinion, one of the most underutilized approaches for summarizing phylogenetic information is to visualize collections of trees in tree space (or an approximation of tree space projected into 2 or 3 dimensions). These visualizations can help with diagnosing MCMC convergence problems, comparing the phylogenetic signal coming from different genes, picking an appropriate burn-in, and a whole bunch of other stuff. I widely recommend such visualizations to students. There are some drawbacks (e.g., lots of information can be lost during the projection to such low-dimensional spaces), but on balance it seems to me that the advantages outweigh these drawbacks. There really isn’t another way to summarize that amount of phylogenetic information in a single plot. As far as I know, the first paper to describe this approach was by Hillis, Heath, and St. John (2005, Syst. Biol., 54: 471-482), which included the release of a Mesquite plug-in called TreeSetViz for performing such visualizations using multi-dimensional scaling (MDS). I used this tool quite a lot and was disappointed that it became incompatible with newer releases of Mesquite. In fact, I maintained an outdated version of Mesquite just to run TreeSetViz. Well, unbeknownst to me, an updated version of TreeSetViz was released last year that is compatible with the most recent versions of Mesquite (hat tip to Vinson Doyle for pointing this out) and includes a wide variety of new options for comparing and visualizing trees. Strangely, this new release isn’t mentioned on the original TreeSetViz website, but it is announced, along with very simple installation instructions, on the Mesquite website.
More recently, additional tools that are independent of Mesquite (TreeScaper) have been developed that also perform tree set visualization and projection.  Give them a try!
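
For anyone curious what such a projection involves, here is a minimal sketch in Python (with NumPy). It makes several assumptions of its own: trees are toy five-taxon examples written by hand as sets of splits, distances between trees are Robinson-Foulds distances, and the projection is classical (Torgerson) MDS. I am not claiming that TreeSetViz or TreeScaper use this particular distance or MDS variant, and the function names are made up for illustration.

```python
import numpy as np


def rf_distance(splits_a, splits_b):
    """Robinson-Foulds distance: count of splits present in only one of two trees."""
    return len(splits_a ^ splits_b)


def classical_mds(dist, n_dims=2):
    """Classical MDS: coordinates whose Euclidean distances approximate dist."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering       # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]     # keep the largest n_dims
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy unrooted trees on taxa A-E. Each nontrivial split is stored as its
# smaller side, e.g. frozenset("AB") stands for the split AB | CDE.
trees = [
    {frozenset("AB"), frozenset("DE")},
    {frozenset("AB"), frozenset("DE")},   # same topology as the first tree
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AC"), frozenset("DE")},
]

dist = np.array([[rf_distance(a, b) for b in trees] for a in trees])
coords = classical_mds(dist, n_dims=2)
print(coords)
```

Each row of coords is one tree's position in the 2-D plot. In practice the trees would come from a posterior sample or a set of gene trees, and the split sets would be extracted by a phylogenetics library rather than typed in.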

New Website

Five years ago, I put together the first iteration of the Bodega Workshop’s website as a dirt-simple wiki hosted through the (Davis, CA-based) wikispot project. Our initial goals for the website were simple: provide an editable set of course materials for use during the workshop and accessible review materials for course participants afterward. A wiki-based system seemed ideal for this: instructors could easily add new content and (more importantly) the students could fix our errors. To get it launched that first year, Rich Glor, Brian O’Meara and I spent a big chunk of the workshop simply gathering up content and dumping it into the wiki framework. As the years have gone by, many people associated with the course have pitched in to add content, revise outdated parts, and just clean things up. It’s been a great group effort.

About a year after first launching the website, we noticed that it was starting to get a fair bit more traffic than would be expected based on the number of people involved with the course. Since that time, the usership for the site has continued to grow and now averages a couple of hundred visitors a day. That is a small number in the scheme of things, but a much larger number than any of us initially expected. We have also been happy to see that the user community for the website is highly international (it varies through time, but the proportion of visits from outside of the US typically hovers around 50%). It’s good to see that course materials developed in the workshop get such wide use.

Average daily visitors from 2008 to present. Spikes and dips associated with the March workshop and winter holidays are obvious, but what’s going on with those pre-holiday spikes in traffic?

Over the last five years, the content of the website has grown from that first single page to more than 200 and maintenance has become more unwieldy within the confines of the original wikispot framework. Furthermore, as fond as I am of the retro 90s “sea green and periwinkle” look that the website sports, I’ll grudgingly admit that we are far past due for a refresh. To remedy both of these issues, I’ve moved the site over to WordPress. Rich Glor kindly donated the use of our new domain, and I’m handling the hosting on my lab website’s server. A potential downside to this change is that we lose the wiki functionality where anyone can log on and make changes. That said, the group of active editors for the old wiki site has never been particularly huge, so this won’t have a large impact on site maintenance.

We of course still welcome any additional contributors to the site, whether you’re involved with the workshop or not. Please use the comments to note important typos or errors. I’ll be keeping an eye on things and will incorporate minor changes as necessary. Better still, get in touch with me and I’ll make an account for you. This way you can join the ranks of regular editors to the website. We would particularly welcome input from anyone who has an interest in contributing new tutorials to the site or would like to contribute blog-style posts about emerging topics or other goings-on in phylogenetics. The course website gets used by a whole lot more people than attend the course, so it only makes sense that this wider community helps shape the content. In particular, I’d like to encourage former Bodega students to pitch in and contribute to the site; there are around 400 of us out there at this point.

The old wiki site will stay up for a while, but won’t be updated anymore. Please update any bookmarks/RSS feeds to point to this site.

In other news, the 2013 workshop is now only a month away! We should see a flurry of updates from workshop planners, Brian Moore and Peter Wainwright; the workshop coordinator, Gideon Bradburd; and the rest of the instructors soon.

2013 Bodega Applied Phylogenetics Workshop


Phylogenetic methods have revolutionized modern systematics and become indispensable tools in evolution, ecology and comparative biology, playing an increasingly important role in analyses of biological data at levels of organization ranging from molecules to ecological communities. The estimation of phylogenetic trees is now a formalized statistical problem with general agreement on the central issues and questions. A nearly standard set of topics is now taught as part of the curriculum at many colleges and universities. On the other hand, application of phylogenetic methods to novel problems outside systematics is an area of special excitement, innovation, and controversy, and perspectives vary widely.

The course will be held at the Bodega Marine Laboratory on the Northern California coast, which has on-site housing. Our newly increased bandwidth and access to computing clusters allow us to utilize computer-intensive approaches even in a one-week course. The course format will involve equal parts of lecture, discussion, and hands-on software training. One afternoon during the week will be left free for field trips to local natural areas.

Topics Covered

  • Estimating, evaluating and interpreting phylogenetic trees
  • Recent advances in Bayesian and Maximum-likelihood estimation of phylogeny
  • Estimation of species trees, gene-tree/species-tree conflicts
  • Divergence-time estimation from sequence data: relaxed clocks, fossil calibration
  • Analysis of character evolution: maximum likelihood and Bayesian approaches, ancestral-state estimation, character correlation, rates of trait evolution
  • Analysis of morphological form, function of complex character systems
  • Inference of diversification rates: detecting rate shifts, testing key innovation hypotheses
  • Model specification issues: model selection, adequacy and uncertainty
  • Diagnosing MCMC performance

Instructors for the 2013 workshop

  • Jeremy Brown
  • Jonathan Eisen
  • Rich Glor
  • Tracy Heath
  • Mark Holder
  • John Huelsenbeck
  • Luke Mahler
  • Brian Moore
  • Samantha Price
  • Bruce Rannala
  • Bob Thomson
  • Peter Wainwright

Plus special guest lecturers!!


Available housing limits course enrollment to ~30 students. Preference is given to doctoral candidates who are in the early to middle stages of their thesis research, and who have completed sufficient prerequisites (through previous coursework or research experience) to provide some familiarity with phylogenetic methods. Unfortunately, because of limits on class size, postdocs and faculty are discouraged from applying.

Admission and Fees

Students will be admitted based on academic qualifications and appropriateness of research interests. The course fee is $650. This includes room and board at BML for the duration of the course (arriving March 2, leaving March 9) and transportation from Davis to BML.

Application Deadline

Applications are due by November 16, 2012. Please send a completed application form and one letter of recommendation from your major advisor. Applications should be sent via email as PDFs. Students will be notified of acceptance via e-mail by December 1, 2012.


FAQ page with questions.

Send all application materials to:

Gideon Bradburd
Department of Evolution and Ecology
5343 Storer Hall
University of California Davis
Davis, CA

Course Organizers
Brian Moore and Peter Wainwright