Applications are now being accepted for the 2017 Workshop in Applied Phylogenetics. This year’s workshop will run from March 11 to 18 at the Bodega Bay Marine Lab on the northern California Coast. The application deadline is January 31st. See the 2017 workshop page for more information and instructions to apply.
Applications are now being accepted for the 2015 Workshop in Applied Phylogenetics. This year’s workshop will run from March 7 to 14 at the Bodega Bay Marine Lab on the northern California Coast. The application deadline is January 10th. See the 2015 workshop page for more information and instructions to apply.
“What I cannot create, I do not understand.” Richard Feynman
This series of posts is intended to be a hands-on R-based companion to some of the other things our contributors discuss. We might delve deeper into the behavior of the gamma distribution (or any of the many probability distributions popular in phylogenetics), code up an MCMC algorithm, or work through Felsenstein’s pruning algorithm, to name a few exercises. Playing around with these things in R, even in a simple way, can bring understanding that reading the primary literature or staring at Wikipedia cannot.
I hope this series sheds light on some of the more black-boxy aspects of statistical phylogenetics, and also helps beginning R users develop good programming habits. I invite others to contribute to the series as much as they’d like. I assume that most readers have a rudimentary understanding of R: that is, they can open the R GUI (or their favorite IDE), write a script, and execute it.
As an initial post, I will first provide a very rough sketch of some of the salient features of the R language (with a small dose of personal opinion), introduce some good practices for writing in R, and then make sure readers are up to speed on writing functions, using for loops and apply-like functions, and the supremely important concept of vectorization. A basic understanding of these topics will help you navigate the code that I (and others) write and should form a solid foundation for writing your own scripts.
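To give a flavor of those topics before digging in, here is a minimal sketch that computes the same result three ways: with a for loop, with an apply-like function, and with a vectorized expression. The function name square and the vector values are made up purely for illustration and are not part of any package.

```r
# Hypothetical helper for illustration: squares its input.
square <- function(x) {
  x^2
}

values <- c(1, 2, 3, 4)

# 1. A for loop: preallocate the result, then fill it one element at a time.
loop_result <- numeric(length(values))
for (i in seq_along(values)) {
  loop_result[i] <- square(values[i])
}

# 2. An apply-like function: sapply() calls square() on each element
#    and simplifies the results into a vector.
apply_result <- sapply(values, square)

# 3. Vectorization: arithmetic operators in R work on whole vectors,
#    so no explicit iteration is needed.
vector_result <- values^2

# All three approaches should agree.
print(identical(loop_result, apply_result))
print(identical(apply_result, vector_result))
```

For simple arithmetic like this, the vectorized form is usually the shortest to write and the easiest to read; loops and apply-like functions earn their keep when each step is more involved.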
What’s the deal with R anyways?
R is a flexible, extensible programming language with a relatively gentle learning curve. These days, it seems to be the go-to language for young biologists with little background in computer science (like me, for certain values of young) who are trying to put together their own analyses. R code can be executed line-by-line, which makes writing software much easier for people who are not used to assembling a (buggy) program from scratch.