Grant deadlines for DEB are coming up and this has me thinking about the best way to go about actually doing the computation that I’m proposing to do. Since my lab is still in its early “get up and running” phase, I’m also in a position to invest in new resources and set up some standard operating procedures for the future. This is an issue that all phylogeneticists struggle with at one point or another, so I thought it would be useful to poll the community. What do you use for big analysis jobs in your lab?
Like many people in my generation of phylogenetics, I started out in the days of taping warning notes to monitors (Figure 1) (i.e., cobble together whatever desktop machines one can get hands on…and then jealously guard them from
the enemy lab-mates for the months it takes your analysis to finish). Times have obviously changed since then and we aren’t, as a field, nearly as computationally limited as we were 10 or even 5 years ago. Many free or easily accessible computing options are now available: CIPRES, iPLANT discovery environment, Amazon’s EC2, XSEDE (formerly TeraGrid), and any number of university/college/departmental clusters…and that’s not to mention the homebuilt clusters and trusty (dusty) desktops sitting in the corners of our labs. The workhorse software of our field is also faster than it used to be, allowing us to get more done in the same amount of time, irrespective of the hardware being used.
For the last few years, I’ve enjoyed the benefit of having my own small but speedy cluster (built cheaply using commodity parts), as well as a TeraGrid allocation. These have worked well for my needs: they’ve allowed me to get analyses finished in a timely fashion; run many tests and toy analyses without feeling limited; lend time to coworkers in a pinch…and aside from all that, the cluster allows for lots of satisfying tinkering during off hours. All that said, the TeraGrid allocation is now finished, the large cluster on Maui that I’d been hoping to get an allocation on is no longer available, and I’m already seeing that the tropical climate here on Oahu is hell on hardware (e.g. my monitor fills up with condensation anytime I leave it off for more than a day or two). I’m thinking about eventually moving completely into EC2 and XSEDE and not having to worry about hardware at all.
I’d appreciate learning about the experiences that others have had. What is your preferred solution for phylogenetic computing?