My PhD explained using only the 1000 most commonly used words

A dreaded moment for any PhD student (in my field at least) is when a non-specialist (often family or a friend) asks something like… “so what is it you actually do?” Maybe I’ll develop a stock response over the next year or two but for now there’s just lots of umming and ahhing until the subject is changed.

I suspect a lot of PhD students feel the same way, hence the uptake of the #UpGoerFive hashtag where, in deference to a great xkcd, people have taken on the challenge of explaining complex topics using just the 1000 most commonly used words, through this great tool. Cue a tumblr and the usual twitter back-patting, but while it’ll no doubt be forgotten in a week or so, I really like this idea.

Here’s my brief attempt:

I use computers to look at things on top of the stuff that tells a cell how it is made. Maybe the on top stuff tells the bottom stuff what to do, or maybe it’s the other way round. We don’t know that much about these things yet but we’re trying our best. I’m also seeing how the under stuff looks in real life – we sometimes like to think it’s a straight line but it’s really much prettier than that!

Pretty tricky when words like “DNA” and “gene” are off limits, let alone anything to do with chromatin and histone modifications, but a great idea and maybe it’ll help me put together some kind of layman’s explanation next time I’m asked to explain my work in simple terms.

My New Year’s Resolution(s)

New Year’s resolutions: meaningless, silly and forgotten by February. With that in mind, and in no particular order, here are mine for 2013:

1) Begin learning a functional programming language (Clojure, Haskell, ML, OCaml or other) – I’m really interested in functional programming for several reasons, some more sensible than others. Nowadays I spend a lot of time using R, whose functional aspects are powerful and intuitive for mathematical programming, so a deeper understanding of the FP paradigm will likely improve my grasp of R. More significantly, I expect FP to matter more as parallelisation becomes ever more vital in ‘big data’ computational biology.


Clojure anyone?

Also, as a purely academic exercise, I think lambda calculus frames an aesthetically pleasing syntax and fosters interesting programming approaches. Lastly (and perhaps least importantly) I think it could be a quirky and interesting addition to my CV, as well as making me a more “well-rounded” programmer — for all that’s worth.

2) Develop my informal science writing – While regular blogging is on the backburner for now, I still think it’s important to practise writing about science for a general audience. One way of doing this is through competitions, such as that recently run by Europe PubMed Central, which I always seem to bookmark but never get around to entering. So note to self: follow up with these this year.

3) Work on web development – I’ve never made a true web app or used JavaScript or PHP in anger, but I’m increasingly aware that this is something I’ll need to get to grips with sooner or later. I’ve got a testing account for the new RStudio glimmer web server, which serves Shiny apps, so there’s an easy way to get started.

In line with this resolution, I also plan to tie down some actual real estate: a nice domain name and some hosting, which would presumably encourage both my web development and blogging. I’m particularly interested in the new .bio generic top-level domain, which is due for release in 2013; presumably it’s designed for biographies but could also work well for a biologist such as myself 😉

4) There’s more to life than a PhD — keep this in mind.

Edinburgh castle – awesome picturesque scene a lot closer to the city centre than you might think.[1]

The BBC has impressed me recently with a couple of standout science-related programs that haven’t shied from some traditionally dry subject material.

Two weeks ago there was The Hidden Life of the Cell — a grandiose, narrative-driven tale of the epic struggle between the human immune system and viral invaders. It showcased some really impressive visualisations of molecular biology, as well as some interesting insight from scientists like Nick Lane* of UCL and narration by none other than Doctor Who himself!

More recently, Dara Ó Briain kicked off his new series Science Club. Dara is no stranger to hosting popular science shows, having helped out Brian Cox with Stargazing Live (didn’t see) and hosting the maths-based series The School of Hard Sums (didn’t enjoy). Viewers of the latter program will recognise a similar format with Science Club, but this time the Irishman’s charm is lacquered upon more general topics, starting with reproduction and inheritance. It was nice to spot some different (well, lesser-known) faces and I appreciated the brave foray into epigenetics, which I expect many similar programs would avoid at 9pm on a Tuesday; nevertheless, it seemed to go down well with some viewers.

Both of these shows are nice departures from those that typically fill the “Science & Nature” category of iPlayer (namely Countryfile and AutumnWatch) and I’d like to think they reflect a growing interest among the populace in learning real science. While I do have criticisms of both the shows I mention here, kudos to the BBC for putting out these programs (especially at primetime) and I look forward to more quality science programming.

* Good scientist, gaudy website!

Oxbridgewich and shaky tables

The Oxbridge application deadline for 2013 entry is tomorrow (!) and aspirational students can soon begin the long wait to find out if they’ll be attending the UK’s most prestigious institutions. But according to UCAS application data from 2011, Oxford and Cambridge combined received fewer applicants than the University of Greenwich, and this is despite a difference of around 100 places in one of the UK league tables. Greenwich even got more applicants per place, approximately 7:1 compared with Cambridge’s 5:1 and Oxford’s 6:1. So, overall, the UoG must be more popular and more selective than these world-renowned institutions. Should we now be referring to Oxbridgewich?

Well, probably not. I was cheating by combining Cambridge and Oxford, as applicants without a degree can only apply to one or the other, and due to Greenwich’s relatively low entry standards, it was likely used as a safety application by some who would go on to accept a higher-grade offer (I don’t mean to single out UoG, by the way). My point is that if seemingly obvious metrics like applications per place aren’t really indicators of an institution’s standards, what can we use instead? Well, there are several well-known league tables but each uses a different methodology; QS, for example, relies heavily on surveying academics whereas the Guardian’s effort makes more use of the National Student Survey. Unsurprisingly they give varying results. The Telegraph’s Andrew Marszal wrote on this subject last week, saying:

“Although nominally answering the same question, they don’t share a methodology, a data set or indeed a winner […] in fact the wildly differing outcomes of these tables make them more, not less, useful.”

University rankings: which world university rankings should we trust? (Oct 4 2012)

He justifies the latter argument by referring to expected strengths or weaknesses of the different table methods, implying that if you’re interested in x, then you want table y. I wasn’t completely convinced by this, and browsing the existing tables shows remarkable year-on-year fluctuations. The above distributions show the changes in rank from 2012 to 2013 in over a hundred UK universities. The largest of these jumps was a rise of 38 places (from 82→44 for Brunel University in the Guardian’s table), in a single year! Other big leaps include a fall of 29 places for Leeds Trinity University College, and a 30-place rise for Birmingham City (both from the Guardian’s table). These aren’t small universities either: BCU took on over 5,000 new undergraduates in 2011. Further, there was no significant change in the analytics that produced the rankings (for the Guardian: “The methodology employed in the tables has generally remained very constant since 2008” [doc]). So how could academic standards, quality of research, job prospects or any other considered metric vary so widely in 12 months at these universities? Are the student surveys that fickle?
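For anyone wanting to repeat this kind of comparison, here is a minimal sketch of how the year-on-year changes could be computed once two tables have been matched up by name. Only Brunel’s 82→44 move comes from the post; the other institutions and numbers are invented placeholders, and `rank_changes` is a hypothetical helper rather than the code actually used.

```python
from statistics import median

# Toy ranks for two consecutive years. Only Brunel's 82 -> 44 move is taken
# from the post; "Uni A", "Uni B" and "Uni C" are made-up placeholders.
ranks_2012 = {"Brunel": 82, "Uni A": 5, "Uni B": 40, "Uni C": 97}
ranks_2013 = {"Brunel": 44, "Uni A": 6, "Uni B": 52, "Uni C": 90}


def rank_changes(before, after):
    """Places gained per institution (positive means it moved up the table)."""
    shared = before.keys() & after.keys()  # ignore names missing from a year
    return {name: before[name] - after[name] for name in shared}


changes = rank_changes(ranks_2012, ranks_2013)

# The institution with the largest move in either direction.
biggest_mover = max(changes, key=lambda name: abs(changes[name]))

# A one-number summary of how much the table shuffled overall.
median_abs_change = median(abs(delta) for delta in changes.values())
```

Taking the difference as "before minus after" keeps a rise up the table positive, which matches how the jumps are described above.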

In a related vein, I wanted to look at the correlation of university rankings across different tables but discovered it has already been done in some detail by Sawyer, K. et al. (2012) in “Measuring the unmeasurable” (also, reconciling the inconsistent naming amongst tables is probably too big an ask for my regex abilities). The authors found that while high-ranking institutions were well-correlated, those lower down the tables were not. They go on to analogise with financial markets and make somewhat fluffy generalisations about the validity of inference… but nevertheless the correlation analysis seems valid. To see if this differential treatment of lower-ranked institutions held in the 2012-13 change data, I did a simple linear regression analysis. As you can see, while the regression line itself looks an unconvincing fit, it had a significantly non-zero coefficient (0.064 ± 0.012, p = 1.18 × 10⁻⁶). The trend explains a not-uninteresting 19% of the variance, so it does generally seem more unstable down there, or at least it is in this snapshot of the THE table. As evidence in favour of the usefulness of tables, a principal components analysis, again using world rankings [pdf], concluded that the variable with the highest explanatory power was indeed academic performance (R² = 0.48), though this study didn’t stratify high and low-ranking educational bodies. In light of the above result, it seems likely that a subset of lower-ranked universities may have a different principal component.
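As a rough illustration of the regression step, the sketch below fits an ordinary least-squares line and reports the slope, intercept and R². It assumes the model was the size of each rank change against starting rank, which the post doesn’t spell out, and the data are invented toy numbers rather than the THE table. The standard error and p-value quoted above would need a t-distribution and are left out here.

```python
def simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return slope, intercept, r_squared


# Invented toy data: starting rank and absolute change in rank a year later.
start_rank = [1, 20, 40, 60, 80, 100]
abs_change = [0, 2, 1, 6, 3, 9]

slope, intercept, r_squared = simple_linear_regression(start_rank, abs_change)
```

A positive slope here would mean that institutions further down the table tend to move more places from one year to the next, which is the pattern described above.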

Overall I’m not making an argument against the use of these tables; I know I relied on them when picking out my UCAS choices. But it seems likely that while Oxbridge may have a gentlemanly back-and-forth over the top spots for years to come, the University of Greenwich and its ilk will probably be flying all over the place, and proclamations of ‘this year’s most improved rank’ (e.g. [1], [2], [3]) should be viewed with particular scepticism.

Hello, world

Having just edged up to (an admittedly meagre) 100 followers on twitter, and as I’ve just started my PhD, now seems a good time to finally start a blog. The hope is that by writing about my research and other (un)related topics, I’ll have a lasting record of what I’ve been up to which might even interest some wandering internet folk. At worst it should help improve my informal writing.

Expected topics include: recent advances in computational biology, particularly regarding higher-order chromatin and gene regulation; programming bits and pieces, focusing on R and Python; life as a PhD student; maybe even whimsical tales of life in Edinburgh—who knows!

