thirdwave

About to Pop

This is absolutely right as far as it goes, and Joe is to be applauded for this post, but the problems with scientific publication run even deeper.

Summary

- Higher ed is about to go bankrupt
- Most academic research outside of a few top universities is never cited, non-reproducible, and should never have been funded
- Citation analysis shows only a tiny group of scientists actually moves any given field forward
- Science will increasingly go back to the future, becoming gentleman science
- The angel investor is the new professor, and the entrepreneur is the new grad student
- One model: make a few million in the startup sector and then do science unencumbered for the rest of your life
- Another model: find a wealthy patron willing to fund you via vehicles like Thiel Fellowships (just like Soros or Hughes Fellowships)
- A third model: bring down the costs of doing research with things like openpcr.com and hack on stuff in your garage after your day job
- The new model for pure research is github, citizen science, open source, and reproducible research, not universities

Bugs and patches, not retractions

Take a look at the issue list for a popular open source data visualization tool named d3 on github. It is accepted that even a shipping piece of production code will have many serious bugs. There is a process of constant improvement even for things as mission critical as the Linux kernel. While this should go without saying for the computer scientists in the audience, Linux sure does require in-depth knowledge of algorithms, and it is by no means just bookkeeping or theoretically uninteresting code.

Contrast this situation with academia. While many pay lip service to the concept of science as a continually correcting enterprise, in practice a correction (let alone a retraction) can be career damaging or career ending, especially for those pre-tenure. With the death penalty for failure, academics have every incentive to stonewall requests for materials, data, or source code. This is not limited to the biological sciences (e.g. “Even if WMO agrees, I will still not pass on the data. We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it.”).

The solution: adopt the culture of open source, where source is assumed to be fragile and bug reports are met with patches. Reject the culture of academia, where “peer reviewed” papers are assumed to be correct, while corrections and retractions carry a career penalty.

Academia is funded by a tuition bubble that is about to pop

Academics rightly bemoan Elsevier’s extortionary journal costs. What they don’t realize, however, is that Elsevier is a billion-dollar remora feeding off the trillion-dollar academic bubble. Publishing will be reinvented as soon as academia goes bust. With the advent of Coursera, Udacity, and Khan Academy, which offer elite higher ed content online (plus certification) for free, that will happen within the next ten years, probably within the next five.

There is just no way anyone will pay $250,000 and four years of their life in a down economy when they can get the same education, and a job right now, by doing these online certifications in machine learning. The decline will be as rapid and irreversible as the fall of print media. The people reading GNZ tend to be ahead of the curve and will get more involved with non-traditional funding sources early. But others in academe will likely be in denial till the ship hits the iceberg.

Solution: replace the unsustainable cross-subsidy which uses undergraduate tuition to fund research with new business models. Put basic certification online via vehicles like Coursera, and encourage the smartest in society via institutions like YCombinator to actually ship products that work, rather than non-reproducible papers.

Most papers are not reproducible

Most academic papers are flatly non-reproducible. See here and here. Many Nature and Science papers are highly conditional studies on cell lines that don’t replicate outside of the publishing laboratory (to be charitable) or do not replicate at all. When you try to commercialize a study, that’s when you find out how much stuff doesn’t really work:

“This problem is accepted to the point that the most successful venture capitalists have learned to reproduce results by independent observers before they commit to early-stage funding. Outside of orthopedics, the pharmaceutical company Bayer recently reported an example of this problem. In September 2011, Bayer published an evaluation of 67 published studies in which they failed to duplicate two-thirds of the results with their in-house experiments.”

As an aside, this observation is interesting as it neatly inverts the standard academic presumption that good/pure science is done within academia and bad/conflicted science is done in industry. In point of fact, to ship a product the science must be absolutely unimpeachable (or else it is obvious that the product is nonfunctional), whereas to ship a paper one must only pass the filters of 2-3 people and avoid the limelight. Just look at most of the papers in JBC or any mass-production journal for examples of the latter approach.

Solution: reject any in-depth study that cannot be regenerated via make or the equivalent from source code, templates, and data hosted on a public server. Encourage the automation and thus the reproducibility of basic laboratory processes, and where automation is infeasible, provide video documentation of experimental protocols at the standard of jove.com using inexpensive video-editing software.
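To make "regenerated via make or the equivalent" a little more concrete, here is a minimal sketch in Python of what such a check could look like. Everything in it is an assumption for illustration: run_analysis stands in for whatever a real study computes, and is_reproducible is a hypothetical helper, not part of any existing tool. The idea is only that a paper publishes a digest of its results alongside its data and code, and anyone can rerun the pipeline and compare.

```python
import hashlib
import json
import random
import statistics


def run_analysis(raw_values, seed=0):
    """Hypothetical analysis step: mean plus a bootstrap estimate of its spread."""
    rng = random.Random(seed)  # fixed seed so the resampling is deterministic
    means = []
    for _ in range(200):
        sample = [rng.choice(raw_values) for _ in raw_values]
        means.append(statistics.fmean(sample))
    return {
        "n": len(raw_values),
        "mean": round(statistics.fmean(raw_values), 6),
        "bootstrap_sd": round(statistics.stdev(means), 6),
    }


def digest(result):
    """Hash a canonical JSON encoding of the result dictionary."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_reproducible(raw_values, published_digest, seed=0):
    """Rerun the analysis from raw data and compare against the published digest."""
    return digest(run_analysis(raw_values, seed)) == published_digest


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]            # stands in for data on a public server
    published = digest(run_analysis(data))  # what the authors would publish
    print("reproducible:", is_reproducible(data, published))
```

The fixed seed and the canonical JSON encoding are the design choices that matter here: without them, two honest reruns could disagree for reasons that have nothing to do with the science.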

Conclusion

Academia is rapidly headed for a reckoning. Scientific publication will be “solved” as a consequence, with the future looking like reproducible research hosted on github and cooperatives like biocurious.org. Most work will be done open-source, by citizen scientists, with larger projects funded by independently wealthy technologists and/or investors who see the possibility of turning a pure research finding into a scalable product.

September 05, 2012