Secret Source Codes Threaten Modern Science
A human O-GlcNAc transferase (molecule) structure created as a model by a computer program.
CREDIT: Lazarus, M.B., Nam, Y., Jiang, J., Sliz, P., Walker, S. | Protein Data Bank
Modern science relies upon researchers sharing their work so that their peers can check and verify success or failure. But most scientists still don't share one crucial piece of information — the source codes of the computer programs driving much of today's scientific progress.
Such secrecy comes at a time when many researchers write their own source codes — human-readable instructions for how computer programs do their work — to run simulations and analyze experimental results. Now, a group of scientists is arguing for new standards that require newly published studies to make their source codes available. Otherwise, they say, the scientific method of peer review and reproducing experiments to verify results is basically broken.
"Far too many pieces of code critical to the reproduction, peer-review and extension of scientific results never see the light of day," said Andrew Morin, a postdoctoral fellow in the structural biology research and computing lab at Harvard University. "As computing becomes an ever larger and more important part of research in every field of science, access to the source code used to generate scientific results is going to become more and more critical."
Missing source codes mean extra headaches for scientists who want to closely follow up on new studies or check for errors. Unavailable source codes can also let more bad science slip through the cracks: unreleased and irreproducible codes played a part in a Duke University case that led to study retractions, scientist resignations and canceled clinical drug trials for lung and breast cancer in 2010.
But of the 20 most-cited science journals in 2010, only three require computer source codes to be made available upon publication. Morin and six colleagues from universities across the U.S. proposed making such policies universal in a policy forum paper that appears in today's (April 12) issue of the journal Science, which is one of the three top journals that require the availability of source codes.
Public funding and policy-setting agencies should throw their weight behind the idea of sharing source codes openly, the researchers said. They also proposed that research institutions and universities use open-source software licenses to allow for source-code sharing while protecting the commercial rights to possible innovation spinoffs from research.
"The encouraging thing is that all of the proposals we have made have already been implemented by various journals, funding agencies and research institutions in one form or another — so there is not a lot of innovation required," Morin told InnovationNewsDaily.
Many scientists have learned to write computer code without formal training, and so they may not know of the open-source software culture of sharing such codes, Morin and his colleagues said. Others may simply be embarrassed by the "ugly" code they write for their own research.
But even one-off computer code scripts written for a single study should undergo examination and peer review, Morin said. He has often ended up sharing, reusing or adapting code he had originally written with the intention of a single use.
"If I knew there was a publication requirement for my code, I probably would have done things like comment it better, kept better track of it, and generally put a bit more thought and effort into my code — which would have certainly helped me and others later on when I inevitably tried to reuse or share it, even if just with others in my own research group," Morin said.