Reproducible research in statistics/data science

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.

Buckheit and Donoho (1995)

Non-reproducible research

Microarray studies

Nature Genetics (2015 Impact Factor: 31.616). 20 articles about microarray profiling published in Nature Genetics between Jan 2005 and Dec 2006.

Bible code

  • Witztum, Rips, and Rosenberg (1994)

  • McKay et al. (1999)

Why reproducible research

How to be reproducible in data science?

When we publish articles containing figures which were generated by computer, we also publish the complete software environment which generates the figures.

Buckheit and Donoho (1995)
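
One small step toward publishing "the complete software environment" is to save a machine-readable record of it next to every computed result. The following is a minimal sketch in Python using only the standard library, not a full solution. The helper names capture_environment and simulate_mean, the seed value, and the package list are hypothetical and chosen only for illustration.

```python
import json
import platform
import random
from importlib import metadata


def capture_environment(packages):
    """Describe the interpreter, OS, and versions of the named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


def simulate_mean(seed, n=1000):
    """Toy analysis: mean of n standard normal draws from a seeded generator."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return sum(draws) / n


SEED = 20240101  # hypothetical seed; any fixed, recorded value works

# Store the seed, the environment, and the result together.
record = {
    "seed": SEED,
    "environment": capture_environment(["pip"]),
    "result": simulate_mean(SEED),
}
print(json.dumps(record, indent=2))
```

Rerunning the script with the same seed and the same Python version should give the same result, and the saved record tells a reader which interpreter and package versions produced it. A record like this complements, but does not replace, sharing the code and data themselves.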

Tools for reproducible research

References

Baggerly, Keith A., and Kevin R. Coombes. 2009. “Deriving Chemosensitivity from Cell Lines: Forensic Bioinformatics and Reproducible Research in High-Throughput Biology.” Ann. Appl. Stat. 3 (4): 1309–34. https://doi.org/10.1214/09-AOAS291.
Buckheit, Jonathan B., and David L. Donoho. 1995. “WaveLab and Reproducible Research.” In Wavelets and Statistics, edited by Anestis Antoniadis and Georges Oppenheim, 103:55–81. Lecture Notes in Statistics. Springer New York. https://doi.org/10.1007/978-1-4612-2544-7_5.
McKay, Brendan, Dror Bar-Natan, Maya Bar-Hillel, and Gil Kalai. 1999. “Solving the Bible Code Puzzle.” Statist. Sci. 14 (2): 150–73. https://doi.org/10.1214/ss/1009212243.
Potti, Anil, Holly K. Dressman, Andrea Bild, and Richard F. Riedel. 2006. “Genomic Signatures to Guide the Use of Chemotherapeutics.” Nature Medicine 12 (11): 1294–1300. https://doi.org/10.1038/nm1491.
Witztum, Doron, Eliyahu Rips, and Yoav Rosenberg. 1994. “Equidistant Letter Sequences in the Book of Genesis.” Statist. Sci. 9 (3): 429–38. https://doi.org/10.1214/ss/1177010393.