Thursday, May 25th, 13:30-15:00, Jayhawk room
The problem of reproducibility is multifaceted: there are social and cultural obstacles as well as technical inconsistencies that make replicating and reproducing research extremely difficult. In this paper, we introduce ReproZip, an open source tool that helps overcome the technical difficulties involved in preserving and replicating research, applications, databases, software, and more.
ReproZip works by packing research at the level of its computational environment. It traces the operating system calls made while the work runs to identify the data files, software libraries, and environment variables needed to reproduce it later on. After ReproZip finishes tracing these dependencies, everything is packed into a single compressed file (.rpz) that is significantly smaller than a virtual machine. That file can be unpacked with ReproUnzip, the companion program that anybody can use to automatically reproduce the research, application, etc., even on a different operating system. Both packing and unpacking are extremely low-barrier, making it easy for users to introduce reproducibility into their existing workflows.
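The trace, pack, and unpack steps described above can be sketched as the sequence of command lines a user would typically run. This is a minimal illustration rather than a definitive recipe: the helper functions `packing_commands` and `unpacking_commands`, the script name `analysis.py`, the file name `experiment.rpz`, and the target directory `experiment` are all hypothetical, and the choice of the `docker` unpacker is an assumption (ReproUnzip also provides others, such as `vagrant`). The sketch only builds and prints the commands; it does not execute them.

```python
import shlex


def packing_commands(experiment_cmd, pack_file="experiment.rpz"):
    """Commands run on the original machine: trace the run, then pack it."""
    return [
        # Run the experiment under ReproZip so its dependencies are recorded.
        ["reprozip", "trace", *experiment_cmd],
        # Bundle everything the trace identified into a single .rpz file.
        ["reprozip", "pack", pack_file],
    ]


def unpacking_commands(pack_file="experiment.rpz", unpacker="docker",
                       target="experiment"):
    """Commands run on the reproducing machine: set up, then re-run."""
    return [
        # Unpack the .rpz into a target directory using the chosen unpacker.
        ["reprounzip", unpacker, "setup", pack_file, target],
        # Re-execute the packed experiment inside that environment.
        ["reprounzip", unpacker, "run", target],
    ]


if __name__ == "__main__":
    # "python analysis.py" stands in for whatever command runs the research.
    commands = packing_commands(["python", "analysis.py"]) + unpacking_commands()
    for cmd in commands:
        print(shlex.join(cmd))
```

Keeping the packing and unpacking commands in separate functions mirrors the split between the machine where the research was originally run and the machine where it is later reproduced.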
This paper will examine the current use cases of ReproZip, ranging from digital humanities to high-performance computing. We'll also explore potential library use cases for ReproZip, particularly in institutional repositories, emulation, and other digital library services.