CDE is useful for anyone who wants others to be able to execute their Linux programs without first installing or configuring anything. Here are some use cases:
CDE can execute programs outside of their native environments, which lets you:
The above screenshot shows Google Earth running within a CDE package on a 2006 Knoppix machine. Note that it is normally not possible to run Google Earth on that machine, since its system libraries are too old. Using CDE, though, all I had to do was first install Google Earth on my 2008 Ubuntu machine, run it with CDE to create a package, move that package to my 2006 Knoppix machine, and run Google Earth from within the package.
CDE allows your colleagues to easily reproduce and build upon your computational experiments. CDE can also snapshot your experiments when you submit a paper, so that you can easily reproduce your own results and make adjustments to address reviewer critiques.
For example, say that I’m writing scientific scripts on my own machine, but my colleague wants to run (and edit) those scripts on her machine. Normally, she would need to install all the requisite software, libraries, and extensions before she can run my scripts. However, using CDE, I simply run my script once to create a package and then send that package to my colleague.
The above screenshot shows a scientific Python script running from within a CDE package on a 2006 Fedora machine. I originally wrote that script on my 2010 Ubuntu machine, packaged it up using CDE, and transferred it over to my colleague’s 2006 Fedora machine so that she could run it without installing anything.
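The workflow above can be sketched as a pair of command lines. This is only a minimal sketch under assumptions: it assumes the author prefixes the usual command with the `cde` binary, that the result lands in a `cde-package/` directory containing a `cde-root/` mirror of the original filesystem paths, and that the colleague prefixes the same command with `cde-exec` from the mirrored working directory. The script name `analysis.py` and the paths are hypothetical. The functions only build the commands, so the sketch runs without CDE installed.

```python
import os
import shlex


def package_command(argv):
    """Author side: prefix the normal command with `cde` (assumed binary name)."""
    return ["cde"] + list(argv)


def replay_command(argv, cde_exec="cde-exec"):
    """Colleague side: prefix the same command with `cde-exec` (assumed binary name)."""
    return [cde_exec] + list(argv)


def mirrored_cwd(package_dir, original_cwd):
    """Directory inside the package assumed to mirror the author's working directory."""
    return os.path.join(package_dir, "cde-root", original_cwd.lstrip("/"))


if __name__ == "__main__":
    cmd = ["python", "analysis.py"]  # hypothetical script
    print("author runs:    ", shlex.join(package_command(cmd)))
    print("colleague cd to:", mirrored_cwd("cde-package", "/home/me/experiment"))
    print("colleague runs: ", shlex.join(replay_command(cmd)))
```

Keeping the command itself unchanged on both sides is the point: the only difference between the author's run and the colleague's run is the prefix.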
In theory, all research results should be reproducible, since scientific integrity dictates that others should be able to reproduce your exact experiment to validate or dispute your findings. In reality, though, the vast majority of published academic papers in the computational sciences contain algorithms, statistics, and charts that others cannot validate. Studies have shown that many researchers are willing to share their code and data after publishing, but they do not want to make the tedious effort needed to ensure that others can reliably run their software. After all, they are researchers, not professional software vendors.
In the spirit of reproducible research, some researchers now upload their scripts, datasets, and/or executables online, but it is unlikely that others can run their code without first suffering through the dependency hell of installing required programs and libraries. Some even provide compilation or installation directions, but unfortunately that documentation is often not robust. Even with the noblest of intentions, it is still difficult to package up one’s research code so that others can reliably run it.
CDE facilitates reproducible research because it is, to my knowledge, the easiest way to package up all the code, data, and environmental dependencies needed to reproduce and build upon a computational experiment, with no installation or configuration hassles. Simply put, if you can run the experiment on your Linux machine, then your colleagues can run (and build upon) it on theirs.
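Snapshotting an experiment at paper-submission time, as mentioned earlier, could be as simple as archiving the package directory under a descriptive label. The sketch below is one possible way to do that with Python's standard library; it is not part of CDE, and the directory name `cde-package` and the label `paper-submission` are assumptions chosen for illustration.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def snapshot_package(package_dir, label, dest_dir):
    """Archive package_dir as <dest_dir>/<label>.tar.gz and return the archive path."""
    package_dir = Path(package_dir).resolve()
    base = Path(dest_dir).resolve() / label
    archive = shutil.make_archive(
        str(base),
        "gztar",
        root_dir=str(package_dir.parent),  # archive paths are relative to this directory
        base_dir=package_dir.name,         # only the package directory is included
    )
    return Path(archive)


if __name__ == "__main__":
    # Build a stand-in package in a temporary directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "cde-package" / "cde-root"
        root.mkdir(parents=True)
        (root / "results.txt").write_text("placeholder\n")
        out = snapshot_package(root.parent, "paper-submission", tmp)
        with tarfile.open(out) as tar:
            print(out.name, sorted(tar.getnames()))
```

Storing one archive per submission or revision makes it possible to return to the exact package that produced a given set of results.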
CDE allows you to instantly deploy software from your desktop machine to a compute cluster without needing to install all the requisite libraries and other dependencies on the cluster machines. This means that you don’t need to beg the sysadmins to install what you need or ask for root access and risk trashing the cluster machines.
Similarly, CDE allows you to instantly deploy software to a blank Linux cloud VM (e.g., Amazon EC2) without needing to create (and pay to store) a custom VM image.
If you can run a Linux command on your own computer, then you can run it with CDE on a cluster or on the cloud with no required setup!
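As a rough illustration of the deployment step, the sketch below builds the commands that would copy a package archive to a remote machine, unpack it, and run the packaged command there. It rests on several assumptions: the archive unpacks to a `cde-package/` directory that contains a `cde-exec` binary and a `cde-root/` mirror of the original paths, and the host name, remote directory, and script name are hypothetical. Only the standard `scp`, `ssh`, and `tar` tools are used, and nothing is executed.

```python
import os
import shlex


def deploy_commands(archive, host, remote_dir, argv, original_cwd):
    """Build [scp command, ssh command] to ship a package archive and run argv remotely.

    remote_dir is assumed to be an absolute path on the remote machine.
    """
    remote_archive = f"{remote_dir}/{os.path.basename(archive)}"
    # Assumed package layout: cde-package/cde-exec and cde-package/cde-root/<original path>
    cde_exec = f"{remote_dir}/cde-package/cde-exec"
    workdir = "cde-package/cde-root/" + original_cwd.lstrip("/")
    remote_script = " && ".join([
        f"cd {shlex.quote(remote_dir)}",
        f"tar -xzf {shlex.quote(remote_archive)}",
        f"cd {shlex.quote(workdir)}",
        shlex.join([cde_exec] + list(argv)),
    ])
    return [
        ["scp", archive, f"{host}:{remote_archive}"],
        ["ssh", host, remote_script],
    ]


if __name__ == "__main__":
    for cmd in deploy_commands(
        "experiment.tar.gz",            # hypothetical archive of cde-package/
        "user@cluster.example.org",     # hypothetical host
        "/scratch/user",                # hypothetical remote directory
        ["python", "analysis.py"],      # hypothetical packaged command
        "/home/me/experiment",          # working directory on the original machine
    ):
        print(shlex.join(cmd))
```

Neither command requires root access or any software on the remote machine beyond `ssh` and `tar`, which matches the no-setup claim above.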
CDE allows instructors to package up all the dependencies required to compile and execute class programming assignments. This enables students to work on their own Linux machines and to submit assignments in a self-contained, easily runnable format.
CDE packages are somewhat future-proof, so your assignment code is less likely to break when school machines get upgraded in the near future.
For example, the above screenshot shows my computer running a CDE package sent to me by a student from Mexico who is building a project for a virtual reality class using the OpenSceneGraph 3D graphics toolkit. Without needing to install anything beforehand, I’m able to run a draft of his project in full-screen mode on my 64-bit Ubuntu 10.10 virtual machine (hosted on a Mac Pro).
You and your colleagues can easily collaborate on coding projects by checking a CDE package into an online version control repository like GitHub.
Since all the dependencies required to compile and execute your code are located within the CDE package, new collaborators can get started right away by simply checking out the entire package from the repository. They can immediately start working without needing to install anything on their machines (e.g., compilers, linkers, run-time environments).
CDE allows you to simultaneously run multiple versions of the same software on one machine. This is useful for, say, performing experiments comparing the functionality of different versions of the same software. Since different versions often have conflicting library dependencies, it can be very difficult to have multiple versions reside on one machine. However, CDE makes this process seamless and even allows the multiple software versions to access the same on-disk data.
If you install a lightweight Linux distro (e.g., Tiny Core Linux) within a virtual machine (VM), then you can copy your CDE package into that VM and distribute the resulting VM image file. Now your colleagues who use Windows and Mac OS can execute your CDE package from the VM. Also, as long as advances in x86 remain backwards compatible, you can continue running your package for the foreseeable future.