UC Davis Docker workshop, live-coding

For the second day of the UC Davis Docker workshop, Dr. Titus Brown ran a “teach-me” session on how to make a Docker container: he live-coded while we told him what to write. The idea was to package software he knows well (e.g., khmer) into a Docker container, build the Dockerfile successfully, and upload the image to Docker Hub.

This really highlights the beauty of Docker. One of the most time-consuming aspects of open-source software is installing it and getting it to work: there are many dependencies, and a version update to one package's dependencies can conflict with the dependencies of other software. A Docker container installs its dependencies in one fell swoop and runs the software in isolation from other software with separate dependencies.

Resources: the session Hackpad and Titus' GitHub repo, https://github.com/ctb/2015-docker-building

Start an EC2 instance (m3.xlarge), then install and run Docker. Don't forget to log out and log back in again after adding yourself to the docker group; the new group membership doesn't take effect until you do, so docker commands will fail with permission errors until then.
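On an Ubuntu 14.04 instance (an assumption; the workshop AMI may differ), the setup looks roughly like this:

sudo apt-get update
sudo apt-get install -y docker.io     # install Docker from the Ubuntu repos
sudo usermod -aG docker ubuntu        # let the default user run docker without sudo
# ...log out and back in so the group change takes effect...
docker info                           # sanity check: the daemon answers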

While editing the Dockerfile, keep two windows open on the same instance: (1) one where Docker builds run, and (2) a plain shell on the instance to figure stuff out, i.e., to test install commands interactively before committing them to the Dockerfile.

The tricky part is figuring out which dependencies are required for the install. A useful trick is to declare ENV PACKAGES and ENV VERSION at the top of the Dockerfile and refer to them as ${PACKAGES} and ${VERSION} below. Fill them in, see if the build works, and iterate until you have a completed Dockerfile that runs.
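A minimal sketch of that pattern (the base image and package list are my reconstruction, not necessarily what Titus typed):

FROM ubuntu:14.04

ENV PACKAGES python-dev python-pip gcc g++ zlib1g-dev
ENV VERSION 2.0

# install system dependencies in one layer; iterate on ${PACKAGES} until this succeeds
RUN apt-get update && apt-get install -y ${PACKAGES}

# install khmer itself, pinned to ${VERSION}
RUN pip install khmer==${VERSION}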

Then, write a test script (run_test.sh in the repo) that builds the image and runs it on a set of test data, to see whether the Dockerfile actually works.

git clone https://github.com/ctb/2015-docker-building.git
cd 2015-docker-building/khmer/
bash run_test.sh
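I didn't copy down the contents of run_test.sh, but the shape of such a script is roughly this (the image tag and the khmer script used as a smoke test are assumptions):

#!/bin/bash
set -e

# build the image from the Dockerfile in this directory
docker build -t diblab/khmer:2.0 .

# run a khmer script inside the container as a quick smoke test
docker run diblab/khmer:2.0 normalize-by-median.py --version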

Works, so cool!!

[screenshot: khmer_docker]

Now, figure out how to upload the image to Docker Hub.

http://docs.docker.com/engine/userguide/dockerrepos/

docker tag diblab/khmer:2.0 diblab/khmer:latest
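Note that docker tag only adds a local name; the actual upload happens with docker push (after authenticating with docker login):

docker push diblab/khmer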

Puts the image here:

https://hub.docker.com/r/diblab/khmer/

Now, anyone can run:

docker pull diblab/khmer
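and then use the tools inside without installing anything locally, for example (the khmer script here is just an illustration):

docker run diblab/khmer normalize-by-median.py --help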

Yay!

This took 1.5 hrs.

We also worked on containers for salmon and dammit.

dammit needs database files, which are big-ish (~GB). There are several options for handling them:

  1. Include the database files in the image (BAD: the image becomes huge, and the data is not shared between containers).
  2. Download the files each time a container starts (BAD: slow, and still not shared between containers; with a large dataset, the whole goal is to share it between containers).
  3. *** want to do this *** Create a data volume within Docker: essentially a detachable, configured piece of disk space. Download the files once, then share them between containers forever (downside: the data is accessible only inside Docker's file system). A sketch follows this list.
  4. Use the local disk (a. download once, b. mount it into each container; BAD because it depends on the local file system, so it isn't self-contained and sometimes doesn't work).
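With 2015-era Docker, the usual way to do option 3 was a data-volume container shared via --volumes-from. A minimal sketch, where the image name diblab/dammit and the database path are assumptions, and the dammit subcommands are from its documentation (exact flags may vary by version):

# one-time: create a container whose only job is to own the volume
docker create -v /dammit-db --name dammit-data ubuntu:14.04 /bin/true

# first run: download the databases into the shared volume
docker run --volumes-from dammit-data diblab/dammit \
    dammit databases --install --database-dir /dammit-db

# later runs reuse the same volume, with no re-download
docker run --volumes-from dammit-data diblab/dammit \
    dammit annotate transcripts.fa --database-dir /dammit-db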