Reproducibility with AWS – NGS2015

Leigh Sheneman, PhD student in CS at MSU, works on evolution and learning with digital organisms, with a view to applying this to real-world organisms in the future!

http://angus.readthedocs.org/en/2015/week3/AWS-tips.html

[Photo: 2015-08-27_10-49-40]

Start an EC2 instance; an m3.medium is fine. Log in, update, and install packages. We need the packages for the software we will run with the eel-pond protocols: https://khmer-protocols.readthedocs.org/en/v0.8.4/

There was some discussion about one interesting package name:

apt-get -y install libncurses5-dev

Titus: text window graphics from 70s games, likely needed for samtools tview? Everyone: Ahhhhh (understanding)

We’re going to make a public AMI, for times when we want to share and distribute an image to colleagues for collaboration.

Go to the EC2 console.

[Screenshots: create_image, create, AMI]

An AMI is an efficient way to capture the OS and installed software. You can terminate the instance and keep the AMI, and you are only charged for storage (about $0.10 per GB per month), a fraction of the cost of keeping the instance running. A snapshot, by contrast, captures a data volume, whereas an image captures the OS filesystem.
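To make the storage arithmetic concrete, here is a small sketch that multiplies a size in GB by the roughly $0.10 per GB per month figure quoted above. The rate is taken from these notes and is not a current AWS price; the function name and the example sizes are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumed rate from the class notes (dollars per GB per month), not an official price.
RATE_PER_GB_MONTH="0.10"

# monthly_storage_cost SIZE_GB
# Prints the estimated monthly storage cost in dollars, to two decimals.
monthly_storage_cost() {
    local size_gb="$1"
    awk -v gb="$size_gb" -v rate="$RATE_PER_GB_MONTH" \
        'BEGIN { printf "%.2f\n", gb * rate }'
}

monthly_storage_cost 8     # a small root image, for example
monthly_storage_cost 100   # a 100 GB data snapshot, for example
```

At this assumed rate, keeping a 100 GB snapshot costs about $10 per month, compared with paying by the hour for a running instance.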

Change the permissions so that you are not the only one who can use the image, since we want to make it public.

[Screenshots: public_images, lookatpublic]

It takes some time for the image to become public, so wait a bit before sharing the AMI ID.

Important: this image was created in the ‘N. Virginia’ region and is only visible in that region. There are other ways to share between regions.
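In class we did this through the console, but the same two steps (making the image public, and copying it to another region) can be sketched with the AWS command line tools. This assumes the `aws` CLI is installed and configured, which these notes do not cover. The AMI ID, destination region, and image name are placeholders, and the `run` helper only prints each command unless you set `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_ami_public AMI_ID
# Gives everyone launch permission on the image.
make_ami_public() {
    run aws ec2 modify-image-attribute --image-id "$1" \
        --launch-permission "Add=[{Group=all}]"
}

# copy_ami AMI_ID SOURCE_REGION DEST_REGION NAME
# Copies the image into another region, where it gets a new AMI ID.
copy_ami() {
    run aws ec2 copy-image --source-image-id "$1" --source-region "$2" \
        --region "$3" --name "$4"
}

# Placeholder values: N. Virginia is us-east-1; us-west-2 is just an example target.
make_ami_public ami-xxxxxxxx
copy_ami ami-xxxxxxxx us-east-1 us-west-2 ngs2015-copy
```

As far as I know, the copy starts out private in the destination region, so the launch permission would need to be added again on the new AMI ID.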

We had a class discussion about the costs of hosting and sharing images associated with publications. Who pays? If reviewers of papers need the images, how does that work? It is easy to share the data and software associated with the analyses for a study. We can provide all the instructions, data, and software we want. But no one has figured out a realistic and sustainable management framework for computing resources for scientific studies. Reproducibility is a concern, but there are no incentives for scientists to provide data and transparent analyses via methods like AWS AMIs to demonstrate reproducibility. If this were required for publication, there would likely be more funding resources available and everyone would do this instead of a select few. Right now, people can provide things like this, but who is really going out and checking other people’s data, code, and software, besides reviewers and a few colleagues?

Create Volume

Make sure the availability zone (e.g. us-east-1e) matches that of the instance. If not, use the pull-down menu to select the right one:

[Screenshots: public_ami, volume, volume_avail]
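The availability zone check can also be done from the command line. The sketch below looks up the zone of an instance, creates an empty volume in a given zone, and attaches it. It assumes a configured `aws` CLI; the instance and volume IDs are placeholders, the `gp2` volume type is my choice rather than something from the class, and commands are only printed unless `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# instance_zone INSTANCE_ID: prints the availability zone of the instance.
instance_zone() {
    run aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
        --output text
}

# create_volume SIZE_GB ZONE: creates an empty volume in that zone.
create_volume() {
    run aws ec2 create-volume --size "$1" --availability-zone "$2" --volume-type gp2
}

# attach_volume VOLUME_ID INSTANCE_ID: attaches the volume as /dev/sdf,
# which shows up as /dev/xvdf on the instance used in class.
attach_volume() {
    run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device /dev/sdf
}

instance_zone i-xxxxxxxx
create_volume 100 us-east-1e
attach_volume vol-xxxxxxxx i-xxxxxxxx
```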

Then attach a new 100 GB volume to the instance. Log out of ssh and log back in. Run the commands to format and mount the disk:

[Screenshot: mount]

In the list above, /dev/xvda1 is the system disk, and /dev/xvdf is the volume we attached.
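Since the commands themselves only appear in the screenshot, here is a sketch of what formatting and mounting a brand new volume typically looks like on the instance. The ext4 filesystem and the `/mnt/data` mount point are assumptions, not necessarily what was used in class. Formatting erases the device, so the helper only prints the commands unless `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# format_and_mount DEVICE MOUNT_POINT
# For a NEW, empty volume only: mkfs destroys whatever is on the device.
format_and_mount() {
    local device="$1" mount_point="$2"
    run sudo mkfs -t ext4 "$device"          # create the filesystem (assumed ext4)
    run sudo mkdir -p "$mount_point"         # make sure the mount point exists
    run sudo mount "$device" "$mount_point"  # mount the new filesystem
    run df -h "$mount_point"                 # confirm size and free space
}

format_and_mount /dev/xvdf /mnt/data
```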

See the Amazon Elastic Compute Cloud (EC2) documentation on AMIs, volumes, snapshots, and instances.

[Photo: 2015-08-27_10-49-52]

If you are creating an image for someone else, you would do the above, where we took an image of the OS and a snapshot of a volume.
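For reference, the two captures described here (an image of the OS and a snapshot of a data volume) might look like this from the command line. This is a sketch assuming a configured `aws` CLI; the IDs, name, and description are placeholders, and commands are only printed unless `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# image_instance INSTANCE_ID NAME: captures the OS and software as an AMI.
image_instance() {
    run aws ec2 create-image --instance-id "$1" --name "$2"
}

# snapshot_volume VOLUME_ID DESCRIPTION: captures a data volume as a snapshot.
snapshot_volume() {
    run aws ec2 create-snapshot --volume-id "$1" --description "$2"
}

image_instance i-xxxxxxxx ngs2015-image
snapshot_volume vol-xxxxxxxx ngs2015-data
```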

[Photo: 2015-08-27_10-58-12]

Now (power pose), we will load someone else’s snapshot (it’s really our snapshot, but same idea). First, we have to launch an instance from the AMI; an m3.medium is fine:

[Screenshot: createAMI]

Then, create a volume from the snapshot to add to the running instance.

[Screenshot: create_snapshot_thenvolume]

The volume is available to attach:

[Screenshot: volumes]

Under “Actions”, attach volume and select the running instance (should pop up once you start typing).
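The console steps for restoring a snapshot can be sketched on the command line as well: create a volume from the snapshot in the same availability zone as the instance, then attach it. This assumes a configured `aws` CLI; all IDs are placeholders, and commands are only printed unless `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# volume_from_snapshot SNAPSHOT_ID ZONE
# The zone must match the zone of the instance the volume will be attached to.
volume_from_snapshot() {
    run aws ec2 create-volume --snapshot-id "$1" --availability-zone "$2"
}

# attach_volume VOLUME_ID INSTANCE_ID: attach as /dev/sdf (seen as /dev/xvdf in class).
attach_volume() {
    run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device /dev/sdf
}

volume_from_snapshot snap-xxxxxxxx us-east-1e
attach_volume vol-xxxxxxxx i-xxxxxxxx
```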

Log in, then mount the volume (do not format it, because it already contains the data!), and it is there!!

[Screenshot: mount_xvdf]
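One way to avoid formatting a volume that already holds data is to check it first. `sudo file -s DEVICE` reports plain `data` when there is no filesystem, and a filesystem description otherwise. The sketch below wraps that check; the function names and the `/mnt/data` mount point are hypothetical, and `mount_existing` is only defined here, not run.

```shell
#!/usr/bin/env bash

# is_blank_device FILE_OUTPUT
# Succeeds when the output of `file -s` is just "DEVICE: data" (no filesystem).
is_blank_device() {
    case "$1" in
        *": data") return 0 ;;
        *)         return 1 ;;
    esac
}

# mount_existing DEVICE MOUNT_POINT
# Mounts a volume that should already contain a filesystem; never formats it.
mount_existing() {
    local device="$1" mount_point="$2" probe
    probe="$(sudo file -s "$device")"
    if is_blank_device "$probe"; then
        echo "No filesystem found on $device; not mounting." >&2
        return 1
    fi
    sudo mkdir -p "$mount_point"
    sudo mount "$device" "$mount_point"
}

# On the instance, usage would be:
#   mount_existing /dev/xvdf /mnt/data
```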

Creating a bucket to share files, S3

If you want to host files for others to download, the cost is $0.10/GB per month.

[Screenshot: S3]

Then, you can get the link for people to download:

curl -O https://s3.amazonaws.com/lisangs2015/bigwig.py
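The upload itself was done through the console, but a command line version might look like the sketch below. It assumes a configured `aws` CLI and a bucket you own. The bucket and file names are the ones from the link above, the `--acl public-read` setting may be refused if the bucket blocks public ACLs, and the upload is only printed unless `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# upload_public LOCAL_FILE BUCKET KEY: uploads a file and makes it world-readable.
upload_public() {
    run aws s3 cp "$1" "s3://$2/$3" --acl public-read
}

# public_url BUCKET KEY: prints a download link in the same form as the one above.
public_url() {
    echo "https://s3.amazonaws.com/$1/$2"
}

upload_public bigwig.py lisangs2015 bigwig.py
public_url lisangs2015 bigwig.py
```

The printed link is what you would hand to `curl -O` on the receiving end.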

About Lisa Cohen

PhD student at UC Davis.