This repository has been archived by the owner on Apr 5, 2018. It is now read-only.

Commit

Merge branch 'hotfix/1.2.2'
denis-yuen committed May 25, 2017
2 parents d76d2ca + fd3f7b1 commit 0d25cdb
Showing 32 changed files with 390 additions and 82 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -3,7 +3,7 @@ dist: trusty
language: node_js
node_js:
- 'stable'
- '4.2.1'
- '7.9'
services:
- postgresql
jdk:
Binary file added app/docs/alternate1.png
Binary file added app/docs/alternate2.png
Binary file added app/docs/alternate3.png
25 changes: 25 additions & 0 deletions app/docs/markdown/aws-batch-tutorial.md
@@ -0,0 +1,25 @@
# AWS Batch

Amazon Web Services Batch has been created to provide a simple way of running containers and simple commands on AWS without you having to closely manage the underlying EC2 infrastructure (although a knowledge of the underlying infrastructure will always be useful). While AWS Batch does not have an understanding of CWL like a full-on workflow engine, it does provide one of the simplest ways to run a large number of Dockstore tools at scale. Additionally, it provides an opportunity to run tools and manage resources almost totally from a GUI.

For this tutorial, we will assume that you've run through the AWS Batch [Getting Started](https://docs.aws.amazon.com/batch/latest/userguide/Batch_GetStarted.html) tutorial as we will mainly be focusing on things that you will need to consider when running Dockstore tools while providing a brief overview of the process.

Additionally, keep in mind that if you have a knowledge of CWL and/or do not need the Dockstore command-line to do file provisioning, you can decompose the underlying command-line invocation for the tool and use that as the command for your jobs, gaining a bit of performance. This tutorial focuses on using cwltool and using the Dockstore command-line to provide an experience that is more akin to running Dockstore or cwltool [on the command-line](/docs/launch#dockstore-cli) out of the box.

1. Unfortunately, you will need to do the most difficult step first: determining how much disk space you need to run your tool. This can vary wildly from tool to tool. For the tools in this tutorial, we went with 100 GB of space for the root disk and 100 GB for the Docker volume to run our sample data, up from 8 GB and 22 GB respectively. Next, you will need to create an image (AMI) with this setup. Here you have a couple of options:
    1. Follow [Creating a Compute Resource AMI](https://docs.aws.amazon.com/batch/latest/userguide/create-batch-ami.html) from scratch.
    2. Or launch the default [ECS-Optimized AMI](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI_launch_latest.html), follow these instructions to [expand the EBS volume](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html#console-modify), and then [notify Linux](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html#recognize-expanded-volume-linux) about the increased volume size before creating an AMI. Be careful to delete the touch file mentioned in the first tutorial. In our testing, we went with this second option, although both should work.

1. Create your Compute Environment. Start with a managed environment and specify the instance role that you set up in the previous step. You may also want to choose a specific instance type if you want to ensure that only one tool/workflow runs on one VM at a time to conserve disk space. ![Configure compute environment](images/aws-batch-2.png)
1. When you created your compute environment, you picked or created an [IAM role](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-roles.html) for your instances (ecsInstanceRole in the screenshot). If you want either your input data or output data to live on S3, add an S3 access policy to the role. ![Configure IAM role for ecsInstanceRole](images/aws-batch-1.png) Effectively, this allows programs running on your VMs to access S3 buckets to read input files and write output files. You can also use a read-only policy if you only need to read input files from S3, or create a new policy with access to only specific buckets.
1. Create a job queue. There's not too much to add here.
1. Create your job definition.
    1. For your image, you will want to specify `quay.io/dockstore/batch_wrapper:1.0` or the latest tagged version [here](https://quay.io/repository/dockstore/batch_wrapper). This wrapper provides cwltool and the Dockstore CLI as well as some trivial glue and demo code. ![Job definition](images/aws-batch-3.png)
    2. Specify a number of CPUs and an amount of memory that is appropriate for your job. Our understanding is that this will not actually kill jobs that float above the threshold, but it will control how many jobs can be stacked on your instances.
    3. Specify volumes and mount points. Refer to the following image. `/datastore` is mounted to provide access for file provisioning. `/var/run/docker.sock` is mounted to allow cwltool to launch your desired Docker container using the Docker daemon.
    ![Docker mounts](images/aws-batch-4.png)
1. Create your job. Here you will specify the tool that you wish to run and the parameters that it will take.
    1. For a quick test, you can try the command `/test.sh quay.io/briandoconnor/dockstore-tool-md5sum:1.0.3 https://raw.githubusercontent.com/dockstore/batch_wrapper/master/aws/md5sum.s3.json` after modifying md5sum.s3.json to point to your S3 bucket rather than dockstore.temp and uploading it somewhere accessible. This will run a quick md5sum tool that copies the result to an S3 bucket (credentials are provided via that IAM role) in just a few minutes. ![Job definition](images/aws-batch-6.png)
    2. For more realistic jobs, you can try the [PCAWG project](http://icgc.org/working-pancancer-data-aws) BWA and Delly workflows, which would use the commands `/test.sh quay.io/pancancer/pcawg-bwa-mem-workflow:2.6.8_1.2 https://raw.githubusercontent.com/dockstore/batch_wrapper/master/aws/bwa.s3.json` (approximately seven hours) and `/test.sh quay.io/pancancer/pcawg_delly_workflow:2.0.1-cwl1.0 https://raw.githubusercontent.com/dockstore/batch_wrapper/master/aws/delly.local.json` (approximately six hours) respectively. In the first case, modify the S3 bucket for your environment. In the second case, the results will be saved to the local VM's `/tmp` directory and will vanish after the VM is terminated.
1. Submit your job, wait for the results to show up in your S3 bucket, and celebrate. You've run jobs on AWS Batch! ![Job definition](images/aws-batch-hurray.png)
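
If you would rather script the job definition and job than click through the console, the steps above can be sketched as plain Python dictionaries. This is a minimal sketch under stated assumptions, not official tooling for the wrapper: the names `dockstore-wrapper`, `dockstore-queue`, and `md5sum-test`, the 4 vCPU / 8192 MiB sizing, and the host-side source paths are all illustrative (the screenshots, not this text, record the actual values). The keys are intended to match the shape accepted by the AWS Batch `RegisterJobDefinition` and `SubmitJob` operations; check them against the current AWS documentation before relying on them.

```python
WRAPPER_IMAGE = "quay.io/dockstore/batch_wrapper:1.0"  # image named in the tutorial


def job_definition(name="dockstore-wrapper", vcpus=4, memory_mib=8192):
    """Build arguments for a container job definition.

    The name, sizing, and host source paths are assumptions for illustration.
    """
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": WRAPPER_IMAGE,
            "vcpus": vcpus,
            "memory": memory_mib,
            # Host paths exposed to the wrapper container (assumed values).
            "volumes": [
                {"name": "datastore", "host": {"sourcePath": "/datastore"}},
                {"name": "docker_sock", "host": {"sourcePath": "/var/run/docker.sock"}},
            ],
            # Where each volume appears inside the container.
            "mountPoints": [
                {"sourceVolume": "datastore", "containerPath": "/datastore"},
                {"sourceVolume": "docker_sock", "containerPath": "/var/run/docker.sock"},
            ],
        },
    }


def job_submission(job_name, tool, params_url,
                   queue="dockstore-queue", definition="dockstore-wrapper"):
    """Build arguments for one job that runs /test.sh <tool> <parameter file URL>."""
    return {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": definition,
        "containerOverrides": {"command": ["/test.sh", tool, params_url]},
    }


# Replace the URL with the location of your own modified copy of md5sum.s3.json.
md5sum_job = job_submission(
    "md5sum-test",
    "quay.io/briandoconnor/dockstore-tool-md5sum:1.0.3",
    "https://raw.githubusercontent.com/dockstore/batch_wrapper/master/aws/md5sum.s3.json",
)
```

With boto3 installed and credentials configured, these dictionaries could be passed along as `boto3.client("batch").register_job_definition(**job_definition())` and `boto3.client("batch").submit_job(**md5sum_job)`. Keeping the builders free of network calls makes them easy to inspect before anything is submitted.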
44 changes: 44 additions & 0 deletions app/docs/markdown/azure-batch-tutorial.md
@@ -0,0 +1,44 @@
# Azure Batch

[Azure Batch](https://azure.microsoft.com/en-us/services/batch/) has been created to provide a simple way of running containers and simple commands on Azure without you having to closely manage the underlying VM infrastructure (although a knowledge of the underlying infrastructure will always be useful). While Azure Batch does not have an understanding of CWL like a full-on workflow engine, it does provide a very simple way to run a large number of Dockstore tools at scale.

Azure Batch also provides a client-side tool called [Batch Shipyard](https://github.com/Azure/batch-shipyard) which provides a number of features including a simple command-line interface for submitting batch jobs.

Of course, keep in mind that if you have a knowledge of CWL and/or do not need the Dockstore command-line to do file provisioning, you can decompose the underlying command-line invocation for the tool and use that as the command for your jobs, gaining a bit of performance. This tutorial focuses on using cwltool and using the Dockstore command-line to provide an experience that is more akin to running Dockstore or cwltool [on the command-line](/docs/launch#dockstore-cli) out of the box.

1. Run through Batch Shipyard's [Linux Installation Guide](https://github.com/Azure/batch-shipyard/blob/master/docs/01-batch-shipyard-installation.md#step-2a-linux-run-the-installsh-script) and then the [Quickstart](https://github.com/Azure/batch-shipyard/blob/master/docs/02-batch-shipyard-quickstart.md) guide with one of the sample tools such as Torch-CPU.
1. With the Batch Shipyard CLI set up, get the md5sum sample recipes from GitHub:
```
$ git clone https://github.com/dockstore/batch_wrapper.git
$ cd batch_wrapper/azure/
```
1. Fill out your `config.json`, `credentials.json`, and `jobs.json` in `config.dockstore.md5sum`. If you have trouble finding your access keys, take a look at this [article](https://docs.microsoft.com/en-us/azure/batch/batch-account-create-portal#view-batch-account-properties). In `jobs.json`, note that we use AWS keys to provision or save the final output files. You will also need to modify the parameter JSON file `md5sum.s3.json` to reflect the location of your S3 bucket.
1. Create a compute pool. Note that this pool is not set up to automatically resize. You may also need to pick a larger VM size for a larger dataset.
```
$ ./shipyard pool add --configdir config.dockstore.md5sum
```
1. Submit the job and watch the output (this should take roughly a minute if the pool already exists):
```
$ ./shipyard jobs add --configdir config.dockstore.md5sum --tail stdout.txt
2017-05-24 14:19:21.543 INFO - Adding job dockstorejob to pool dockstore
2017-05-24 14:19:21.989 INFO - uploading file /tmp/tmp7lgz7_j7 as 'shipyardtaskrf-dockstorejob/dockertask-00012.shipyard.envlist'
2017-05-24 14:19:22.027 DEBUG - submitting 1 tasks (0 -> 0) to job dockstorejob
2017-05-24 14:19:22.090 INFO - submitted all 1 tasks to job dockstorejob
2017-05-24 14:19:22.090 DEBUG - attempting to stream file stdout.txt from job=dockstorejob task=dockertask-00012
Creating directories for run of Dockstore launcher at: ./datastore//launcher-e849c691-cc47-4bfa-a443-b8830794ae0a
Provisioning your input files to your local machine
Downloading: #input_file from https://raw.githubusercontent.com/briandoconnor/dockstore-tool-md5sum/master/md5sum.input into directory: /mnt/batch/tasks/workitems/dockstorejob/job-1/dockertask-00012/wd/./datastore/launcher-e849c691-cc47-4bfa-a443-b8830794ae0a/inputs/ce735ade-8c46-4736-a7d8-2fc0cb7d2e87
[##################################################] 100%
Calling out to cwltool to run your tool
...
Final process status is success
Saving copy of cwltool stdout to: /mnt/batch/tasks/workitems/dockstorejob/job-1/dockertask-00012/wd/./datastore/launcher-e849c691-cc47-4bfa-a443-b8830794ae0a/outputs/cwltool.stdout.txt
Saving copy of cwltool stderr to: /mnt/batch/tasks/workitems/dockstorejob/job-1/dockertask-00012/wd/./datastore/launcher-e849c691-cc47-4bfa-a443-b8830794ae0a/outputs/cwltool.stderr.txt
Provisioning your output files to their final destinations
Uploading: #output_file from /mnt/batch/tasks/workitems/dockstorejob/job-1/dockertask-00012/wd/./datastore/launcher-e849c691-cc47-4bfa-a443-b8830794ae0a/outputs/md5sum.txt to : s3://dockstore.temp/md5sum.txt
Calling on plugin io.dockstore.provision.S3Plugin$S3Provision to provision to s3://dockstore.temp/md5sum.txt
[##################################################] 100%
```
1. You can repeat the process with `config.dockstore.bwa` which is a more realistic bioinformatics workflow from the [PCAWG project](http://icgc.org/working-pancancer-data-aws) and takes roughly seven hours.
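
Pointing the parameter file at your own bucket (step 3) can be done by hand or scripted. The sketch below assumes only that the parameter file is JSON and that the bucket shows up in string values beginning with `s3://dockstore.temp/`, as in the log output above. The function name `retarget_bucket` and the bucket name `my-bucket` are hypothetical, and the sample document in the code is a made-up stand-in rather than the real contents of `md5sum.s3.json`.

```python
import json


def retarget_bucket(node, old_bucket, new_bucket):
    """Return a copy of a parsed JSON value with s3://old_bucket/ URLs rewritten."""
    old_prefix = f"s3://{old_bucket}/"
    new_prefix = f"s3://{new_bucket}/"
    if isinstance(node, dict):
        return {k: retarget_bucket(v, old_bucket, new_bucket) for k, v in node.items()}
    if isinstance(node, list):
        return [retarget_bucket(v, old_bucket, new_bucket) for v in node]
    if isinstance(node, str) and node.startswith(old_prefix):
        return new_prefix + node[len(old_prefix):]
    # Numbers, booleans, nulls, and unrelated strings pass through unchanged.
    return node


# Made-up stand-in for a parameter file; the real md5sum.s3.json may differ.
sample = json.loads(
    '{"output_file": {"class": "File", "path": "s3://dockstore.temp/md5sum.txt"}}'
)
updated = retarget_bucket(sample, "dockstore.temp", "my-bucket")
print(json.dumps(updated, indent=2))
```

Walking the whole structure instead of editing one known key avoids depending on the exact layout of the parameter file, which this tutorial does not spell out.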
25 changes: 20 additions & 5 deletions app/docs/markdown/blog.md
@@ -1,24 +1,39 @@
# News and Events

## May 5, 2017 - Upcoming Features

To give you a taste of what we're working on for the next major version of Dockstore, we're looking at features in the following main areas:

* Searching!
    * As Dockstore grows, we've noticed that our current solution for searching tools (go to [Tools](/search-containers) or [Workflows](/search-workflows) and type in the search box) is becoming less useful. Look for more useful ways to search and filter tools and workflows in the next version.
* More ways to launch tools and workflows
    * We're working with partners to promote new ways to run CWL and WDL tools and workflows.
* UI rewrite
    * We're currently migrating our UI from AngularJS to Angular (2). Watch for performance and usability improvements in this area.
* Write API Web Service and Client!
    * With just a CWL descriptor and Dockerfile, this allows you to programmatically create GitHub and Quay.io repositories and then register and publish the tool on Dockstore in just 2 commands. Publishing tools on Dockstore has gotten a lot easier. See [GitHub](https://github.com/dockstore/write_api_service/) for more info on how to use the Write API. See [For Developers](/docs/developers#different-ways-to-register) for information on different ways to register tools on Dockstore and when to use this Write API.

As usual, we're open to suggestions. If you have one or if you spot a bug, drop us a line on [GitHub](https://github.com/ga4gh/dockstore/issues).

## April 19, 2017 - Dockstore 1.2 Release

The latest Dockstore major release includes a large number of new features and fixes.
A subset of highlighted new features follows.

### Highlighted New Features

* Support for private tools
    * Users can register tools for which other users will need to ask the original author for access
* Support for [private](https://dockstore.org/docs/docker_registries) Docker images hosted in GitLab and Amazon ECR
* Allow users to star tools and workflows
* Stargazers page to show all users who have starred a particular tool or workflow
* Support for [file provisioning plugins](https://github.com/ga4gh/dockstore/tree/develop/dockstore-file-plugin-parent)
* Better error messaging passed along from a newer cwltool version
* Compatibility with a Write API service for programmatically adding tools

### Breaking Changes

* The default Dockstore install no longer includes S3 support. Instead, S3 support is provided by a plugin that can be installed via `dockstore plugin download`
* The command `dockstore tool launch` used to take `--local-entry` as a flag indicating that `--entry` pointed at a local file. Now `--local-entry` replaces `--entry`, i.e. use `dockstore tool launch --local-entry <your local file>` rather than `dockstore tool launch --local-entry --entry <your local file>`
* Update your cwltool install, details in the onboarding wizard

25 changes: 25 additions & 0 deletions app/docs/markdown/developers.md
@@ -142,3 +142,28 @@ and then:
```
client publish --tool test.json
```

## [Different Ways To Register Tools on Dockstore](#different-ways-to-register)

There are 3 major ways to register tools on Dockstore:
- The Dockstore website
- The Dockstore webservice
- The Write API webservice and client

There is no clear-cut answer for determining the best way to register tools on Dockstore, as many factors affect it. The following is merely a suggestion; feel free to register tools on Dockstore in whichever way you prefer.

Registering many tools or very few tools?
- Very few
    - Use the Dockstore website. You just need to manually create the GitHub and Quay.io repositories (if they don't exist). If you're using Quay.io as the image registry, you can simply "Refresh All Tools" on the Dockstore website. Otherwise, you can manually register the tool.
- Many
    - GitHub and image registry repositories already made for each tool?
        - Yes
            - Are you using Quay.io for your image registry?
                - Yes
                    - Use either the Dockstore webservice or website. You just need to refresh all tools. All of your Quay.io tools should automatically register on Dockstore.
                - No
                    - Use the Dockstore webservice so you can programmatically register and publish all tools.
        - No
            - Use the Write API webservice and client. After some setup time (getting GitHub and Quay.io tokens, setting up the service, etc.), it allows you to programmatically create GitHub and Quay.io repositories on the fly, then register/publish them on Dockstore.

Generally, the Write API webservice and client have the highest setup time compared to the other methods of registering. However, as you register more tools, the Write API tends to become the better choice, since it performs many intermediary steps for you.
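
The suggestion tree above can be condensed into a few lines of Python. This is only a restatement of the bullets for readers who find code easier to scan; the function name `suggest_registration_method` and its boolean parameters are hypothetical and not part of any Dockstore API.

```python
def suggest_registration_method(many_tools, repos_exist=False, uses_quay=False):
    """Mirror the suggestion tree above and return a short label.

    many_tools:  registering many tools rather than very few
    repos_exist: GitHub and image registry repositories already exist per tool
    uses_quay:   Quay.io is the image registry
    """
    if not many_tools:
        return "Dockstore website"
    if not repos_exist:
        return "Write API webservice and client"
    if uses_quay:
        return "Dockstore webservice or website (refresh all tools)"
    return "Dockstore webservice"


print(suggest_registration_method(many_tools=True, repos_exist=True, uses_quay=False))
```

As with the bullets, treat the result as a starting point rather than a rule.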