Title: How to use a workflow execution service (WES)
Author: David L Gibbs
Created: 2019-11-16
Purpose: Introduction to using a workflow execution service, GA4GH style
Repo: https://github.com/isb-cgc/Community-Notebooks/blob/master/Notebooks/How_to_use_a_GA4GH_tool_using_WES.ipynb
Notes: Does not work on Google Colab.
This notebook is designed to be a quick introduction to using a workflow execution service (WES) and is intended as a follow-up to a previous notebook on searching for tools using a tool registry service (TRS; see "How to find a tool using TRS"). This notebook must be run in an environment capable of running Docker. Google Colab notebooks will be extremely difficult to use. We advise starting a JupyterLab environment from the Google Cloud Console, AI Platform.
Software used:
wes-service, a client and server implementation of the GA4GH Workflow Execution Service 1.0.0 API.
https://github.com/common-workflow-language/workflow-service https://pypi.org/project/wes-service/
cwltool, the Common Workflow Language tool description reference implementation https://github.com/common-workflow-language/cwltool
import subprocess as sp
!sudo pip install wes-service
!sudo pip install cwltool
!sudo pip install cwlref-runner
!wes-client --version
We're going to use the subprocess library to start wes-server in the background. We submit jobs to wes-server, which in turn runs them on a backend executor (here, cwltool).
sp.Popen( ['wes-server', '--port', '8885'] )
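Because the server is started in the background, the next cell can run before the server is ready to accept requests. A minimal sketch of a readiness check follows; the helper name `wait_for_port` and its defaults are hypothetical and not part of wes-service. It only checks that something is listening on the TCP port, not that the WES API itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if `timeout` elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In this notebook it could be called as `wait_for_port("localhost", 8885)` right after the `sp.Popen` call.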
# check for jobs... not yet!
!wes-client --host localhost:8885 --proto http --list
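The same listing can be requested directly over HTTP, since the WES 1.0 specification defines a `GET /ga4gh/wes/v1/runs` endpoint. The sketch below is an illustration, not how wes-client is implemented; the helper names `wes_url` and `list_runs` are hypothetical. The specification names the response field `runs`; the fallback to a `workflows` key is a guess in case a server uses a different field name.

```python
import json
import urllib.request

WES_BASE_PATH = "/ga4gh/wes/v1"  # base path defined by the WES 1.0 specification


def wes_url(host: str, endpoint: str, proto: str = "http") -> str:
    """Build the URL for a WES endpoint such as 'runs' or 'service-info'."""
    return f"{proto}://{host}{WES_BASE_PATH}/{endpoint.lstrip('/')}"


def list_runs(host: str, proto: str = "http") -> list:
    """Fetch the run list from a WES server (requires a running server)."""
    with urllib.request.urlopen(wes_url(host, "runs", proto)) as resp:
        body = json.load(resp)
    # 'runs' is the field name in the spec; 'workflows' is an assumed fallback
    return body.get("runs", body.get("workflows", []))
```

With the server from this notebook running, `list_runs("localhost:8885")` should return an empty list until a job has been submitted.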
Let's get some workflow test files to use...
!git clone https://github.com/common-workflow-language/workflow-service
cd workflow-service/testdata/
ls -lha
Now, let's use the TRS to search for a tool called 'md5sum'.
import requests
response = requests.get('https://dockstore.org:8443/api/ga4gh/v1/tools/', params={"name": "md5sum"})
n = (len(response.json()[0]['versions'][0])) - 2 # n was just 0 for version 1.0.0
md5sum_url = response.json()[0]['versions'][n]['url'] + '/plain-CWL/descriptor/%2FDockstore.cwl'
# we have a url to the CWL.
print(md5sum_url)
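The descriptor URL is the version URL followed by the descriptor type and a percent-encoded relative path. As a sketch, the version can also be selected by name rather than by position, assuming each entry in `versions` carries a `name` field alongside `url`; the helper names `pick_version` and `descriptor_url` are hypothetical.

```python
from urllib.parse import quote


def pick_version(versions: list, name: str) -> dict:
    """Return the first tool version whose 'name' matches (assumes a 'name' field exists)."""
    for version in versions:
        if version.get("name") == name:
            return version
    raise KeyError(f"no version named {name!r}")


def descriptor_url(version_url: str, relative_path: str = "/Dockstore.cwl",
                   descriptor_type: str = "plain-CWL") -> str:
    """Append the descriptor type and the percent-encoded relative path to a version URL."""
    # safe="" makes quote() encode '/' as %2F, as in the URL printed above
    encoded = quote(relative_path, safe="")
    return f"{version_url.rstrip('/')}/{descriptor_type}/descriptor/{encoded}"
```

For example, `descriptor_url(pick_version(response.json()[0]['versions'], 'master')['url'])` would build the URL for the `master` version, if a version with that name is present.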
To submit a job to the wes-server, we use the wes-client, passing it the descriptor URL and a small JSON file that describes the input. It's surprisingly easy.
Here's the file describing the input:
cat md5sum.json
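For readers who want to write such a file for their own data, here is a sketch of how a CWL job file with a single file input could be generated. The input name `input_file` is an assumption about the md5sum tool's descriptor, and the exact contents of `md5sum.json` in the test data may differ; `cwl_file` and `make_job` are hypothetical helper names.

```python
import json


def cwl_file(path: str) -> dict:
    """Describe a file input the way CWL job files do: an object with class 'File'."""
    return {"class": "File", "path": path}


def make_job(input_path: str, input_name: str = "input_file") -> dict:
    """Build a job object mapping one input name (assumed here) to a CWL File."""
    return {input_name: cwl_file(input_path)}


# Serialize to the JSON text that would be written to a job file
job_text = json.dumps(make_job("md5sum.input"), indent=2)
```

The keys of the job object must match the input names declared in the tool's CWL descriptor, so check the descriptor before reusing this pattern.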
Now we'll submit the job.
!wes-client --host=localhost:8885 --proto http $md5sum_url md5sum.json
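Under the hood, a WES submission is a multipart form `POST` to `/ga4gh/wes/v1/runs`. The sketch below shows the form fields named in the WES 1.0 specification and the states that mark a finished run. It is an illustration rather than the wes-client implementation: the helper names are hypothetical, the `v1.0` default is an assumption about the descriptor's CWL version, and wes-client's handling of local input files is not reproduced here.

```python
import json

# Run states after which a run will not change any more (per the WES 1.0 state list)
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def run_request_fields(workflow_url: str, params: dict,
                       workflow_type: str = "CWL",
                       workflow_type_version: str = "v1.0") -> dict:
    """Build the form fields for POST /runs; workflow_params is sent as a JSON string."""
    return {
        "workflow_url": workflow_url,
        "workflow_params": json.dumps(params),
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
    }


def is_finished(state: str) -> bool:
    """True if a state reported by GET /runs/{run_id}/status is terminal."""
    return state in TERMINAL_STATES


def submit_run(host: str, fields: dict, proto: str = "http") -> str:
    """Send the form to a running WES server and return the new run_id."""
    import requests  # third-party; already used earlier in this notebook

    url = f"{proto}://{host}/ga4gh/wes/v1/runs"
    # (None, value) tuples make requests encode plain fields as multipart/form-data
    resp = requests.post(url, files={k: (None, v) for k, v in fields.items()})
    resp.raise_for_status()
    return resp.json()["run_id"]
```

The returned `run_id` is the same identifier that appears in the output directory path used in the next cell.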
And we'll view the output...
!cat /home/jupyter/workflows/48ea8e524ae848b58bcead5eaae35052/outdir/md5sum.txt
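The run ID in that path changes with every submission, so the cell above will not work verbatim on a fresh run. One way to avoid hardcoding it is sketched below, assuming the server writes each run to its own subdirectory of `/home/jupyter/workflows` as in the path above; `latest_run_dir` is a hypothetical helper, and using the modification time as a proxy for "most recent run" is a heuristic.

```python
from pathlib import Path


def latest_run_dir(workflows_root: str) -> Path:
    """Return the most recently modified subdirectory of the workflows directory."""
    run_dirs = [p for p in Path(workflows_root).iterdir() if p.is_dir()]
    if not run_dirs:
        raise FileNotFoundError(f"no run directories under {workflows_root}")
    # Directory mtime is only a proxy for which run was submitted last
    return max(run_dirs, key=lambda p: p.stat().st_mtime)
```

The output could then be read with `(latest_run_dir("/home/jupyter/workflows") / "outdir" / "md5sum.txt").read_text()`. Using the run ID reported at submission time is more reliable when several runs are in flight.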
Let's compare that result to simply running md5sum.
!md5sum md5sum.input
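The comparison can also be done programmatically instead of by eye. This is a minimal sketch using Python's `hashlib`; it assumes the first whitespace-separated token in the workflow's output file is the hex digest, and the helper names are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported(input_path: str, report_path: str) -> bool:
    """Compare a locally computed digest with the first token of the report file."""
    with open(report_path) as fh:
        tokens = fh.read().split()
    return bool(tokens) and tokens[0].lower() == md5_of_file(input_path)
```

Here `input_path` would be `md5sum.input` and `report_path` the `md5sum.txt` file in the run's output directory.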
Confirmed!
Now, what's in that URL?
!curl https://dockstore.org/api/api/ga4gh/v1/tools/quay.io%2Fbriandoconnor%2Fdockstore-tool-md5sum/versions/master/plain-CWL/descriptor/%2FDockstore.cwl
Now, for comparison's sake, we'll execute the workflow using the command given in the GitHub README: https://github.com/common-workflow-language/workflow-service
!wes-client --host localhost:8885 --proto http --attachments="dockstore-tool-md5sum.cwl,md5sum.input" md5sum.cwl md5sum.cwl.json
cat /home/jupyter/workflows/0fc518dfd1fd480999315ec9499e6f69/outdir/md5sum.txt
!md5sum md5sum.input
Confirmed!
Using the wes-service was actually fairly easy, provided you have a nice CWL tool description!
Please let us know if you have any questions!