Import Python modules...
from __future__ import print_function
import os
import sys
import time
import re
import requests
from IPython import __version__ as ipyVersion
print("Python: %s.%s.%s" % sys.version_info[:3])
print("IPython: %s" % ipyVersion)
print()
print(time.asctime())
The URL path
The MW REST URL consists of three main parts, separated by forward slashes, after the common prefix specifying the invariant base URL (https://www.metabolomicsworkbench.org/rest/):
https://www.metabolomicsworkbench.org/rest/context/input_specification/output_specification
Part 1: The context determines the type of data to be accessed from the Metabolomics Workbench, such as metadata or results related to the submitted studies, data from metabolites, genes/proteins and analytical chemistry databases as well as other services related to mass spectrometry and metabolite identification:
context = study | compound | refmet | gene | protein | moverz | exactmass
Part 2: The input specification consists of two required parameters describing the REST request:
input_specification = input_item/input_value
Part 3: The output specification consists of two parameters describing the output generated by the REST request:
output_specification = output_item/(output_format)
The first parameter is required in most cases. The second parameter is optional. The input and output specifications are context sensitive. The context determines the values allowed for the remaining parameters in the input and output specifications as detailed in the sections below.
Set up the MW REST base URL...
MWBaseURL = "https://www.metabolomicsworkbench.org/rest"
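With the base URL in place, the three path parts described above can be joined mechanically. The following is a minimal sketch of that assembly; `build_mw_url` is a hypothetical helper name introduced here for illustration, not part of the MW REST service or of any library.

```python
# Base URL repeated here so the sketch is self-contained.
MWBaseURL = "https://www.metabolomicsworkbench.org/rest"


def build_mw_url(context, input_item, input_value, output_item, output_format=None):
    """Join context, input specification and output specification into one URL.

    Hypothetical helper: it only concatenates the parts and does not check
    whether the values are valid for the chosen context.
    """
    parts = [MWBaseURL, context, input_item, str(input_value), output_item]
    # The output format is optional, so it is appended only when given.
    if output_format:
        parts.append(output_format)
    return "/".join(parts)


print(build_mw_url("gene", "gene_symbol", "acaca", "all"))
print(build_mw_url("gene", "gene_id", 31, "all", "txt"))
```

The two calls reproduce the request URLs used in the gene examples later in this notebook.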
The “gene” context
The “gene” context refers to the Human Metabolome Gene/Protein Database (MGP), a database of metabolome-related genes and proteins that contains data for over 7,300 genes and over 15,500 proteins. It provides access to gene-related information such as MGP ID, gene ID and symbol, gene names and synonyms, alternate gene names, taxonomy ID, species, etc.
context = gene
input_item = mgp_id | gene_id | gene_name | gene_symbol | taxid
input_value = input_item_value
output_item = all | lmp_id | mgp_id | gene_name | gene_symbol | gene_synonyms | alt_names | chromosome | map_location | summary | taxid | species | species_long | mgp_id,gene_id,gene_name,...
output_format = txt | json
The “all” output item is automatically expanded to include the following items: mgp_id, gene_id, gene_name, gene_symbol, gene_synonyms, alt_names, chromosome, map_location, summary, taxid, species, species_long
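The parameter lists above can be turned into a small check before a request is sent. This is a rough sketch under stated assumptions: `gene_url` is a hypothetical helper, the allowed input items and output formats are copied from the lists above, and output items are passed through unchecked and joined with commas, following the `mgp_id,gene_id,gene_name,...` pattern shown for `output_item`.

```python
MWBaseURL = "https://www.metabolomicsworkbench.org/rest"

# Values copied from the "gene" context parameter lists above.
GENE_INPUT_ITEMS = ("mgp_id", "gene_id", "gene_name", "gene_symbol", "taxid")
GENE_OUTPUT_FORMATS = ("txt", "json")


def gene_url(input_item, input_value, output_items=("all",), output_format=None):
    """Build a "gene" context URL, rejecting unknown input items or formats.

    Hypothetical helper: output items are not validated, only joined.
    """
    if input_item not in GENE_INPUT_ITEMS:
        raise ValueError("Unknown input_item: %s" % input_item)
    if output_format is not None and output_format not in GENE_OUTPUT_FORMATS:
        raise ValueError("Unknown output_format: %s" % output_format)
    # Several output items are requested as one comma-separated path segment.
    parts = [MWBaseURL, "gene", input_item, str(input_value), ",".join(output_items)]
    if output_format is not None:
        parts.append(output_format)
    return "/".join(parts)


print(gene_url("gene_symbol", "acaca"))
print(gene_url("gene_id", 31, ["gene_name", "gene_symbol"], "txt"))
```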
Retrieve and process gene data in JSON format
Set up the REST URL to retrieve all available data for a gene symbol...
MWDataURL = MWBaseURL + "/gene/gene_symbol/acaca/all"
Execute the REST request using the "requests" module...
print("Initiating request: %s" % MWDataURL)
Response = requests.get(MWDataURL)
Check the response status...
print("\nStatus Code: %d" % (Response.status_code))
if Response.status_code != 200:
    print("Request failed: status_code: %d" % Response.status_code)
Process JSON results...
print("\nAll available gene data using gene symbol:\n")
Results = Response.json()
for ResultType in Results:
    ResultValue = Results[ResultType]
    print("%s: %s" % (ResultType, ResultValue))
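The status check above only prints a message, so the JSON decoding step still runs after a failed request. One possible way to stop early is sketched below; `json_or_raise` is a hypothetical helper that assumes only that the object it receives has a `status_code` attribute and a `json()` method, as the `requests` response object does.

```python
def json_or_raise(response):
    """Return the decoded JSON body, or raise if the request did not succeed.

    Hypothetical helper: works with any object exposing `status_code`
    and `json()`, such as the Response returned by requests.get().
    """
    if response.status_code != 200:
        raise RuntimeError("Request failed: status_code: %d" % response.status_code)
    try:
        return response.json()
    except ValueError:
        # The body was not valid JSON even though the status was 200.
        raise RuntimeError("Response body is not valid JSON")
```

With the objects from the cells above, this could be called as `Results = json_or_raise(Response)`. Passing `timeout=30` (an arbitrary value) to `requests.get` is a further option so that a stalled connection does not block the notebook indefinitely.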
Retrieve and process gene data in text format
Retrieve and process data for a gene using its Entrez gene ID...
MWDataURL = MWBaseURL + "/gene/gene_id/31/all/txt"
print("Initiating request: %s" % MWDataURL)
Response = requests.get(MWDataURL)
print("\nStatus Code: %d" % (Response.status_code))
if Response.status_code != 200:
    print("Request failed: status_code: %d" % Response.status_code)
print("\nAll available gene data using gene_id:\n")
Results = Response.text
for Result in Results.split("\n"):
    Words = Result.split("\t")
    # Skip blank or malformed lines that are not name<TAB>value pairs
    if len(Words) != 2:
        continue
    ResultType, ResultValue = Words
    print("%s: %s" % (ResultType, ResultValue))
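The loop above prints each pair as it is read. If the values are needed later, the same tab-splitting logic can collect them into a dictionary instead. A minimal sketch follows; `parse_tab_pairs` is a hypothetical helper, and the sample string is made-up placeholder text, not actual MW output.

```python
def parse_tab_pairs(text):
    """Collect name<TAB>value lines into a dict, skipping any other lines.

    Hypothetical helper mirroring the loop above: a line is kept only if
    splitting it on tabs yields exactly two fields.
    """
    fields = {}
    for line in text.splitlines():
        words = line.split("\t")
        if len(words) != 2:
            continue
        name, value = words
        fields[name] = value
    return fields


# Placeholder input for illustration only.
sample = "field_a\tvalue one\nnot a pair\n\nfield_b\tvalue two\n"
print(parse_tab_pairs(sample))
```

With the response from the cell above, this could be applied as `parse_tab_pairs(Response.text)`.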