Welcome to the Semalytics demo!
This is an extended computational narrative about Semalytics, a semantic platform for analyzing hierarchical data in translational cancer research. This demo accompanies the paper:
Semalytics: a semantic analytics platform for the exploration of distributed and heterogenous cancer data (in translational research)
Biological annotations are modeled following a new Semantic Web approach and are connected to Wikidata for knowledge expansion. Please note that Semalytics explores annotations that are highly scattered along hierarchical data.
In this notebook, we use Semalytics to analyze a test dataset in order to investigate interactions between gene alterations and drugs. In particular, we focus on the response to Cetuximab, an epidermal growth factor receptor (EGFR) inhibitor used for the treatment of several cancer types, such as colorectal cancer. Each cancer is a complex and variable system with unique characteristics at the molecular level, which may determine drug performance. In this demo, we match drug response data with annotations related to a panel of 4 genes (BRAF, EGFR, ERBB2, and KRAS),
which are known to be relevant to Cetuximab response in colorectal cancer.
In Chapter 1, we show how the platform can be used for getting basic data insights about genomic landscapes and drug responses. In particular, we use Semalytics to identify an investigation set (i.e., data trees with both genomic and pharmacological annotations).
In Chapter 2, we explore the data in the investigation set. First, we get the list of variants for the genes in the panel. Then, we explore the co-occurrence of genomic variants and responses to Cetuximab.
In Chapter 3, we use Semalytics to analyze local data while harnessing the extended information of Wikidata, thus gaining new analytical options on our local database. For example, we use federated queries to explore data about drugs other than Cetuximab, which we do not store and maintain locally.
Finally, in the Appendix, we list computational references to figures used in the proof-of-concept (PoC) of the paper.
See the aforementioned article for further details.
import utils
import pandas as pd
from IPython.display import SVG, display
# SPARQL endpoints
# Semalytics (i.e., local data)
# 14,281,125 explicit triples
# 2,391,980 inferred triples
SEMALYTICS_ENDPOINT = 'http://semalytics:7200/repositories/annotationDB'
# Remote knowledge base
WIKIDATA_ENDPOINT = 'https://query.wikidata.org/sparql'
# genes panel
PANEL = {'BRAF','EGFR','ERBB2','KRAS'}
# investigated variants
VARIANTS = ':sequence_alteration :feature_amplification'
# enable inline plotting
%matplotlib inline
# do not truncate data in tables (None replaces -1, which is deprecated since pandas 1.0)
pd.set_option('display.max_colwidth', None)
# the query
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX onto: <http://www.ontotext.com/>
select (count(distinct ?node) as ?nodes)
from onto:disable-sameAs
where {
?case a :Case ;
:hasDescendant ?node .
?node a :Bioentity ;
:has_annotation ?ann .
}
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# there you go!
result = result_table['nodes.value'][0]
print(f'there are {result} annotated nodes in trees')
there are 3917 annotated nodes in trees
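The helper `utils.query` is not shown in this notebook. Column names such as `nodes.value` suggest that it flattens the standard SPARQL JSON results format, in which every binding maps a variable to an object with `type` and `value` fields. The following is a minimal sketch of that flattening step under this assumption; the function name `flatten_bindings` and the sample payload are hypothetical and are not part of Semalytics.

```python
def flatten_bindings(payload):
    """Flatten a SPARQL JSON results payload into a list of flat rows.

    Each row maps '<variable>.<field>' (e.g. 'nodes.value') to a string,
    mirroring the column names used in this notebook.
    """
    rows = []
    for binding in payload["results"]["bindings"]:
        row = {}
        for variable, term in binding.items():
            for field, content in term.items():
                row[f"{variable}.{field}"] = content
        rows.append(row)
    return rows


# Hypothetical payload, shaped like the answer to the counting query above
sample = {
    "head": {"vars": ["nodes"]},
    "results": {
        "bindings": [
            {
                "nodes": {
                    "type": "literal",
                    "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                    "value": "3917",
                }
            }
        ]
    },
}

rows = flatten_bindings(sample)
print(rows[0]["nodes.value"])
```

Passing such rows to `pd.DataFrame` would give a table that can be read as `result_table['nodes.value'][0]`. Note that values arrive as strings, so counts need an explicit `int(...)` before any arithmetic.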
Let $\mathcal{G}$ be the set of data trees with annotations about :sequence_alteration or :feature_amplification for genes in the panel. We build $\mathcal{G}$ with the following query:
# Cases with annotations in the genes panel
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX onto: <http://www.ontotext.com/>
select (count(distinct ?case) as ?cases)
from onto:disable-sameAs
where {
?case a :Case ;
:hasDescendant ?node .
?node :has_annotation ?ann .
?ann :has_reference ?ref .
?gene :has_variant ?ref.
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?ref a ?annotation_Type .
VALUES ?annotation_Type { """+VARIANTS+""" }
}
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# there you go!
result = result_table['cases.value'][0]
print(f'there are {result} cases annotated with 1+ variant(s) in the panel (KRAS, EGFR, BRAF, HER2)')
there are 354 cases annotated with 1+ variant(s) in the panel (KRAS, EGFR, BRAF, HER2)
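The queries in this notebook spell out the gene symbols by hand, although `PANEL` and `VARIANTS` are already defined. As a sketch of an alternative, the `VALUES` clauses could be generated from those objects. The helper `values_clause` below is hypothetical, and it assumes that the terms contain no quote characters that would need escaping.

```python
def values_clause(variable, terms, quote=True):
    """Build a SPARQL VALUES clause for a single variable.

    Terms are sorted so that the generated query text is reproducible
    even when they come from an unordered set.
    """
    rendered = " ".join(f"'{t}'" if quote else t for t in sorted(terms))
    return f"VALUES ?{variable} {{ {rendered} }}"


PANEL = {'BRAF', 'EGFR', 'ERBB2', 'KRAS'}
VARIANTS = ':sequence_alteration :feature_amplification'

# string literals are quoted, prefixed names are not
gene_values = values_clause('geneSymbol', PANEL)
type_values = values_clause('annotation_Type', VARIANTS.split(), quote=False)
print(gene_values)
print(type_values)
```

Generating the clause keeps the panel defined in a single place, so changing `PANEL` would change every query consistently.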
Let $\mathcal{D}$ be the set of data trees containing mice annotated with pharmacological responses. We build $\mathcal{D}$ with the following query:
# Cases with pharmacological annotations
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX onto: <http://www.ontotext.com/>
select ?drugName (count(distinct ?case) as ?cases)
from onto:disable-sameAs
where {
?case a :Case ;
:hasDescendant ?mouse .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :drug_response .
?drug :has_drug_response ?ref;
:name ?drugName .
}
GROUP BY ?drugName
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# there you go!
drug,cases = result_table['drugName.value'][0],result_table['cases.value'][0]
print(f'there are {cases} cases annotated with 1+ pharmacological response(s) for the {drug}')
there are 238 cases annotated with 1+ pharmacological response(s) for the CETUXIMAB
Let $\mathcal{S} = (\mathcal{G} \cap \mathcal{D})$ be the investigation scope (i.e., data trees with both genomic and pharmacological annotations). We get it through this query:
# Cases with pharmacological and genomic annotations
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX onto: <http://www.ontotext.com/>
select (count(distinct ?case) as ?cases)
from onto:disable-sameAs
where {
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :drug_response .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2.
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type { """+VARIANTS+""" }
}
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# there you go!
investigation_scope = result_table['cases.value'][0]
print(f'there are {investigation_scope} cases annotated with 1+ pharmacological response(s) AND 1+ variant(s)')
there are 113 cases annotated with 1+ pharmacological response(s) AND 1+ variant(s)
In this section we analyze annotation types for cases in the investigation set. Moreover, we exploit Semalytics for matching variants against responses to Cetuximab.
With the following query, we get the list of variants for non-responder cases (i.e., cases annotated with :DRCl_PD). The column alt_p.value represents the type of point_mutation.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX onto: <http://www.ontotext.com/>
select distinct ?case ?geneSymbol ?type ?alt_p
from onto:disable-sameAs
where {
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :DRCl_PD .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2 ;
:symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
?ref2 sesame:directType ?type.
OPTIONAL {?ref2 :alt_p ?alt_p }
}
ORDER BY ?case
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# filter URIs prefixes
utils.filter_prefixes(result_table)
# there you go!
result_table[['case.value', 'geneSymbol.value', 'type.value', 'alt_p.value']].fillna("")
| case.value | geneSymbol.value | type.value | alt_p.value |
---|---|---|---|---|
0 | CRC0019 | KRAS | point_mutation | G13D |
1 | CRC0021 | KRAS | point_mutation | G12C |
2 | CRC0021 | KRAS | point_mutation | G12V |
3 | CRC0024 | KRAS | point_mutation | G13C |
4 | CRC0027 | KRAS | point_mutation | G13D |
5 | CRC0028 | KRAS | point_mutation | G13D |
6 | CRC0031 | KRAS | point_mutation | G12D |
7 | CRC0053 | KRAS | point_mutation | A146T |
8 | CRC0055 | KRAS | point_mutation | G12V |
9 | CRC0058 | KRAS | point_mutation | G12V |
10 | CRC0060 | KRAS | point_mutation | G12V |
11 | CRC0063 | KRAS | point_mutation | G12V |
12 | CRC0064 | KRAS | point_mutation | G12D |
13 | CRC0067 | KRAS | point_mutation | I36M |
14 | CRC0068 | KRAS | point_mutation | G12C |
15 | CRC0070 | KRAS | point_mutation | G12D |
16 | CRC0073 | KRAS | point_mutation | G12C |
17 | CRC0077 | KRAS | point_mutation | G12D |
18 | CRC0079 | BRAF | point_mutation | V600E |
19 | CRC0080 | ERBB2 | feature_amplification | |
20 | CRC0082 | KRAS | point_mutation | G13D |
21 | CRC0085 | KRAS | point_mutation | A146T |
22 | CRC0094 | KRAS | point_mutation | G12C |
23 | CRC0105 | EGFR | feature_amplification | |
24 | CRC0106 | BRAF | point_mutation | V600E |
25 | CRC0112 | ERBB2 | feature_amplification | |
26 | CRC0118 | BRAF | point_mutation | V600E |
27 | CRC0124 | ERBB2 | point_mutation | H878Y |
28 | CRC0124 | ERBB2 | feature_amplification | |
29 | CRC0125 | KRAS | point_mutation | K117N |
... | ... | ... | ... | ... |
61 | CRC0315 | KRAS | point_mutation | G13D |
62 | CRC0323 | BRAF | point_mutation | V600E |
63 | CRC0324 | KRAS | point_mutation | G12V |
64 | CRC0348 | KRAS | point_mutation | G12D |
65 | CRC0349 | KRAS | point_mutation | G12D |
66 | CRC0382 | KRAS | point_mutation | G12C |
67 | CRC0438 | KRAS | point_mutation | Q61K |
68 | CRC0468 | KRAS | point_mutation | G12V |
69 | CRC0479 | KRAS | point_mutation | G13D |
70 | CRC0480 | BRAF | point_mutation | V600E |
71 | CRC0481 | KRAS | point_mutation | G13D |
72 | CRC0481 | EGFR | feature_amplification | |
73 | CRC0484 | BRAF | point_mutation | V600E |
74 | CRC0504 | KRAS | point_mutation | G13D |
75 | CRC0504 | ERBB2 | point_mutation | R678Q |
76 | CRC0508 | EGFR | feature_amplification | |
77 | CRC0527 | BRAF | point_mutation | K601N |
78 | CRC0527 | EGFR | feature_amplification | |
79 | CRC0528 | BRAF | point_mutation | V600E |
80 | CRC0610 | EGFR | feature_amplification | |
81 | CRC0626 | KRAS | point_mutation | A146V |
82 | CRC0714 | KRAS | point_mutation | G13D |
83 | CRC0729 | ERBB2 | feature_amplification | |
84 | CRC0753 | KRAS | point_mutation | G12V |
85 | CRC1063 | BRAF | point_mutation | K601E |
86 | CRC1063 | BRAF | point_mutation | T241M |
87 | CRC1138 | BRAF | point_mutation | K601E |
88 | CRC1169 | EGFR | feature_amplification | |
89 | CRC1182 | KRAS | point_mutation | G12A |
90 | CRC1278 | EGFR | feature_amplification |
91 rows × 4 columns
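The table shows short names such as `point_mutation` instead of full URIs, which is the visible effect of `utils.filter_prefixes`. Its implementation is not shown here; the sketch below assumes that it simply drops the namespace part of each URI. The function name `local_name` is hypothetical.

```python
def local_name(value):
    """Strip the namespace from a URI; leave any other value untouched."""
    if not value.startswith(('http://', 'https://')):
        return value  # plain literal such as 'KRAS' or 'G13D'
    # prefer the fragment separator, then fall back to the last path segment
    for separator in ('#', '/'):
        if separator in value:
            return value.rsplit(separator, 1)[1]
    return value


print(local_name('http://las.ircc.it/ontology/annotationplatform#point_mutation'))
print(local_name('http://www.wikidata.org/entity/Q28371388'))
print(local_name('G13D'))
```

Checking for the URI scheme first avoids damaging literals that happen to contain a slash, such as protein labels.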
We create the basic data collections used in the following investigations about gene variant-drug matching.
# data collections
cases_per_gene = dict()
cases_per_variant = dict()
cases_per_variant_per_gene = dict()
cases_per_response = dict()
We get cases harboring 1+ variants for each gene in the panel.
Please note that we are counting distinct cases per gene. Therefore, cases harboring multiple variants in the same gene will be counted only once.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX onto: <http://www.ontotext.com/>
select distinct ?case
from onto:disable-sameAs
where {{
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2 ;
:symbol ?geneSymbol
VALUES ?geneSymbol {{'{}'}}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type {{ :sequence_alteration :feature_amplification }}
?ref2 sesame:directType ?type.
}}"""
for gene in PANEL:
print (f'Querying {gene}')
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query.format(gene))
cases_per_gene[gene] = set(result_table['case.value'])
print ('Cases outline:')
for key in cases_per_gene:
print(f'{key}: {len(cases_per_gene[key])}')
Querying EGFR
Querying BRAF
Querying KRAS
Querying ERBB2
Cases outline:
EGFR: 29
BRAF: 13
KRAS: 70
ERBB2: 11
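The query template above is filled in with `str.format`, which is why every literal SPARQL brace is doubled (`{{` and `}}`) while the single `{}` marks the slot for the gene symbol. A shortened template (not the full query) shows the effect:

```python
# literal braces are doubled; the bare {} is the placeholder
template = """
select distinct ?case
where {{
    ?gene :symbol ?geneSymbol
    VALUES ?geneSymbol {{'{}'}}
}}"""

rendered = template.format('KRAS')
print(rendered)
```

After formatting, the doubled braces collapse into single ones and the clause reads `VALUES ?geneSymbol {'KRAS'}`.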
Then, we get cases harboring 1+ :sequence_alteration or :feature_amplification.
Again, please note that we are counting distinct cases, this time per variant type. Therefore, cases harboring multiple variants of the same type will be counted only once.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX onto: <http://www.ontotext.com/>
select distinct ?case
from onto:disable-sameAs
where {{
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2 ;
:symbol ?geneSymbol
VALUES ?geneSymbol {{'KRAS' 'EGFR' 'BRAF' 'ERBB2'}}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type {{ :{} }}
?ref2 sesame:directType ?type.
}}"""
for variant in ['sequence_alteration', 'feature_amplification']:
print (f'Querying {variant}')
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query.format(variant))
cases_per_variant[variant] = set(result_table['case.value'])
print ('Cases outline:')
for variant in cases_per_variant:
print(f'\t{variant} {len(cases_per_variant[variant])}')
Querying sequence_alteration
Querying feature_amplification
Cases outline:
    sequence_alteration 88
    feature_amplification 33
Besides, we get cases harboring 1+ :sequence_alteration or :feature_amplification for each gene in the panel.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX onto: <http://www.ontotext.com/>
select distinct ?case
from onto:disable-sameAs
where {{
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2 ;
:symbol ?geneSymbol
VALUES ?geneSymbol {{'{}'}}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type {{ :{} }}
?ref2 sesame:directType ?type.
}}"""
for variant in ['sequence_alteration', 'feature_amplification']:
print (f'Querying {variant}')
cases_per_variant_per_gene[variant] = dict()
for gene in PANEL:
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query.format(gene, variant))
try:
cases_per_variant_per_gene[variant][gene] = set(result_table['case.value'])
except KeyError:
print (f'no data for {variant} - {gene}')
print ('Cases outline:')
for variant in cases_per_variant_per_gene:
print(f'{variant}')
for gene in cases_per_variant_per_gene[variant]:
print(f'\t{gene} {len(cases_per_variant_per_gene[variant][gene])}')
Querying sequence_alteration
Querying feature_amplification
no data for feature_amplification - BRAF
no data for feature_amplification - KRAS
Cases outline:
sequence_alteration
    EGFR 4
    BRAF 13
    KRAS 70
    ERBB2 5
feature_amplification
    EGFR 26
    ERBB2 7
Variants summary:
Gene | All variant types | :feature_amplification | :sequence_alteration |
---|---|---|---|
Annotated | 113 | 33 | 88 |
BRAF | 13 | 0 | 13 |
EGFR | 29 | 26 | 4 |
ERBB2 | 11 | 7 | 5 |
KRAS | 70 | 0 | 70 |
Finally, we get cases per response type.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX onto: <http://www.ontotext.com/>
select distinct ?case
from onto:disable-sameAs
where {{
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref sesame:directType :{} .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?ref2 .
?gene :has_variant ?ref2 ;
:symbol ?geneSymbol
VALUES ?geneSymbol {{'KRAS' 'EGFR' 'BRAF' 'ERBB2'}}
?ref2 a ?annotation_Type.
VALUES ?annotation_Type {{ :sequence_alteration :feature_amplification }}
?ref2 sesame:directType ?type.
}}"""
for response in ['DRCl_OR', 'DRCl_SD', 'DRCl_PD']:
print (f'Querying {response}')
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query.format(response))
cases_per_response[response] = set(result_table['case.value'])
print ('Cases outline:')
for key in cases_per_response:
print(f'{key}: {len(cases_per_response[key])}')
Querying DRCl_OR
Querying DRCl_SD
Querying DRCl_PD
Cases outline:
DRCl_OR: 7
DRCl_SD: 26
DRCl_PD: 80
Since we are also interested in analyzing variants co-occurrences, we enumerate all possible combinations (i.e., the power set). We will use these data in the next sections.
# create variants co-occurrences list (i.e., the power set of {'BRAF','EGFR','ERBB2','KRAS'})
variants_occurrences = list(utils.powerset(PANEL))
variants_occurrences.sort(key=len)
# just combinatorics
variants_occurrences
[(), ('EGFR',), ('ERBB2',), ('BRAF',), ('KRAS',), ('EGFR', 'KRAS'), ('KRAS', 'ERBB2'), ('BRAF', 'KRAS'), ('EGFR', 'BRAF'), ('BRAF', 'ERBB2'), ('EGFR', 'ERBB2'), ('EGFR', 'BRAF', 'KRAS'), ('BRAF', 'KRAS', 'ERBB2'), ('EGFR', 'BRAF', 'ERBB2'), ('EGFR', 'KRAS', 'ERBB2'), ('EGFR', 'BRAF', 'KRAS', 'ERBB2')]
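The implementation of `utils.powerset` is not shown here. A common way to enumerate a power set in Python is the `itertools` recipe based on `chain` and `combinations`; the sketch below assumes the helper behaves similarly. The panel is sorted first, because iterating over a Python set gives no guaranteed order, which is why the tuples printed above appear in an arbitrary gene order.

```python
from itertools import chain, combinations


def powerset(items):
    """Return an iterator over every subset of items, as tuples, smallest first."""
    pool = list(items)
    return chain.from_iterable(
        combinations(pool, size) for size in range(len(pool) + 1)
    )


subsets = list(powerset(sorted({'BRAF', 'EGFR', 'ERBB2', 'KRAS'})))
print(len(subsets))
print(subsets[:5])
```

With 4 genes the power set has 2^4 = 16 elements, including the empty tuple, which matches the list printed above.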
We analyze all annotated cases and we match drug information with gene variants data.
# the investigation set
tot = cases_per_gene['BRAF'] | cases_per_gene['EGFR'] | cases_per_gene['ERBB2'] | cases_per_gene['KRAS']
print(len(tot))
113
# create a new Semalytics analysis object
a = utils.Analysis(tot, cases_per_gene, cases_per_response, variants_occurrences)
# gene variants
a.variants
| gene | cases |
---|---|---|
0 | Annotated | 113 |
1 | BRAF | 13 |
2 | EGFR | 29 |
3 | ERBB2 | 11 |
4 | KRAS | 70 |
# plot variants distribution
a.plot_variants()
# responses
a.responses
| response_type | cases |
---|---|---|
0 | response | 7 |
1 | neutral | 26 |
2 | progression | 80 |
# plot responses
a.plot_responses()
# variants vs responses
a.matching.fillna("")
| genes | DRCl_PD | DRCl_SD | DRCl_OR | tot | progression | neutral | response |
---|---|---|---|---|---|---|---|---|
0 | (EGFR,) | 10 | 14 | 5 | 29 | 0.344828 | 0.482759 | 0.172414 |
1 | (ERBB2,) | 8 | 1 | 2 | 11 | 0.727273 | 0.0909091 | 0.181818 |
2 | (BRAF,) | 13 | 0 | 0 | 13 | 1 | 0 | 0 |
3 | (KRAS,) | 56 | 14 | 0 | 70 | 0.8 | 0.2 | 0 |
4 | (EGFR, KRAS) | 3 | 3 | 0 | 6 | 0.5 | 0.5 | 0 |
5 | (KRAS, ERBB2) | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
6 | (BRAF, KRAS) | 2 | 0 | 0 | 2 | 1 | 0 | 0 |
7 | (EGFR, BRAF) | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
8 | (BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
9 | (EGFR, ERBB2) | 0 | 0 | 0 | 0 | |||
10 | (EGFR, BRAF, KRAS) | 0 | 0 | 0 | 0 | |||
11 | (BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
12 | (EGFR, BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
13 | (EGFR, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
14 | (EGFR, BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 |
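The internals of `utils.Analysis` are not shown here. In the table above, the (EGFR,) row counts 29 cases, which equals the number of all EGFR-annotated cases even though 6 of them also carry KRAS variants. This suggests that each row counts the cases harboring variants in at least the listed genes, so rows are not mutually exclusive. The sketch below reproduces that logic on toy data; the function name `match` and the case identifiers `c1` to `c4` are hypothetical, and it assumes that every case has exactly one response class.

```python
def match(cases, cases_per_gene, cases_per_response, gene_combinations):
    """Count responses among cases with variants in all genes of each combination."""
    rows = []
    for genes in gene_combinations:
        if not genes:
            continue  # skip the empty combination
        carriers = set(cases)
        for gene in genes:
            carriers &= cases_per_gene.get(gene, set())
        counts = {
            response: len(carriers & members)
            for response, members in cases_per_response.items()
        }
        total = sum(counts.values())
        # fractions are undefined (None) when no case matches the combination
        fractions = {
            response: (n / total if total else None)
            for response, n in counts.items()
        }
        rows.append({'genes': genes, 'counts': counts, 'tot': total, 'fractions': fractions})
    return rows


# toy data, not taken from the Semalytics database
toy_cases = {'c1', 'c2', 'c3', 'c4'}
toy_genes = {'KRAS': {'c1', 'c2', 'c3'}, 'EGFR': {'c3', 'c4'}}
toy_responses = {'DRCl_PD': {'c1', 'c2'}, 'DRCl_SD': {'c3'}, 'DRCl_OR': {'c4'}}

table = match(toy_cases, toy_genes, toy_responses,
              [('EGFR',), ('KRAS',), ('EGFR', 'KRAS')])
for row in table:
    print(row['genes'], row['counts'], row['tot'])
```

In the toy data, case `c3` carries variants in both genes, so it contributes to all three rows, just as the EGFR and KRAS co-occurrences do in the table above.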
# plot matching
a.plot_matching()
feature_amplification only
We analyze cases with 1+ feature_amplification only (i.e., with no sequence_alteration).
# create the subset
tot = cases_per_variant['feature_amplification'] - cases_per_variant['sequence_alteration']
print(len(tot))
25
# create a new Semalytics analysis object
a = utils.Analysis(tot, cases_per_gene, cases_per_response, variants_occurrences)
# gene variants
a.variants
| gene | cases |
---|---|---|
0 | Annotated | 25 |
1 | BRAF | 0 |
2 | EGFR | 19 |
3 | ERBB2 | 6 |
4 | KRAS | 0 |
# plot variants distribution
a.plot_variants()
# responses
a.responses
| response_type | cases |
---|---|---|
0 | response | 7 |
1 | neutral | 9 |
2 | progression | 9 |
# plot responses
a.plot_responses()
# variants vs responses
a.matching.fillna("")
| genes | DRCl_PD | DRCl_SD | DRCl_OR | tot | progression | neutral | response |
---|---|---|---|---|---|---|---|---|
0 | (EGFR,) | 5 | 9 | 5 | 19 | 0.263158 | 0.473684 | 0.263158 |
1 | (ERBB2,) | 4 | 0 | 2 | 6 | 0.666667 | 0 | 0.333333 |
2 | (BRAF,) | 0 | 0 | 0 | 0 | |||
3 | (KRAS,) | 0 | 0 | 0 | 0 | |||
4 | (EGFR, KRAS) | 0 | 0 | 0 | 0 | |||
5 | (KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
6 | (BRAF, KRAS) | 0 | 0 | 0 | 0 | |||
7 | (EGFR, BRAF) | 0 | 0 | 0 | 0 | |||
8 | (BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
9 | (EGFR, ERBB2) | 0 | 0 | 0 | 0 | |||
10 | (EGFR, BRAF, KRAS) | 0 | 0 | 0 | 0 | |||
11 | (BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
12 | (EGFR, BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
13 | (EGFR, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
14 | (EGFR, BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 |
# plot matching
a.plot_matching()
sequence_alteration only
We analyze cases with 1+ sequence_alteration only (i.e., with no feature_amplification).
# create the subset
tot = cases_per_variant['sequence_alteration'] - cases_per_variant['feature_amplification']
print (len(tot))
80
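The two subset sizes can be cross-checked against the counts obtained earlier with the inclusion-exclusion principle. The sketch assumes that the 113 annotated cases are exactly the union of the cases with 1+ feature_amplification (33) and the cases with 1+ sequence_alteration (88).

```python
amplified = 33    # cases with 1+ feature_amplification
altered = 88      # cases with 1+ sequence_alteration
annotated = 113   # cases in the investigation set (assumed union of the two)

# |A ∩ B| = |A| + |B| - |A ∪ B|
both = amplified + altered - annotated
only_amplified = amplified - both
only_altered = altered - both
print(both, only_amplified, only_altered)
```

The result, 8 cases with both variant types, 25 with amplifications only, and 80 with sequence alterations only, agrees with the sizes of the two subsets computed with the set differences.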
# create a new Semalytics analysis object
a = utils.Analysis(tot, cases_per_gene, cases_per_response, variants_occurrences)
# gene variants
a.variants
| gene | cases |
---|---|---|
0 | Annotated | 80 |
1 | BRAF | 12 |
2 | EGFR | 3 |
3 | ERBB2 | 4 |
4 | KRAS | 65 |
# plot variants distribution
a.plot_variants()
# responses
a.responses
| response_type | cases |
---|---|---|
0 | response | 0 |
1 | neutral | 14 |
2 | progression | 66 |
# plot responses
a.plot_responses()
# variants vs responses
a.matching.fillna("")
| genes | DRCl_PD | DRCl_SD | DRCl_OR | tot | progression | neutral | response |
---|---|---|---|---|---|---|---|---|
0 | (EGFR,) | 1 | 2 | 0 | 3 | 0.333333 | 0.666667 | 0 |
1 | (ERBB2,) | 3 | 1 | 0 | 4 | 0.75 | 0.25 | 0 |
2 | (BRAF,) | 12 | 0 | 0 | 12 | 1 | 0 | 0 |
3 | (KRAS,) | 54 | 11 | 0 | 65 | 0.830769 | 0.169231 | 0 |
4 | (EGFR, KRAS) | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
5 | (KRAS, ERBB2) | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
6 | (BRAF, KRAS) | 2 | 0 | 0 | 2 | 1 | 0 | 0 |
7 | (EGFR, BRAF) | 0 | 0 | 0 | 0 | |||
8 | (BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
9 | (EGFR, ERBB2) | 0 | 0 | 0 | 0 | |||
10 | (EGFR, BRAF, KRAS) | 0 | 0 | 0 | 0 | |||
11 | (BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
12 | (EGFR, BRAF, ERBB2) | 0 | 0 | 0 | 0 | |||
13 | (EGFR, KRAS, ERBB2) | 0 | 0 | 0 | 0 | |||
14 | (EGFR, BRAF, KRAS, ERBB2) | 0 | 0 | 0 | 0 |
# plot matching
a.plot_matching()
In this section, we query data with extended knowledge. The platform connects to Wikidata by leveraging owl:sameAs predicates.
The SPARQL endpoint of Semalytics is federated with the Wikidata one (https://query.wikidata.org/sparql).
See also this Web page for other Wikidata examples related to life sciences. Those queries can also be used for querying local data in Semalytics.
We get chemical compounds (Q11173) that physically interact (P129), with a specific role (P2868), with products encoded by genes in the investigation panel.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?geneSymbol ?drugLabel ?roleLabel ?gene_productLabel
where {
# Wikidata endpoint
SERVICE <https://query.wikidata.org/sparql> {
?chem p:P129 [
ps:P129 ?gene_product ;
pq:P2868 ?role ] .
?chem wdt:P31 wd:Q11173 .
?gene_product wdt:P702 ?gene .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
?chem rdfs:label ?drugLabel .
?gene_product rdfs:label ?gene_productLabel .
?role rdfs:label ?roleLabel .
}
}
#local data
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
}
order by ?geneSymbol
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# there you go!
result_table[['geneSymbol.value', 'drugLabel.value', 'roleLabel.value', 'gene_productLabel.value']]
| geneSymbol.value | drugLabel.value | roleLabel.value | gene_productLabel.value |
---|---|---|---|---|
0 | BRAF | dabrafenib | enzyme inhibitor | B-Raf proto-oncogene, serine/threonine kinase |
1 | BRAF | regorafenib | enzyme inhibitor | B-Raf proto-oncogene, serine/threonine kinase |
2 | BRAF | sorafenib | enzyme inhibitor | B-Raf proto-oncogene, serine/threonine kinase |
3 | BRAF | vemurafenib | enzyme inhibitor | B-Raf proto-oncogene, serine/threonine kinase |
4 | EGFR | icotinib | enzyme inhibitor | Epidermal growth factor receptor |
5 | EGFR | dacomitinib | enzyme inhibitor | Epidermal growth factor receptor |
6 | EGFR | osimertinib | enzyme inhibitor | Epidermal growth factor receptor |
7 | EGFR | gefitinib | enzyme inhibitor | Epidermal growth factor receptor |
8 | EGFR | erlotinib | enzyme inhibitor | Epidermal growth factor receptor |
9 | EGFR | lapatinib | enzyme inhibitor | Epidermal growth factor receptor |
10 | EGFR | 5-chloro-N2-(4-(4-(dimethylamino)-1-piperidinyl)-2-methoxyphenyl)-N4-(2-(dimethylphosphinyl)phenyl)-2,4-pyrimidinediamine | enzyme inhibitor | Epidermal growth factor receptor |
11 | EGFR | afatinib | enzyme inhibitor | Epidermal growth factor receptor |
12 | EGFR | canertinib | enzyme inhibitor | Epidermal growth factor receptor |
13 | EGFR | neratinib | enzyme inhibitor | Epidermal growth factor receptor |
14 | EGFR | vandetanib | enzyme inhibitor | Epidermal growth factor receptor |
15 | ERBB2 | dacomitinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
16 | ERBB2 | lapatinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
17 | ERBB2 | afatinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
18 | ERBB2 | canertinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
19 | ERBB2 | mubritinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
20 | ERBB2 | neratinib | enzyme inhibitor | Erb-b2 receptor tyrosine kinase 2 |
21 | KRAS | lonafarnib | enzyme inhibitor | KRAS proto-oncogene, GTPase |
Now we get from Wikidata the chemical formula (P274) of one of those drugs, dabrafenib...
my_query = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT *
WHERE
{
wd:Q3011604 wdt:P274 ?chem .
}
"""
# get data
result_table = utils.query(WIKIDATA_ENDPOINT, my_query)
result_table
| chem.type | chem.value |
---|---|---|
0 | literal | C₂₃H₂₀F₃N₅O₂S₂ |
...as well as its chemical structure (P117).
my_query = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT *
WHERE
{
wd:Q3011604 wdt:P117 ?struct .
}
"""
# get data
result_table = utils.query(WIKIDATA_ENDPOINT, my_query)
display(SVG(url=result_table['struct.value'][0]))
print (f'live rendering from Wikidata of {result_table["struct.value"][0]}')
live rendering from Wikidata of http://commons.wikimedia.org/wiki/Special:FilePath/Dabrafenib.svg
Finally, we get the medical conditions treated (P2175), the related data source (1), and the information retrieval date.
(1) "dataset containing drug indications extracted from the FDA Adverse Event Reporting System"
my_query = """
SELECT ?medical_conditionLabel ?referenceLabel ?date
WHERE
{
wd:Q3011604 p:P2175 [
ps:P2175 ?medical_condition ;
prov:wasDerivedFrom ?source
].
?source pr:P248 ?reference ;
pr:P813 ?date
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
# get data
result_table = utils.query(WIKIDATA_ENDPOINT, my_query)
# there you go
result_table[['medical_conditionLabel.value', 'referenceLabel.value', 'date.value']]
| medical_conditionLabel.value | referenceLabel.value | date.value |
---|---|---|---|
0 | non-small-cell lung carcinoma | Drug Indications Extracted from FAERS | 2018-10-02T00:00:00Z |
1 | skin cancer | Drug Indications Extracted from FAERS | 2018-10-02T00:00:00Z |
2 | metastatic melanoma | Drug Indications Extracted from FAERS | 2018-10-02T00:00:00Z |
3 | melanoma | Drug Indications Extracted from FAERS | 2018-10-02T00:00:00Z |
We can also query only cases with variants mapped to Wikidata. Those are entry points for knowledge enrichment. The column alt_p.value represents the type of point_mutation.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
select distinct ?case ?variant ?geneSymbol ?alt_p ?annotation_Type
where {
SERVICE <https://query.wikidata.org/sparql> {
?variant wdt:P3329 ?id .
}
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :drug_response .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?variant .
?gene :has_variant ?variant.
OPTIONAL {?variant :alt_p ?alt_p }
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?variant a ?annotation_Type.
VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
"""
# get data
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# filter URIs prefixes
utils.filter_prefixes(result_table)
# there you go
result_table[['case.value', 'variant.value', 'geneSymbol.value', 'alt_p.value', 'annotation_Type.value']].fillna("")
| case.value | variant.value | geneSymbol.value | alt_p.value | annotation_Type.value |
---|---|---|---|---|---|
0 | CRC0058 | Q28371388 | KRAS | G12V | sequence_alteration |
1 | CRC0063 | Q28371388 | KRAS | G12V | sequence_alteration |
2 | CRC0309 | Q28371388 | KRAS | G12V | sequence_alteration |
3 | CRC0468 | Q28371388 | KRAS | G12V | sequence_alteration |
4 | CRC0265 | Q28371388 | KRAS | G12V | sequence_alteration |
5 | CRC0261 | Q28371388 | KRAS | G12V | sequence_alteration |
6 | CRC0187 | Q28371388 | KRAS | G12V | sequence_alteration |
7 | CRC0242 | Q28371388 | KRAS | G12V | sequence_alteration |
8 | CRC0324 | Q28371388 | KRAS | G12V | sequence_alteration |
9 | CRC0184 | Q28371388 | KRAS | G12V | sequence_alteration |
10 | CRC0165 | Q28371388 | KRAS | G12V | sequence_alteration |
11 | CRC0021 | Q28371388 | KRAS | G12V | sequence_alteration |
12 | CRC0060 | Q28371388 | KRAS | G12V | sequence_alteration |
13 | CRC0753 | Q28371388 | KRAS | G12V | sequence_alteration |
14 | CRC0055 | Q28371388 | KRAS | G12V | sequence_alteration |
15 | CRC0354 | Q29938363 | KRAS | Q61H | sequence_alteration |
16 | CRC0139 | Q29938363 | KRAS | Q61H | sequence_alteration |
17 | CRC0438 | Q29938368 | KRAS | Q61K | sequence_alteration |
18 | CRC0024 | Q32948338 | KRAS | G13C | sequence_alteration |
19 | CRC0315 | Q28371015 | KRAS | G13D | sequence_alteration |
20 | CRC0127 | Q28371015 | KRAS | G13D | sequence_alteration |
21 | CRC0071 | Q28371015 | KRAS | G13D | sequence_alteration |
22 | CRC0237 | Q28371015 | KRAS | G13D | sequence_alteration |
23 | CRC0018 | Q28371015 | KRAS | G13D | sequence_alteration |
24 | CRC0019 | Q28371015 | KRAS | G13D | sequence_alteration |
25 | CRC0479 | Q28371015 | KRAS | G13D | sequence_alteration |
26 | CRC0504 | Q28371015 | KRAS | G13D | sequence_alteration |
27 | CRC0714 | Q28371015 | KRAS | G13D | sequence_alteration |
28 | CRC0149 | Q28371015 | KRAS | G13D | sequence_alteration |
29 | CRC0082 | Q28371015 | KRAS | G13D | sequence_alteration |
... | ... | ... | ... | ... | ... |
87 | CRC0362 | Q28444964 | EGFR | | feature_amplification |
88 | CRC0327 | Q28444964 | EGFR | | feature_amplification |
89 | CRC0328 | Q28444964 | EGFR | | feature_amplification |
90 | CRC0449 | Q28444964 | EGFR | | feature_amplification |
91 | CRC0481 | Q28444964 | EGFR | | feature_amplification |
92 | CRC0527 | Q28444964 | EGFR | | feature_amplification |
93 | CRC0537 | Q28444964 | EGFR | | feature_amplification |
94 | CRC0542 | Q28444964 | EGFR | | feature_amplification |
95 | CRC1278 | Q28444964 | EGFR | | feature_amplification |
96 | CRC0480 | Q21851559 | BRAF | V600E | sequence_alteration |
97 | CRC0528 | Q21851559 | BRAF | V600E | sequence_alteration |
98 | CRC0323 | Q21851559 | BRAF | V600E | sequence_alteration |
99 | CRC0118 | Q21851559 | BRAF | V600E | sequence_alteration |
100 | CRC0106 | Q21851559 | BRAF | V600E | sequence_alteration |
101 | CRC0484 | Q21851559 | BRAF | V600E | sequence_alteration |
102 | CRC0079 | Q21851559 | BRAF | V600E | sequence_alteration |
103 | CRC0150 | Q50092868 | BRAF | G466V | sequence_alteration |
104 | CRC1138 | Q28371540 | BRAF | K601E | sequence_alteration |
105 | CRC1063 | Q28371540 | BRAF | K601E | sequence_alteration |
106 | CRC0504 | Q28370981 | ERBB2 | R678Q | sequence_alteration |
107 | CRC0126 | Q28370984 | ERBB2 | V777L | sequence_alteration |
108 | CRC0131 | Q28370984 | ERBB2 | V777L | sequence_alteration |
109 | CRC0124 | Q29938313 | ERBB2 | H878Y | sequence_alteration |
110 | CRC0124 | Q27908387 | ERBB2 | feature_amplification | |
111 | CRC0080 | Q27908387 | ERBB2 | feature_amplification | |
112 | CRC0185 | Q27908387 | ERBB2 | feature_amplification | |
113 | CRC0186 | Q27908387 | ERBB2 | feature_amplification | |
114 | CRC0112 | Q27908387 | ERBB2 | feature_amplification | |
115 | CRC0729 | Q27908387 | ERBB2 | feature_amplification | |
116 | CRC0743 | Q27908387 | ERBB2 | feature_amplification | |
117 rows × 5 columns
We can use the variant occurrences annotated in the local database to query Wikidata for the drugs to which each variant predicts a positive response. We also retrieve the scientific article that supports each piece of evidence and the medical condition treated.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?geneSymbol ?variantLabel ?treatmentLabel ?diseaseLabel ?referenceLabel
where {
SERVICE <https://query.wikidata.org/sparql> {
?variant wdt:P3329 ?id .
?variant p:P3354 [ ps:P3354 ?treatment ;
pq:P2175 ?disease ;
prov:wasDerivedFrom ?source ].
?source pr:P248 ?reference
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
?variant rdfs:label ?variantLabel .
?treatment rdfs:label ?treatmentLabel .
?disease rdfs:label ?diseaseLabel .
?reference rdfs:label ?referenceLabel
}
}
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :drug_response .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?variant .
?gene :has_variant ?variant.
OPTIONAL {?variant :alt_p ?alt_p }
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?variant a ?annotation_Type.
VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
order by ?geneSymbol
"""
# run the federated query against the Semalytics endpoint
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# display the selected result columns
result_table[['geneSymbol.value', 'variantLabel.value', 'treatmentLabel.value', 'diseaseLabel.value', 'referenceLabel.value']]
geneSymbol.value | variantLabel.value | treatmentLabel.value | diseaseLabel.value | referenceLabel.value | |
---|---|---|---|---|---|
0 | BRAF | BRAF G466V | vemurafenib | cancer | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
1 | BRAF | BRAF K601E | vemurafenib | skin melanoma | BRAF(L597) mutations in melanoma are associated with sensitivity to MEK inhibitors |
2 | BRAF | BRAF K601E | trametinib | skin melanoma | BRAF(L597) mutations in melanoma are associated with sensitivity to MEK inhibitors |
3 | BRAF | BRAF V600E | cobimetinib fumarate | cancer | Mechanism of MEK inhibition determines efficacy in mutant KRAS- versus BRAF-driven cancers |
4 | BRAF | BRAF V600E | irinotecan / Panitumumab / vemurafenib combination therapy | cholangiocarcinoma | Complete Clinical Response of BRAF-Mutated Cholangiocarcinoma to Vemurafenib, Panitumumab, and Irinotecan |
5 | BRAF | BRAF V600E | vemurafenib | ovarian cancer | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
6 | BRAF | BRAF V600E | Dabrafenib / Trametinib combination therapy | melanoma | Combined BRAF and MEK inhibition in melanoma with BRAF V600 mutations |
7 | BRAF | BRAF V600E | vemurafenib | melanoma | Safety and efficacy of vemurafenib in BRAF(V600E) and BRAF(V600K) mutation-positive melanoma (BRIM-3): extended follow-up of a phase 3, randomised, open-label study |
8 | BRAF | BRAF V600E | Dabrafenib / Trametinib combination therapy | melanoma | Dabrafenib and trametinib, alone and in combination for BRAF-mutant metastatic melanoma. |
9 | BRAF | BRAF V600E | vemurafenib / cobimetinib fumarate combination therapy | melanoma | Combined vemurafenib and cobimetinib in BRAF-mutated melanoma. |
10 | BRAF | BRAF V600E | pictilisib | melanoma | First-in-human phase I study of pictilisib (GDC-0941), a potent pan-class I phosphatidylinositol-3-kinase (PI3K) inhibitor, in patients with advanced solid tumors. |
11 | BRAF | BRAF V600E | selumetinib / dactolisib combination therapy | melanoma | Primary cross-resistance to BRAFV600E-, MEK1/2- and PI3K/mTOR-specific inhibitors in BRAF-mutant melanoma cells counteracted by dual pathway blockade |
12 | BRAF | BRAF V600E | vemurafenib | melanoma | Inhibition of Mutated, Activated BRAF in Metastatic Melanoma |
13 | BRAF | BRAF V600E | Dabrafenib / Trametinib combination therapy | melanoma | Combined BRAF and MEK inhibition versus BRAF inhibition alone in melanoma. |
14 | BRAF | BRAF V600E | Dabrafenib / Trametinib combination therapy | melanoma | Adjuvant Dabrafenib plus Trametinib in Stage III BRAF-Mutated Melanoma. |
15 | BRAF | BRAF V600E | trametinib / vemurafenib / dabrafenib combination therapy | gastrointestinal neuroendocrine tumor | BRAFV600E Mutations in High-Grade Colorectal Neuroendocrine Tumors May Predict Responsiveness to BRAF-MEK Combination Therapy. |
16 | BRAF | BRAF V600E | Panitumumab / Trametinib combination therapy | colorectal adenocarcinoma | Combined BRAF, EGFR, and MEK Inhibition in Patients with BRAFV600E-Mutant Colorectal Cancer. |
17 | BRAF | BRAF V600E | vemurafenib | laryngeal squamous cell carcinoma | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
18 | BRAF | BRAF V600E | vemurafenib | skin melanoma | Survival in BRAF V600-mutant advanced melanoma treated with vemurafenib |
19 | BRAF | BRAF V600E | vemurafenib | skin melanoma | Improved survival with vemurafenib in melanoma with BRAF V600E mutation |
20 | BRAF | BRAF V600E | Dabrafenib / Trametinib combination therapy | skin melanoma | Improved overall survival in melanoma with combined dabrafenib and trametinib. |
21 | BRAF | BRAF V600E | Sorafenib / Panitumumab combination therapy | colorectal cancer | Wild-type BRAF is required for response to panitumumab or cetuximab in metastatic colorectal cancer. |
22 | BRAF | BRAF V600E | Capecitabine / Vemurafenib / Bevacizumab combination therapy | colorectal cancer | Antitumor activity of BRAF inhibitor vemurafenib in preclinical models of BRAF-mutant colorectal cancer. |
23 | BRAF | BRAF V600E | vemurafenib | colorectal cancer | Antitumor activity of BRAF inhibitor vemurafenib in preclinical models of BRAF-mutant colorectal cancer. |
24 | BRAF | BRAF V600E | Vemurafenib / Gefitinib / Cetuximab combination therapy | colorectal cancer | Unresponsiveness of colon cancer to BRAF(V600E) inhibition through feedback activation of EGFR. |
25 | BRAF | BRAF V600E | dabrafenib | colorectal cancer | Dabrafenib in patients with melanoma, untreated brain metastases, and other solid tumours: a phase 1 dose-escalation trial. |
26 | BRAF | BRAF V600E | dactolisib / GDC-0879 combination therapy | colorectal cancer | Concomitant BRAF and PI3K/mTOR blockade is required for effective treatment of BRAF(V600E) colorectal cancer. |
27 | BRAF | BRAF V600E | PLX4720 / GDC0941combination therapy | colorectal cancer | A genetic progression model of Braf(V600E)-induced intestinal tumorigenesis reveals targets for therapeutic intervention. |
28 | BRAF | BRAF V600E | Vemurafenib / Panitumumab combination therapy | colorectal cancer | Pilot trial of combined BRAF and EGFR inhibition in BRAF-mutant metastatic colorectal cancer patients. |
29 | BRAF | BRAF V600E | vemurafenib | colorectal cancer | Phase II Pilot Study of Vemurafenib in Patients With Metastatic BRAF-Mutated Colorectal Cancer. |
... | ... | ... | ... | ... | ... |
84 | ERBB2 | ERBB2 AMPLIFICATION | ado-trastuzumab emtansine | Her2-receptor positive breast cancer | Phase II study of the antibody drug conjugate trastuzumab-DM1 for the treatment of human epidermal growth factor receptor 2 (HER2)-positive breast cancer after prior HER2-directed therapy. |
85 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | scrotum Paget's disease | Metastatic Extramammary Paget's Disease of Scrotum Responds Completely to Single Agent Trastuzumab in a Hemodialysis Patient: Case Report, Molecular Profiling and Brief Review of the Literature. |
86 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | gastric adenocarcinoma | Trastuzumab in combination with chemotherapy versus chemotherapy alone for treatment of HER2-positive advanced gastric or gastro-oesophageal junction cancer (ToGA): a phase 3, open-label, randomised controlled trial. |
87 | ERBB2 | ERBB2 AMPLIFICATION | lapatinib | gastric adenocarcinoma | Lapatinib plus paclitaxel versus paclitaxel alone in the second-line treatment of HER2-amplified advanced gastric cancer in Asian populations: TyTAN--a randomized, phase III study. |
88 | ERBB2 | ERBB2 AMPLIFICATION | Pertuzumab / Trastuzumab combination therapy | bladder carcinoma | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
89 | ERBB2 | ERBB2 AMPLIFICATION | afatinib | pancreatic adenocarcinoma | Afatinib, an Irreversible EGFR Family Inhibitor, Shows Activity Toward Pancreatic Cancer Cells, Alone and in Combination with Radiotherapy, Independent of KRAS Status. |
90 | ERBB2 | ERBB2 AMPLIFICATION | Pertuzumab / Trastuzumab combination therapy | biliary tract cancer | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
91 | ERBB2 | ERBB2 AMPLIFICATION | Trastuzumab / Lapatinib combination therapy | colorectal cancer | Dual-targeted therapy with trastuzumab and lapatinib in treatment-refractory, KRAS codon 12/13 wild-type, HER2-positive metastatic colorectal cancer (HERACLES): a proof-of-concept, multicentre, open-label, phase 2 trial. |
92 | ERBB2 | ERBB2 AMPLIFICATION | Pertuzumab / Trastuzumab combination therapy | colorectal cancer | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
93 | ERBB2 | ERBB2 AMPLIFICATION | Trastuzumab / irinotecan combination therapy | lung small cell carcinoma | Favorable response to trastuzumab plus irinotecan combination therapy in two patients with HER2-positive relapsed small-cell lung cancer. |
94 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | uterine corpus serous adenocarcinoma | Trastuzumab treatment in patients with advanced or recurrent endometrial carcinoma overexpressing HER2/neu. |
95 | ERBB2 | ERBB2 AMPLIFICATION | Pertuzumab / Trastuzumab combination therapy | pancreatic cancer | Targeted Therapy for Advanced Solid Tumors on the Basis of Molecular Profiles: Results From MyPathway, an Open-Label, Phase IIa Multiple Basket Study. |
96 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | non-small-cell lung carcinoma | Randomized phase II trial of gemcitabine-cisplatin with or without trastuzumab in HER2-positive non-small-cell lung cancer. |
97 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | non-small-cell lung carcinoma | Trastuzumab plus docetaxel in HER2/neu-positive non-small-cell lung cancer: a California Cancer Consortium screening and phase II trial. |
98 | ERBB2 | ERBB2 AMPLIFICATION | ado-trastuzumab emtansine | non-small-cell lung carcinoma | Trastuzumab emtansine is active on HER-2 overexpressing NSCLC cell lines and overcomes gefitinib resistance. |
99 | ERBB2 | ERBB2 AMPLIFICATION | dacomitinib | non-small-cell lung carcinoma | Targeting HER2 aberrations as actionable drivers in lung cancers: phase II trial of the pan-HER tyrosine kinase inhibitor dacomitinib in patients with HER2-mutant or amplified tumors |
100 | ERBB2 | ERBB2 AMPLIFICATION | trastuzumab | endometrial cancer | Phase II trial of trastuzumab in women with advanced or recurrent, HER2-positive endometrial carcinoma: a Gynecologic Oncology Group study. |
101 | KRAS | KRAS A146T | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
102 | KRAS | KRAS A146V | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
103 | KRAS | KRAS A146V | abemaciclib | non-small-cell lung carcinoma | Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. |
104 | KRAS | KRAS G12C | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
105 | KRAS | KRAS G12C | selumetinib / docetaxel trihydrate combination therapy | non-small-cell lung carcinoma | Impact of KRAS codon subtypes from a randomised phase II trial of selumetinib plus docetaxel in KRAS mutant advanced non-small-cell lung cancer |
106 | KRAS | KRAS G12D | MK-2206 | pancreatic carcinoma | First-in-man clinical trial of the oral pan-AKT inhibitor MK-2206 in patients with advanced solid tumors. |
107 | KRAS | KRAS G12D | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
108 | KRAS | KRAS G12D | selumetinib / dactolisib combination therapy | non-small-cell lung carcinoma | Effective use of PI3K and MEK inhibitors to treat mutant Kras G12D and PIK3CA H1047R murine lung cancers. |
109 | KRAS | KRAS G12V | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
110 | KRAS | KRAS G12V | palbociclib | non-small-cell lung carcinoma | A synthetic lethal interaction between K-Ras oncogenes and Cdk4 unveils a therapeutic strategy for non-small cell lung carcinoma. |
111 | KRAS | KRAS G12V | selumetinib / docetaxel trihydrate combination therapy | non-small-cell lung carcinoma | Impact of KRAS codon subtypes from a randomised phase II trial of selumetinib plus docetaxel in KRAS mutant advanced non-small-cell lung cancer |
112 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Association of KRAS p.G13D mutation with outcome in patients with chemotherapy-refractory metastatic colorectal cancer treated with cetuximab. |
113 | KRAS | KRAS G13D | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
114 rows × 5 columns
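With more than a hundred rows, it can help to condense the table into the number of supporting references per variant, treatment, and disease. The following is a minimal sketch of that idea, not part of the original notebook: the rows are a hand-copied excerpt of the table above (row numbers in the comments, reference titles omitted for brevity), and the name `evidence_counts` is hypothetical. Because the query uses `select distinct`, each row corresponds to one distinct reference label, so counting rows per group counts references.

```python
from collections import Counter

# Hand-copied excerpt of the result table above:
# (variantLabel.value, treatmentLabel.value, diseaseLabel.value)
rows = [
    ("BRAF V600E", "Dabrafenib / Trametinib combination therapy", "melanoma"),  # row 6
    ("BRAF V600E", "vemurafenib", "melanoma"),                                  # row 7
    ("BRAF V600E", "Dabrafenib / Trametinib combination therapy", "melanoma"),  # row 8
    ("BRAF V600E", "vemurafenib", "melanoma"),                                  # row 12
    ("BRAF V600E", "Dabrafenib / Trametinib combination therapy", "melanoma"),  # row 13
    ("BRAF V600E", "Dabrafenib / Trametinib combination therapy", "melanoma"),  # row 14
    ("KRAS G13D", "Cetuximab", "colorectal cancer"),                            # row 112
]

# One row per distinct reference, so the count per key is the number of references.
evidence_counts = Counter(rows)

# Print the groups from the best supported to the least supported.
for (variant, treatment, disease), n_refs in evidence_counts.most_common():
    print(f"{variant} | {treatment} | {disease}: {n_refs} reference(s)")
```

On this excerpt, the Dabrafenib / Trametinib combination in melanoma is backed by four references, while the KRAS G13D and Cetuximab pair is backed by one. In the notebook itself, the same tuples could presumably be built from the corresponding columns of `result_table`.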
We can use the variant occurrences annotated in the local database to query Wikidata for the drugs to which each variant predicts a negative response. We also retrieve the scientific article that supports each piece of evidence and the medical condition treated.
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?geneSymbol ?variantLabel ?treatmentLabel ?diseaseLabel ?referenceLabel
where {
SERVICE <https://query.wikidata.org/sparql> {
?variant wdt:P3329 ?id .
?variant p:P3355 [ ps:P3355 ?treatment ;
pq:P2175 ?disease ;
prov:wasDerivedFrom ?source ].
?source pr:P248 ?reference
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
?variant rdfs:label ?variantLabel .
?treatment rdfs:label ?treatmentLabel .
?disease rdfs:label ?diseaseLabel .
?reference rdfs:label ?referenceLabel
}
}
?case a :Case ;
:hasDescendant ?mouse ;
:hasDescendant ?node .
?mouse a :Biomouse ;
:has_annotation ?ann .
?ann :has_reference ?ref .
?ref a :drug_response .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?variant .
?gene :has_variant ?variant.
OPTIONAL {?variant :alt_p ?alt_p }
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?variant a ?annotation_Type.
VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
order by ?geneSymbol
"""
# run the federated query against the Semalytics endpoint
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# display the selected result columns
result_table[['geneSymbol.value', 'variantLabel.value', 'treatmentLabel.value', 'diseaseLabel.value', 'referenceLabel.value']]
geneSymbol.value | variantLabel.value | treatmentLabel.value | diseaseLabel.value | referenceLabel.value | |
---|---|---|---|---|---|
0 | BRAF | BRAF V600E | vemurafenib | melanoma | Loss of NF1 in cutaneous melanoma is associated with RAS activation and MEK dependence. |
1 | BRAF | BRAF V600E | pd-0325901 | melanoma | Loss of NF1 in cutaneous melanoma is associated with RAS activation and MEK dependence. |
2 | BRAF | BRAF V600E | trametinib | melanoma | Loss of NF1 in cutaneous melanoma is associated with RAS activation and MEK dependence. |
3 | BRAF | BRAF V600E | Panitumumab | colorectal cancer | Meta-analysis of BRAF mutation as a predictive biomarker of benefit from anti-EGFR monoclonal antibody therapy for RAS wild-type metastatic colorectal cancer |
4 | BRAF | BRAF V600E | Cetuximab | colorectal cancer | Meta-analysis of BRAF mutation as a predictive biomarker of benefit from anti-EGFR monoclonal antibody therapy for RAS wild-type metastatic colorectal cancer |
5 | BRAF | BRAF V600E | Panitumumab | colorectal cancer | Wild-type BRAF is required for response to panitumumab or cetuximab in metastatic colorectal cancer. |
6 | BRAF | BRAF V600E | Cetuximab | colorectal cancer | Wild-type BRAF is required for response to panitumumab or cetuximab in metastatic colorectal cancer. |
7 | BRAF | BRAF V600E | Cetuximab | colorectal cancer | Effects of KRAS, BRAF, NRAS, and PIK3CA mutations on the efficacy of cetuximab plus chemotherapy in chemotherapy-refractory metastatic colorectal cancer: a retrospective consortium analysis. |
8 | BRAF | BRAF V600E | Oxaliplatin | colorectal cancer | Prognostic and predictive value of common mutations for treatment response and survival in patients with metastatic colorectal cancer. |
9 | BRAF | BRAF V600E | irinotecan | colorectal cancer | Prognostic and predictive value of common mutations for treatment response and survival in patients with metastatic colorectal cancer. |
10 | BRAF | BRAF V600E | dabrafenib | non-small-cell lung carcinoma | Molecular characterization of acquired resistance to the BRAF inhibitor dabrafenib in a patient with BRAF-mutant non-small-cell lung cancer. |
11 | EGFR | EGFR G465R | Panitumumab | colorectal cancer | The First-in-class Anti-EGFR Antibody Mixture Sym004 Overcomes Cetuximab Resistance Mediated by EGFR Extracellular Domain Mutations in Colorectal Cancer. |
12 | EGFR | EGFR G465R | Cetuximab | colorectal cancer | The First-in-class Anti-EGFR Antibody Mixture Sym004 Overcomes Cetuximab Resistance Mediated by EGFR Extracellular Domain Mutations in Colorectal Cancer. |
13 | EGFR | EGFR AMPLIFICATION | osimertinib | non-small-cell lung carcinoma | Amplification of EGFR Wild-Type Alleles in Non-Small Cell Lung Cancer Cells Confers Acquired Resistance to Mutation-Selective EGFR Tyrosine Kinase Inhibitors. |
14 | EGFR | EGFR AMPLIFICATION | rociletinib | non-small-cell lung carcinoma | Amplification of EGFR Wild-Type Alleles in Non-Small Cell Lung Cancer Cells Confers Acquired Resistance to Mutation-Selective EGFR Tyrosine Kinase Inhibitors. |
15 | ERBB2 | ERBB2 AMPLIFICATION | Cetuximab | colorectal cancer | A molecularly annotated platform of patient-derived xenografts ("xenopatients") identifies HER2 as an effective therapeutic target in cetuximab-resistant colorectal cancer. |
16 | ERBB2 | ERBB2 AMPLIFICATION | Panitumumab | colorectal cancer | HER2 gene copy number status may influence clinical efficacy to anti-EGFR monoclonal antibodies in metastatic colorectal cancer patients. |
17 | ERBB2 | ERBB2 AMPLIFICATION | Cetuximab | colorectal cancer | HER2 gene copy number status may influence clinical efficacy to anti-EGFR monoclonal antibodies in metastatic colorectal cancer patients. |
18 | ERBB2 | ERBB2 AMPLIFICATION | Cetuximab / capecitabine / Oxaliplatin combination therapy | colorectal cancer | HER2 in high-risk rectal cancer patients treated in EXPERT-C, a randomized phase II trial of neoadjuvant capecitabine and oxaliplatin (CAPOX) and chemoradiotherapy (CRT) with or without cetuximab. |
19 | ERBB2 | ERBB2 AMPLIFICATION | Cetuximab | colorectal cancer | HER2 Amplification and Cetuximab Efficacy in Patients With Metastatic Colorectal Cancer Harboring Wild-type RAS and BRAF. |
20 | ERBB2 | ERBB2 AMPLIFICATION | gefitinib | adenocarcinoma of the lung | Analysis of tumor specimens at the time of acquired resistance to EGFR-TKI therapy in 155 patients with EGFR-mutant lung cancers. |
21 | ERBB2 | ERBB2 AMPLIFICATION | erlotinib | adenocarcinoma of the lung | Analysis of tumor specimens at the time of acquired resistance to EGFR-TKI therapy in 155 patients with EGFR-mutant lung cancers. |
22 | KRAS | KRAS A146T | Cetuximab | colorectal cancer | Genomic and biological characterization of exon 4 KRAS mutations in human cancer |
23 | KRAS | KRAS A146T | FOLFOX-4 / Cetuximab combination therapy | colorectal cancer | FOLFOX4 Plus Cetuximab for Patients With Previously Untreated Metastatic Colorectal Cancer According to Tumor RAS and BRAF Mutation Status: Updated Analysis of the CECOG/CORE 1.2.002 Study. |
24 | KRAS | KRAS G12A | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
25 | KRAS | KRAS G12A | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
26 | KRAS | KRAS G12A | regorafenib | colorectal cancer | KRAS exon 2 mutations influence activity of regorafenib in an SW48-based disease model of colorectal cancer. |
27 | KRAS | KRAS G12A | melphalan | multiple myeloma | Reduction of serum IGF-I levels in patients affected with Monoclonal Gammopathies of undetermined significance or Multiple Myeloma. Comparison with bFGF, VEGF and K-ras gene mutation. |
28 | KRAS | KRAS G12A | melphalan | multiple myeloma | Oncogenic RAS mutations in myeloma cells selectively induce cox-2 expression, which participates in enhanced adhesion to fibronectin and chemoresistance. |
29 | KRAS | KRAS G12A | melphalan | multiple myeloma | Activation of N-ras and K-ras induced by interleukin-6 in a myeloma cell line: implications for disease progression and therapeutic response. |
... | ... | ... | ... | ... | ... |
33 | KRAS | KRAS G12A | erlotinib | adenocarcinoma of the lung | Clinical implications of KRAS mutations in lung cancer patients treated with tyrosine kinase inhibitors: an important role for mutations in minor clones |
34 | KRAS | KRAS G12C | gefitinib | colorectal cancer | The dominant role of G12C over other KRAS mutation types in the negative prediction of efficacy of epidermal growth factor receptor tyrosine kinase inhibitors in non-small cell lung cancer. |
35 | KRAS | KRAS G12C | erlotinib | colorectal cancer | The dominant role of G12C over other KRAS mutation types in the negative prediction of efficacy of epidermal growth factor receptor tyrosine kinase inhibitors in non-small cell lung cancer. |
36 | KRAS | KRAS G12C | melphalan | multiple myeloma | Oncogenic RAS mutations in myeloma cells selectively induce cox-2 expression, which participates in enhanced adhesion to fibronectin and chemoresistance. |
37 | KRAS | KRAS G12C | melphalan | multiple myeloma | Heterogeneity in therapeutic response of genetically altered myeloma cell lines to interleukin 6, dexamethasone, doxorubicin, and melphalan. |
38 | KRAS | KRAS G12C | gefitinib | lung cancer | PTEN and PIK3CA expression is associated with prolonged survival after gefitinib treatment in EGFR-mutated lung cancer patients. |
39 | KRAS | KRAS G12D | vemurafenib | melanoma | Acquired resistance and clonal evolution in melanoma during BRAF inhibitor therapy |
40 | KRAS | KRAS G12D | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
41 | KRAS | KRAS G12D | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
42 | KRAS | KRAS G12D | vemurafenib | hairy cell leukemia | Targeting Mutant BRAF in Relapsed or Refractory Hairy-Cell Leukemia |
43 | KRAS | KRAS G12D | melphalan | multiple myeloma | Oncogenic RAS mutations in myeloma cells selectively induce cox-2 expression, which participates in enhanced adhesion to fibronectin and chemoresistance. |
44 | KRAS | KRAS G12D | melphalan | multiple myeloma | Heterogeneity in therapeutic response of genetically altered myeloma cell lines to interleukin 6, dexamethasone, doxorubicin, and melphalan. |
45 | KRAS | KRAS G12D | gefitinib | lung cancer | PTEN and PIK3CA expression is associated with prolonged survival after gefitinib treatment in EGFR-mutated lung cancer patients. |
46 | KRAS | KRAS G12R | Cetuximab | colorectal cancer | Emergence of KRAS mutations and acquired resistance to anti-EGFR therapy in colorectal cancer |
47 | KRAS | KRAS G12S | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
48 | KRAS | KRAS G12S | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
49 | KRAS | KRAS G12S | melphalan | multiple myeloma | Oncogenic RAS mutations in myeloma cells selectively induce cox-2 expression, which participates in enhanced adhesion to fibronectin and chemoresistance. |
50 | KRAS | KRAS G12S | melphalan | multiple myeloma | Activation of N-ras and K-ras induced by interleukin-6 in a myeloma cell line: implications for disease progression and therapeutic response. |
51 | KRAS | KRAS G12S | melphalan | multiple myeloma | Heterogeneity in therapeutic response of genetically altered myeloma cell lines to interleukin 6, dexamethasone, doxorubicin, and melphalan. |
52 | KRAS | KRAS G12S | gefitinib | lung cancer | PTEN and PIK3CA expression is associated with prolonged survival after gefitinib treatment in EGFR-mutated lung cancer patients. |
53 | KRAS | KRAS G12V | crizotinib | cancer | Durable Response to Crizotinib in a MET-Amplified, KRAS-Mutated Carcinoma of Unknown Primary. |
54 | KRAS | KRAS G12V | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
55 | KRAS | KRAS G12V | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
56 | KRAS | KRAS G12V | Cetuximab | colorectal cancer | Association of KRAS p.G13D mutation with outcome in patients with chemotherapy-refractory metastatic colorectal cancer treated with cetuximab. |
57 | KRAS | KRAS G12V | gefitinib | lung cancer | PTEN and PIK3CA expression is associated with prolonged survival after gefitinib treatment in EGFR-mutated lung cancer patients. |
58 | KRAS | KRAS G13D | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
59 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
60 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Phase II study of single-agent cetuximab in KRAS G13D mutant metastatic colorectal cancer. |
61 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Cetuximab treatment for metastatic colorectal cancer with KRAS p.G13D mutations improves progression-free survival. |
62 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Meta-analysis comparing the efficacy of anti-EGFR monoclonal antibody therapy between KRAS G13D and other KRAS mutant metastatic colorectal cancer tumours. |
63 rows × 5 columns
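The positive and negative queries above differ only in the Wikidata property used in the statement pattern: P3354 for positive predictions and P3355 for negative ones. As a sketch of how the duplication could be avoided, the query text can be kept in a single template and the property filled in on demand. The names `PREDICTOR_PROPERTY`, `QUERY_TEMPLATE`, and `build_predictor_query` are hypothetical helpers, not part of the Semalytics code, and the property descriptions in the comments reflect our reading of Wikidata.

```python
from string import Template

# Wikidata properties that distinguish the two queries.
PREDICTOR_PROPERTY = {
    "positive": "P3354",  # positive therapeutic predictor
    "negative": "P3355",  # negative therapeutic predictor
}

# Same query text as in the cells above; only $prop varies.
QUERY_TEMPLATE = Template("""
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?geneSymbol ?variantLabel ?treatmentLabel ?diseaseLabel ?referenceLabel
where {
  SERVICE <https://query.wikidata.org/sparql> {
    ?variant wdt:P3329 ?id .
    ?variant p:$prop [ ps:$prop ?treatment ;
                       pq:P2175 ?disease ;
                       prov:wasDerivedFrom ?source ] .
    ?source pr:P248 ?reference
    SERVICE wikibase:label {
      bd:serviceParam wikibase:language "en" .
      ?variant rdfs:label ?variantLabel .
      ?treatment rdfs:label ?treatmentLabel .
      ?disease rdfs:label ?diseaseLabel .
      ?reference rdfs:label ?referenceLabel
    }
  }
  ?case a :Case ;
        :hasDescendant ?mouse ;
        :hasDescendant ?node .
  ?mouse a :Biomouse ;
         :has_annotation ?ann .
  ?ann :has_reference ?ref .
  ?ref a :drug_response .
  ?node :has_annotation ?ann2 .
  ?ann2 :has_reference ?variant .
  ?gene :has_variant ?variant .
  ?gene :symbol ?geneSymbol
  VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
  ?variant a ?annotation_Type .
  VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
order by ?geneSymbol
""")


def build_predictor_query(kind):
    """Return the federated query for 'positive' or 'negative' predictions."""
    return QUERY_TEMPLATE.substitute(prop=PREDICTOR_PROPERTY[kind])


negative_query = build_predictor_query("negative")
print(negative_query)
```

In the notebook, the generated string could then be passed to the existing call, for example `utils.query(SEMALYTICS_ENDPOINT, build_predictor_query("negative"))`.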
We can use the connection to Wikidata to query the evidence for drug response predictions (positive or negative) associated with the variants harbored by a specific case (here, id=CRC0481).
Positive response predictions
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?geneSymbol ?variantLabel ?treatmentLabel ?diseaseLabel ?referenceLabel
where {
SERVICE <https://query.wikidata.org/sparql> {
?variant wdt:P3329 ?id .
?variant p:P3354 [ ps:P3354 ?treatment ;
pq:P2175 ?disease ;
prov:wasDerivedFrom ?source ].
?source pr:P248 ?reference
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
?variant rdfs:label ?variantLabel .
?treatment rdfs:label ?treatmentLabel .
?disease rdfs:label ?diseaseLabel .
?reference rdfs:label ?referenceLabel .
}
}
:CRC0481 :hasDescendant ?node .
?node :has_annotation ?ann2 .
?ann2 :has_reference ?variant .
?gene :has_variant ?variant.
OPTIONAL {?variant :alt_p ?alt_p }
?gene :symbol ?geneSymbol
VALUES ?geneSymbol {'KRAS' 'EGFR' 'BRAF' 'ERBB2'}
?variant a ?annotation_Type.
VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
order by ?geneSymbol
"""
# run the federated query against the Semalytics endpoint
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# display the selected result columns
result_table[['geneSymbol.value', 'variantLabel.value', 'treatmentLabel.value', 'diseaseLabel.value', 'referenceLabel.value']]
geneSymbol.value | variantLabel.value | treatmentLabel.value | diseaseLabel.value | referenceLabel.value | |
---|---|---|---|---|---|
0 | EGFR | EGFR AMPLIFICATION | Cetuximab / platinum / fluorouracil combination therapy | head and neck squamous cell carcinoma | Evaluation of EGFR gene copy number as a predictive biomarker for the efficacy of cetuximab in combination with chemotherapy in the first-line treatment of recurrent and/or metastatic squamous cell carcinoma of the head and neck: EXTREME study. |
1 | EGFR | EGFR AMPLIFICATION | Cetuximab | colorectal cancer | Clinical usefulness of EGFR gene copy number as a predictive marker in colorectal cancer patients treated with cetuximab: a fluorescent in situ hybridization study. |
2 | EGFR | EGFR AMPLIFICATION | Panitumumab | colorectal cancer | EGFR gene copy number as a predictive biomarker for resistance to anti-EGFR monoclonal antibodies in metastatic colorectal cancer treatment: a meta-analysis. |
3 | EGFR | EGFR AMPLIFICATION | Cetuximab | colorectal cancer | EGFR gene copy number as a predictive biomarker for resistance to anti-EGFR monoclonal antibodies in metastatic colorectal cancer treatment: a meta-analysis. |
4 | EGFR | EGFR AMPLIFICATION | EGFR inhibitor | non-small-cell lung carcinoma | EGFR gene copy number as a predictive biomarker for patients receiving tyrosine kinase inhibitor treatment: a systematic review and meta-analysis in non-small-cell lung cancer. |
5 | EGFR | EGFR AMPLIFICATION | gefitinib | non-small-cell lung carcinoma | Epidermal Growth Factor Receptor Gene Amplification in Patients with Advanced-stage NSCLC. |
6 | EGFR | EGFR AMPLIFICATION | erlotinib | non-small-cell lung carcinoma | Epidermal Growth Factor Receptor Gene Amplification in Patients with Advanced-stage NSCLC. |
7 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Association of KRAS p.G13D mutation with outcome in patients with chemotherapy-refractory metastatic colorectal cancer treated with cetuximab. |
8 | KRAS | KRAS G13D | selumetinib / dactolisib combination therapy | colorectal cancer | Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas. |
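The federated query above selects Wikidata statements through the property P3354, and the query in the next section uses P3355 with an otherwise identical pattern. As a minimal sketch, the shared Wikidata fragment could be generated from a template instead of being copied. The names `PREDICTOR_PATTERN` and `build_predictor_pattern` are hypothetical and are not part of Semalytics or of the `utils` module; only the Wikidata part of the query is shown.

```python
from string import Template

# Hypothetical template for the Wikidata fragment of the query.
# $prop is the only placeholder; the SPARQL variables use '?' and are left untouched.
PREDICTOR_PATTERN = Template("""
SERVICE <https://query.wikidata.org/sparql> {
    ?variant wdt:P3329 ?id .
    ?variant p:$prop [ ps:$prop ?treatment ;
                       pq:P2175 ?disease ;
                       prov:wasDerivedFrom ?source ] .
    ?source pr:P248 ?reference .
}
""")


def build_predictor_pattern(prop: str) -> str:
    """Return the graph pattern for one Wikidata statement property, e.g. 'P3354'."""
    # Reject anything that does not look like a property ID before substituting it.
    if not (prop.startswith("P") and prop[1:].isdigit()):
        raise ValueError(f"not a Wikidata property ID: {prop!r}")
    return PREDICTOR_PATTERN.substitute(prop=prop)


positive_pattern = build_predictor_pattern("P3354")
negative_pattern = build_predictor_pattern("P3355")
print(negative_pattern)
```

Validating the property ID before substitution keeps arbitrary text from being spliced into the query string.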
Negative responses predictions
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?geneSymbol ?variantLabel ?treatmentLabel ?diseaseLabel ?referenceLabel
where {
    SERVICE <https://query.wikidata.org/sparql> {
        ?variant wdt:P3329 ?id .
        ?variant p:P3355 [ ps:P3355 ?treatment ;
                           pq:P2175 ?disease ;
                           prov:wasDerivedFrom ?source ] .
        ?source pr:P248 ?reference .
        SERVICE wikibase:label {
            bd:serviceParam wikibase:language "en" .
            ?variant rdfs:label ?variantLabel .
            ?treatment rdfs:label ?treatmentLabel .
            ?disease rdfs:label ?diseaseLabel .
            ?reference rdfs:label ?referenceLabel .
        }
    }
    :CRC0481 :hasDescendant ?node .
    ?node :has_annotation ?ann2 .
    ?ann2 :has_reference ?variant .
    ?gene :has_variant ?variant .
    OPTIONAL { ?variant :alt_p ?alt_p }
    ?gene :symbol ?geneSymbol .
    VALUES ?geneSymbol { 'KRAS' 'EGFR' 'BRAF' 'ERBB2' }
    ?variant a ?annotation_Type .
    VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
}
order by ?geneSymbol
"""
# run the federated query against the Semalytics endpoint
result_table = utils.query(SEMALYTICS_ENDPOINT, my_query)
# show only the value columns of the selected variables
result_table[['geneSymbol.value', 'variantLabel.value', 'treatmentLabel.value', 'diseaseLabel.value', 'referenceLabel.value']]
| | geneSymbol.value | variantLabel.value | treatmentLabel.value | diseaseLabel.value | referenceLabel.value |
---|---|---|---|---|---|
0 | EGFR | EGFR AMPLIFICATION | osimertinib | non-small-cell lung carcinoma | Amplification of EGFR Wild-Type Alleles in Non-Small Cell Lung Cancer Cells Confers Acquired Resistance to Mutation-Selective EGFR Tyrosine Kinase Inhibitors. |
1 | EGFR | EGFR AMPLIFICATION | rociletinib | non-small-cell lung carcinoma | Amplification of EGFR Wild-Type Alleles in Non-Small Cell Lung Cancer Cells Confers Acquired Resistance to Mutation-Selective EGFR Tyrosine Kinase Inhibitors. |
2 | KRAS | KRAS G13D | Panitumumab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
3 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | PIK3CA mutations in colorectal cancer are associated with clinical resistance to EGFR-targeted monoclonal antibodies. |
4 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Phase II study of single-agent cetuximab in KRAS G13D mutant metastatic colorectal cancer. |
5 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Cetuximab treatment for metastatic colorectal cancer with KRAS p.G13D mutations improves progression-free survival. |
6 | KRAS | KRAS G13D | Cetuximab | colorectal cancer | Meta-analysis comparing the efficacy of anti-EGFR monoclonal antibody therapy between KRAS G13D and other KRAS mutant metastatic colorectal cancer tumours. |
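Comparing the positive and negative result tables shows that the same variant, treatment and disease can be backed by both kinds of evidence. A minimal sketch of that cross-check with pandas follows. Because `result_table` is overwritten by the second query, the sketch assumes each result was kept in its own DataFrame (the names `positive` and `negative` are hypothetical) and rebuilds a few rows from the two tables by hand.

```python
import pandas as pd

# Columns that identify one prediction, regardless of the supporting reference.
key = ['variantLabel.value', 'treatmentLabel.value', 'diseaseLabel.value']

# A few rows copied from the positive predictions table (references omitted).
positive = pd.DataFrame([
    ('EGFR AMPLIFICATION', 'Cetuximab', 'colorectal cancer'),
    ('EGFR AMPLIFICATION', 'Panitumumab', 'colorectal cancer'),
    ('KRAS G13D', 'Cetuximab', 'colorectal cancer'),
    ('KRAS G13D', 'selumetinib / dactolisib combination therapy', 'colorectal cancer'),
], columns=key)

# A few rows copied from the negative predictions table (references omitted).
negative = pd.DataFrame([
    ('EGFR AMPLIFICATION', 'osimertinib', 'non-small-cell lung carcinoma'),
    ('KRAS G13D', 'Panitumumab', 'colorectal cancer'),
    ('KRAS G13D', 'Cetuximab', 'colorectal cancer'),
    ('KRAS G13D', 'Cetuximab', 'colorectal cancer'),  # same prediction, different reference
], columns=key)

# An inner join on the key keeps only predictions reported in both tables.
conflicts = positive.drop_duplicates().merge(negative.drop_duplicates(), on=key)
print(conflicts)
```

With these rows, the only prediction found in both tables is KRAS G13D with Cetuximab in colorectal cancer, so the references for that pair are the ones worth reading side by side.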
In this final section we list links between figures in the paper and queries or computations in this notebook.
The pie chart on the left (i.e., response fractions in trees with no variants) can be obtained with the following query:
# response fractions in trees with no variants
my_query = """
PREFIX : <http://las.ircc.it/ontology/annotationplatform#>
PREFIX onto: <http://www.ontotext.com/>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select (count(distinct ?case) as ?cases) ?type
from onto:disable-sameAs
where {
    ?case a :Case ;
          :hasDescendant ?mouse .
    ?mouse a :Biomouse ;
           :has_annotation ?ann .
    ?ann :has_reference ?ref .
    ?ref sesame:directType ?type .
    filter not exists {
        ?case :hasDescendant ?node .
        ?node :has_annotation ?ann2 .
        ?ann2 :has_reference ?ref2 .
        ?gene :has_variant ?ref2 .
        ?gene :symbol ?geneSymbol .
        VALUES ?geneSymbol { 'KRAS' 'EGFR' 'BRAF' 'ERBB2' }
        ?ref2 a ?annotation_Type .
        VALUES ?annotation_Type { :sequence_alteration :feature_amplification }
    }
}
group by ?type
"""
# get data
result_table_no_var = utils.query(SEMALYTICS_ENDPOINT, my_query)
result_table_no_var[['type.value','cases.value']]
| | type.value | cases.value |
---|---|---|
0 | http://las.ircc.it/ontology/annotationplatform#DRCl_PD | 38 |
1 | http://las.ircc.it/ontology/annotationplatform#DRCl_SD | 58 |
2 | http://las.ircc.it/ontology/annotationplatform#DRCl_OR | 29 |
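The implementation of `utils.query` is not shown in this notebook. Column names such as `type.value` and `cases.value` are consistent with flattening the standard SPARQL JSON results format, in which every binding carries a `type` and a `value`. The sketch below only illustrates that flattening under this assumption; `bindings_to_table` is a hypothetical helper, the sample response is built by hand from the table above, and the `datatype` entries are assumed.

```python
import pandas as pd

# Hand-written sample in the SPARQL 1.1 JSON results layout (no endpoint is contacted).
sparql_json = {
    "head": {"vars": ["cases", "type"]},
    "results": {"bindings": [
        {"type": {"type": "uri",
                  "value": "http://las.ircc.it/ontology/annotationplatform#DRCl_PD"},
         "cases": {"type": "literal",
                   "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                   "value": "38"}},
        {"type": {"type": "uri",
                  "value": "http://las.ircc.it/ontology/annotationplatform#DRCl_SD"},
         "cases": {"type": "literal",
                   "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                   "value": "58"}},
    ]},
}


def bindings_to_table(response: dict) -> pd.DataFrame:
    """Flatten SPARQL JSON bindings into columns named '<variable>.<field>'."""
    return pd.json_normalize(response["results"]["bindings"])


table = bindings_to_table(sparql_json)
print(table[['type.value', 'cases.value']])
```

In this layout literal values arrive as strings, which would explain why the counts are converted with `int` before plotting.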
# extract the case count of each response class (OR, SD, PD) as a plain integer
results = [
int(result_table_no_var[result_table_no_var['type.value'].str.contains('_OR')]['cases.value'].iloc[0]),
int(result_table_no_var[result_table_no_var['type.value'].str.contains('_SD')]['cases.value'].iloc[0]),
int(result_table_no_var[result_table_no_var['type.value'].str.contains('_PD')]['cases.value'].iloc[0])
]
d = {'response_type': ['response', 'neutral', 'progression'], 'cases': results}
df = pd.DataFrame(data=d)
chart = df.plot.pie(y='cases',
                    rot=0,
                    # labels=df['response_type'],  # uncomment to label the slices
                    labels=None,
                    legend=False,
                    figsize=(5, 5),
                    colors=utils.response_colors(df['response_type']),
                    title='No variants'  # title
                    ).set_ylabel('')
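The fractions drawn by the pie chart can also be computed numerically. The following sketch rebuilds the query result by hand from the table above and assumes that `cases.value` holds strings; the column names `response`, `cases` and `fraction` are introduced here for illustration only.

```python
import pandas as pd

# Counts copied from the query result above.
prefix = 'http://las.ircc.it/ontology/annotationplatform#'
counts = pd.DataFrame({
    'type.value': [prefix + 'DRCl_PD', prefix + 'DRCl_SD', prefix + 'DRCl_OR'],
    'cases.value': ['38', '58', '29'],
})

# Keep only the response code after the last underscore (PD, SD, OR).
counts['response'] = counts['type.value'].str.rsplit('_', n=1).str[-1]
counts['cases'] = counts['cases.value'].astype(int)

# Fraction of cases per response class among trees with no variants.
counts['fraction'] = counts['cases'] / counts['cases'].sum()
fractions = counts.set_index('response')['fraction']
print(fractions.round(3))
```

With the 125 cases listed above, the fractions are 0.304 for PD, 0.464 for SD and 0.232 for OR.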
The data matrix in Figure 5a contains several charts.
Row Mutations only: the three charts of the first row are computed in the cells of section Matching sequence_alteration only
Row Amplifications only: the charts of this row are computed in the cells of section Matching feature_amplification only
Row All cases: these charts are generated in the cells of section Matching annotations in the investigation set
This figure shows the distribution of variants detected in cases that did not respond to Cetuximab. In particular, data coming from the query presented in section Variants of non-responders are sliced and diced in a pivot table to present distributions about: