#!/usr/bin/env python
# coding: utf-8
# # Reading ISA-Tab from files and Validating ISA-Tab files
# ## Abstract:
#
# The aim of this notebook is to:
# - show the essential functions to read and load an ISA-Tab file into memory.
# - navigate key objects and pull key attributes.
# - learn how to invoke the ISA-tab validation function.
# - interpret the output of the validation report.
#
# ## 1. Getting the tools
# In[ ]:
# If executing the notebooks on `Google Colab`, uncomment the following command
# and run it to install the required Python libraries. Also, make the test datasets available.
# !pip install -r requirements.txt
# In[ ]:
import isatools
import os
import sys
from isatools import isatab
# ## 2. Reading and loading an ISA Investigation in memory from an ISA-Tab instance
# In[ ]:
with open(os.path.join('./BII-S-3', 'i_gilbert.txt')) as fp:
    ISA = isatab.load(fp)
# ### Let's check the description of the first study object present in the ISA Investigation object
# In[ ]:
ISA.studies[0].description
# ### Let's check the protocols declared in the ISA study (using a Python list comprehension):
# In[ ]:
[protocol.description for protocol in ISA.studies[0].protocols]
# ### Let's now check which ISA Assay Measurement and Technology Types are used in this ISA Study object
# In[ ]:
[f'{assay.measurement_type.term} using {assay.technology_type.term}' for assay in ISA.studies[0].assays]
# ### Let's now check the `ISA Study Source` Material:
# In[ ]:
[source.name for source in ISA.studies[0].sources]
# #### Let's check what the first `ISA Study Source property` is:
# In[ ]:
# here, we get all the characteristics of the first Source object
first_source_characteristics = ISA.studies[0].sources[0].characteristics
# In[ ]:
first_source_characteristics[0].category.term
# #### Let's now check the `value` associated with that first `ISA Study Source property`:
# In[ ]:
first_source_characteristics[0].value.term
# #### Let's now check all the properties associated with this first `ISA Study Source`:
# In[ ]:
[char.category.term for char in first_source_characteristics]
# #### And the corresponding values are:
# In[ ]:
[char.value for char in first_source_characteristics]
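# The two lists above can be paired into a single mapping. The cell below is a minimal sketch and not part of the ISA-API:
# `characteristics_as_dict` is a hypothetical helper. It assumes that each characteristic exposes `category.term`
# and a `value` that is either a plain value or an object carrying a `term` attribute.
# The `SimpleNamespace` objects are made-up stand-ins so that the cell runs without the dataset.
# In[ ]:
from types import SimpleNamespace


def characteristics_as_dict(characteristics):
    """Map each characteristic's category term to a printable value."""
    result = {}
    for char in characteristics:
        # Use the `term` of an annotation-like value, otherwise keep the raw value.
        result[char.category.term] = getattr(char.value, 'term', char.value)
    return result


# Stand-in objects mimicking the attributes used above (illustrative values only).
example_characteristics = [
    SimpleNamespace(category=SimpleNamespace(term='organism'),
                    value=SimpleNamespace(term='example organism')),
    SimpleNamespace(category=SimpleNamespace(term='depth'), value=5),
]
characteristics_as_dict(example_characteristics)
# With the loaded investigation, the call would be: characteristics_as_dict(first_source_characteristics)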
# ## 3. Invoking the python ISA-Tab Validator
# In[ ]:
# Open each investigation file in a context manager so the handle is closed after validation.
with open(os.path.join('./BII-I-1/', 'i_investigation.txt')) as fp:
    my_json_report_bii_i_1 = isatab.validate(fp)
# In[ ]:
with open(os.path.join('./BII-S-3/', 'i_gilbert.txt')) as fp:
    my_json_report_bii_s_3 = isatab.validate(fp)
# In[ ]:
with open(os.path.join('./BII-S-4/', 'i_investigation.txt')) as fp:
    my_json_report_bii_s_4 = isatab.validate(fp)
# In[ ]:
with open(os.path.join('./BII-S-7/', 'i_matteo.txt')) as fp:
    my_json_report_bii_s_7 = isatab.validate(fp)
# In[ ]:
my_json_report_bii_s_7
# - This `Validation Report` shows that no error has been logged.
# - The rest of the report consists of warnings meant to draw the curator's attention to elements that could be provided but whose absence does not break the ISA syntax.
# - Notice the `study group` information reported for both study and assay files. If ISA `Factor Value[]` fields are present in the `ISA Study` or `ISA Assay` tables, the validator tries to identify the set of unique `Factor Value` combinations, each of which defines a `Study Group`.
# - When no `Factor Value` is found in an ISA `Study` or `Assay` table, the value is left at its default, -1, which means that no `Study Group` has been found.
# - ISA **strongly** encourages declaring Study Groups using ISA Factor Values to unambiguously identify the independent variables of an experiment.
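#
# The study group idea described above can be sketched as follows. This is not the validator's own implementation, only an illustration:
# `count_study_groups` is a hypothetical helper, and representing table rows as dictionaries keyed by column header is an assumption.
# It counts the unique `Factor Value[...]` combinations and falls back to -1 when no such column exists.
# In[ ]:
def count_study_groups(rows):
    """Count unique Factor Value combinations; return -1 when no factor column is present."""
    factor_columns = sorted(
        {key for row in rows for key in row if key.startswith('Factor Value[')}
    )
    if not factor_columns:
        # Mirrors the default described above: no Factor Value, no study group.
        return -1
    combinations = {tuple(row.get(column) for column in factor_columns) for row in rows}
    return len(combinations)


# Made-up rows for illustration: two samples share one combination, a third differs.
example_rows = [
    {'Sample Name': 's1', 'Factor Value[dose]': 'low', 'Factor Value[time]': '1h'},
    {'Sample Name': 's2', 'Factor Value[dose]': 'low', 'Factor Value[time]': '1h'},
    {'Sample Name': 's3', 'Factor Value[dose]': 'high', 'Factor Value[time]': '1h'},
]
count_study_groups(example_rows)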
#
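# A report can also be checked programmatically instead of being read in full. The cell below is a minimal sketch:
# `summarise_report` is a hypothetical helper, the `"warnings"` key is an assumption (this notebook only accesses `"errors"`),
# and `example_report` is a made-up stand-in so that the cell runs without the dataset.
# In[ ]:
def summarise_report(report):
    """Return error and warning counts, plus a flag telling whether no error was logged."""
    error_entries = report.get('errors', [])
    warning_entries = report.get('warnings', [])  # assumed key name
    return {
        'errors': len(error_entries),
        'warnings': len(warning_entries),
        'is_valid': not error_entries,
    }


# Made-up report with no error and a single warning.
example_report = {'errors': [], 'warnings': [{'message': 'example warning'}]}
summarise_report(example_report)
# With a real report, the call would be: summarise_report(my_json_report_bii_s_7)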
# ## 4. What does a validation failure look like?
# ### BII-S-5 contains an error located in the `i_investigation.txt` file of the submission
# In[ ]:
with open(os.path.join('./BII-S-5/', 'i_investigation.txt')) as fp:
    my_json_report_bii_s_5 = isatab.validate(fp)
# In[ ]:
my_json_report_bii_s_5["errors"]
# - The validator reports that the error array is not empty and shows the root cause of the syntactic validation error.
# - There is a typo in the Investigation file which affects 2 positions in the file, for both the Investigation and Study objects:
# Publication **l**ist vs. Publication **L**ist
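#
# A typo of this kind, where a label differs from the expected one only by letter case, can be spotted with a simple scan.
# The cell below is a minimal sketch and not the ISA-API validator: `find_label_case_mismatches` is a hypothetical helper,
# the two labels in `EXPECTED_LABELS` are assumed to be the affected ones, and `example_lines` is made-up tab-separated content.
# In[ ]:
EXPECTED_LABELS = (
    'Investigation Publication Author List',
    'Study Publication Author List',
)


def find_label_case_mismatches(lines, expected_labels=EXPECTED_LABELS):
    """Return (line_number, found, expected) for labels that differ only by letter case."""
    by_lowercase = {label.lower(): label for label in expected_labels}
    mismatches = []
    for line_number, line in enumerate(lines, start=1):
        # The label is the first tab-separated field of the line.
        label = line.rstrip('\n').split('\t', 1)[0].strip()
        expected = by_lowercase.get(label.lower())
        if expected is not None and label != expected:
            mismatches.append((line_number, label, expected))
    return mismatches


# Made-up lines reproducing the lowercase "list" typo at two positions.
example_lines = [
    'Investigation Publication Title\texample title\n',
    'Investigation Publication Author list\texample author\n',
    'Study Publication Author list\texample author\n',
]
find_label_case_mismatches(example_lines)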
# ## About this notebook
#
# - authors: philippe.rocca-serra@oerc.ox.ac.uk, massimiliano.izzo@oerc.ox.ac.uk
# - license: CC-BY 4.0
# - support: isatools@googlegroups.com
# - issue tracker: https://github.com/ISA-tools/isa-api/issues