#!/usr/bin/env python
# coding: utf-8

# # Reading ISA-Tab from files and Validating ISA-Tab files

# ## Abstract:
#
# The aim of this notebook is to:
# - show the essential functions to read and load an ISA-Tab file into memory.
# - navigate key objects and pull key attributes.
# - learn how to invoke the ISA-Tab validation function.
# - interpret the output of the validation report.
#

# ## 1. Getting the tools

# In[ ]:


# If executing the notebooks on `Google Colab`, uncomment the following command
# and run it to install the required Python libraries. Also, make the test datasets available.

# !pip install -r requirements.txt


# In[ ]:


import isatools
import os
import sys
from isatools import isatab


# ## 2. Reading and loading an ISA Investigation in memory from an ISA-Tab instance

# In[ ]:


with open(os.path.join('./BII-S-3', 'i_gilbert.txt')) as fp:
    ISA = isatab.load(fp)


# ### Let's check the description of the first study object present in an ISA Investigation object

# In[ ]:


ISA.studies[0].description


# ### Let's check the protocols declared in the ISA study (using a Python list comprehension):

# In[ ]:


[protocol.description for protocol in ISA.studies[0].protocols]


# ### Let's now check which ISA Assay Measurement and Technology Types are used in this ISA Study object

# In[ ]:


[f'{assay.measurement_type.term} using {assay.technology_type.term}' for assay in ISA.studies[0].assays]


# ### Let's now check the `ISA Study Source` Material:

# In[ ]:


[source.name for source in ISA.studies[0].sources]


# #### Let's check what the first `ISA Study Source property` is:

# In[ ]:


# here, we get all the characteristics of the first Source object
first_source_characteristics = ISA.studies[0].sources[0].characteristics


# In[ ]:


first_source_characteristics[0].category.term


# #### Let's now check the `value` associated with that first `ISA Study Source property`:

# In[ ]:


first_source_characteristics[0].value.term


# #### Let's now check all the properties associated with this first `ISA Study Source`

# In[ ]:


[char.category.term for char in first_source_characteristics]


# #### And the corresponding values are:

# In[ ]:


[char.value for char in first_source_characteristics]


# ## 3. Invoking the Python ISA-Tab Validator

# In[ ]:


# a context manager closes the file handle once validation has finished
with open(os.path.join('./BII-I-1/', 'i_investigation.txt')) as fp:
    my_json_report_bii_i_1 = isatab.validate(fp)


# In[ ]:


with open(os.path.join('./BII-S-3/', 'i_gilbert.txt')) as fp:
    my_json_report_bii_s_3 = isatab.validate(fp)


# In[ ]:


with open(os.path.join('./BII-S-4/', 'i_investigation.txt')) as fp:
    my_json_report_bii_s_4 = isatab.validate(fp)


# In[ ]:


with open(os.path.join('./BII-S-7/', 'i_matteo.txt')) as fp:
    my_json_report_bii_s_7 = isatab.validate(fp)


# In[ ]:


my_json_report_bii_s_7


# - This `Validation Report` shows that no error has been logged.
# - The rest of the report consists of warnings meant to draw the curator's attention to elements that may be provided but that do not break the ISA syntax.
# - Notice the `study group` information reported on both study and assay files. If ISA `Factor Value[]` fields are present in the `ISA Study` or `ISA Assay` tables, the validator will try to identify the set of unique `Factor Value` combinations defining a `Study Group`.
# - When no `Factor Value` is found in an ISA `Study` or `Assay` table, the value is left at its default, -1, which means that no `Study Group` has been found.
# - ISA **strongly** encourages declaring Study Groups using ISA Factor Values to unambiguously identify the Independent Variables of an experiment.
#

# ## 4. What does a validation failure look like?

# ### BII-S-5 contains an error located in the `i_investigation.txt` file of the submission

# In[ ]:


with open(os.path.join('./BII-S-5/', 'i_investigation.txt')) as fp:
    my_json_report_bii_s_5 = isatab.validate(fp)


# In[ ]:


my_json_report_bii_s_5["errors"]


# - The Validator reports that the Error Array is not empty and shows the root cause of the syntactic validation error.
# - There is a typo in the Investigation file which affects 2 positions in the file, for both the Investigation and Study objects:
# Publication **l**ist vs. Publication **L**ist

# ## About this notebook
#
# - authors: philippe.rocca-serra@oerc.ox.ac.uk, massimiliano.izzo@oerc.ox.ac.uk
# - license: CC-BY 4.0
# - support: isatools@googlegroups.com
# - issue tracker: https://github.com/ISA-tools/isa-api/issues
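
# ## Appendix: summarising a validation report
#
# The validation reports built in sections 3 and 4 are indexed like dictionaries, and section 4 reads the `errors` entry directly.
# The cell below is a minimal sketch of how such a report could be condensed into a pass/fail summary.
# `summarise_report` is a hypothetical helper and not part of `isatools`. The `warnings` key and the `message` field of each entry
# are assumptions about the report layout, and `sample_report` is hand-written for illustration rather than real validator output.

# In[ ]:

```python
def summarise_report(report):
    """Condense a validation report dictionary into a small summary.

    Assumes (not confirmed by this notebook) that the report has
    'errors' and 'warnings' lists whose entries are dictionaries
    carrying a 'message' field.
    """
    errors = report.get("errors", [])
    warnings = report.get("warnings", [])
    return {
        # a report with an empty error list is treated as valid
        "is_valid": len(errors) == 0,
        "error_count": len(errors),
        "warning_count": len(warnings),
        # de-duplicated, sorted messages make repeated errors easier to read
        "error_messages": sorted({entry.get("message", "") for entry in errors}),
    }


# Hand-written, hypothetical report used only to exercise the helper.
sample_report = {
    "errors": [
        {"message": "hypothetical error: unexpected section label"},
        {"message": "hypothetical error: unexpected section label"},
    ],
    "warnings": [
        {"message": "hypothetical warning: optional field missing"},
    ],
}

summary = summarise_report(sample_report)
print(summary)
```

# If the assumptions above hold, the same helper could be applied to a real report, for example `summarise_report(my_json_report_bii_s_5)`.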