Datamart REST API
This Jupyter notebook requires at least Python 3.6 (the code cells use f-strings) with these packages installed:
pip install notebook
pip install requests
pip install pandas
To run, change to the directory containing this notebook and type
jupyter notebook
Then, open this page in the web browser: http://localhost:8888/notebooks/Datamart%20Data%20API%20Demo.ipynb
## set datamart api url
# The datamart server running at ISI
# datamart_api_url = 'https://datamart:datamart-api-789@dsbox02.isi.edu:8888/datamart-api'
# Datamart server running on localhost
# datamart_api_url = 'http://localhost:14080'
# Datamart server running on localhost in development mode
datamart_api_url = 'http://localhost:5000'
from requests import get,post,put,delete
import json
import pandas as pd
from io import StringIO
from IPython.display import display, HTML
GET /metadata/datasets
response = get(f'{datamart_api_url}/metadata/datasets')
print(json.dumps(response.json(), indent=2))
[ { "name": "FSI dataset", "description": "data downloaded from FSI", "url": "https://fragilestatesindex.org", "dataset_id": "FSI" }, { "name": "OECD dataset", "description": "data downloaded from OECD", "url": "https://data.oecd.org", "dataset_id": "OECD" }, { "name": "UAZ Indicators", "description": "Collection of indicators, including indicators from FAO, WDI, FEWSNET, CLiMIS, UNICEF, ieconomics.com, UNHCR, DSSAT, WHO, IMF, WHP, ACLDE, World Bank and IOM-DTM", "url": "https://github.com/ml4ai/delphi", "dataset_id": "UAZ" }, { "name": "WGI dataset", "description": "Worldwide Governance Indicators", "url": "https://databank.worldbank.org/source/worldwide-governance-indicators", "dataset_id": "WGI" }, { "name": "WDI dataset", "description": "World Development Indicators", "url": "https://databank.worldbank.org/source/world-development-indicators", "dataset_id": "WDI" }, { "name": "Corruption Perceptions Index", "description": "Transparency International Corruption Perceptions Index The CPI scores and ranks countries/territories based on how corrupt a country\u2019s public sector is perceived to be by experts and business executives. It is a composite index, a combination of 13 surveys and assessments of corruption, collected by a variety of reputable institutions. 
The CPI is the most widely used indicator of corruption worldwide.", "url": "https://www.transparency.org/", "dataset_id": "TICPI" }, { "name": "SIPRI Military Expenditure", "description": "Military expenditure by country, in millions of US$ at current prices and exchange rates, 1949-2018 - SIPRI 2019", "url": "https://sipri.org/databases/milex", "dataset_id": "SIPRI" }, { "name": "economic fitness dataset", "description": "EconomicFitness", "url": "https://databank.banquemondiale.org/source/economic-fitness", "dataset_id": "EconomicFitness" }, { "name": "Agricultural Market Information System (AMIS)", "description": "The Agricultural Market Information System (AMIS) is an inter-agency platform to enhance food market transparency and policy response for food security. It was launched in 2011 by the G20 Ministers of Agriculture following the global food price hikes in 2007/08 and 2010. Bringing together the principal trading countries of agricultural commodities, AMIS assesses global food supplies (focusing on wheat, maize, rice and soybeans) and provides a platform to coordinate policy action in times of market uncertainty.", "url": "http://www.amis-outlook.org", "dataset_id": "AMIS" }, { "name": "test test test", "description": "testy test", "url": "https://test.com", "dataset_id": "TEST000" }, { "name": "World Press Freedom Index", "description": "Published every year since 2002 by Reporters Without Borders (RSF), the World Press Freedom Index is an important advocacy tool based on the principle of emulation between states. The Index ranks 180 countries and regions\u00a0according to the level of freedom available to journalists.", "url": "https://rsf.org/en", "dataset_id": "WPFI" }, { "name": "Poverty Rate Global DP", "description": "Poverty Rate Global DP", "url": "http://url", "dataset_id": "DPPoverty" } ]
As of June 25, 2020, the listing above shows 12 datasets in the database, including the placeholder entry TEST000. More datasets will be added as they are processed.
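When a specific dataset needs to be looked up in this listing, it can help to index the records by their dataset_id. The following is a minimal sketch, assuming the response has the shape shown above (a list of objects with name, description, url and dataset_id). The helper name `index_by_dataset_id` is illustrative and not part of the Datamart API, and the two sample records are copied from the output above so that the cell runs without a server.

```python
def index_by_dataset_id(datasets):
    """Map each dataset_id to its metadata record."""
    return {record["dataset_id"]: record for record in datasets}

# Two records copied from the response above, used instead of a live call.
# In the notebook this would be: response.json()
sample_datasets = [
    {"name": "WDI dataset", "description": "World Development Indicators",
     "url": "https://databank.worldbank.org/source/world-development-indicators",
     "dataset_id": "WDI"},
    {"name": "FSI dataset", "description": "data downloaded from FSI",
     "url": "https://fragilestatesindex.org", "dataset_id": "FSI"},
]

datasets_by_id = index_by_dataset_id(sample_datasets)
print(sorted(datasets_by_id))
print(datasets_by_id["WDI"]["description"])
```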
We can also get metadata about a single dataset using its dataset_id.
GET /metadata/datasets/{dataset_id}
response = get(f'{datamart_api_url}/metadata/datasets/WDI')
print(json.dumps(response.json(), indent=2))
[ { "name": "WDI dataset", "description": "World Development Indicators", "url": "https://databank.worldbank.org/source/world-development-indicators", "dataset_id": "WDI" } ]
GET /metadata/datasets/{dataset_id}/variables
response = get(f'{datamart_api_url}/metadata/datasets/WDI/variables')
print(json.dumps(response.json()[:4], indent=2)) # printing only 4
[ { "name": "_2005 PPP conversion factor, GDP (LCU per international $)", "variable_id": "_2005_ppp_conversion_factor_gdp_lcu_per_international", "dataset_id": "WDI" }, { "name": "_2005 PPP conversion factor, private consumption (LCU per international $)", "variable_id": "_2005_ppp_conversion_factor_private_consumption_lcu_per_international", "dataset_id": "WDI" }, { "name": "Access to clean fuels and technologies for cooking (% of population)", "variable_id": "access_to_clean_fuels_and_technologies_for_cooking_of_population", "dataset_id": "WDI" }, { "name": "Access to electricity (% of population)", "variable_id": "access_to_electricity_of_population", "dataset_id": "WDI" } ]
print('Total number of variables in dataset: {} is {}'.format('WDI', len(response.json())))
Total number of variables in dataset: WDI is 1429
GET /metadata/datasets/{dataset_id}/variables/{variable_id}
response = get(f'{datamart_api_url}/metadata/datasets/WDI/variables/access_to_electricity_of_population')
print(json.dumps(response.json(), indent=2))
{ "name": "Access to electricity (% of population)", "variable_id": "access_to_electricity_of_population", "dataset_id": "WDI", "description": "Access to electricity (% of population) in WDI", "corresponds_to_property": "PWDI-005", "qualifier": [ { "identifier": "P585", "name": "point in time" }, { "identifier": "P248", "name": "stated in" } ] }
GET /metadata/variables?keyword={keyword}
Query for variables related to: road
response = get(f'{datamart_api_url}/metadata/variables?keyword=road')
print(json.dumps(response.json(), indent=2))
[ { "variable_id": "road_fatalities", "name": " Road Fatalities", "rank": 0.0759909, "dataset_id": "OECD" }, { "variable_id": "mortality_caused_by_road_traffic_injury_per_100_000_people", "name": " Mortality caused by road traffic injury (per 100,000 people)", "rank": 0.0759909, "dataset_id": "WDI" }, { "variable_id": "VUAZ-8054", "name": " WDI: Mortality caused by road traffic injury[per 100,000 people]", "rank": 0.0607927, "dataset_id": "UAZ" } ]
Query for variables related to: road AND fatalities
response = get(f'{datamart_api_url}/metadata/variables?keyword=road fatalities')
print(json.dumps(response.json(), indent=2))
[ { "variable_id": "road_fatalities", "name": " Road Fatalities", "rank": 0.334428, "dataset_id": "OECD" } ]
Query for variables related to: road OR fatalities
response = get(f'{datamart_api_url}/metadata/variables?keyword=road&keyword=fatalities')
print(json.dumps(response.json(), indent=2))
[ { "variable_id": "road_fatalities", "name": " Road Fatalities", "rank": 0.0759909, "dataset_id": "OECD" }, { "variable_id": "mortality_caused_by_road_traffic_injury_per_100_000_people", "name": " Mortality caused by road traffic injury (per 100,000 people)", "rank": 0.0379954, "dataset_id": "WDI" }, { "variable_id": "VUAZ-8054", "name": " WDI: Mortality caused by road traffic injury[per 100,000 people]", "rank": 0.0303964, "dataset_id": "UAZ" }, { "variable_id": "VUAZ-8136", "name": " Conflict fatalities[number of cases]", "rank": 0.0303964, "dataset_id": "UAZ" } ]
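The three queries above differ only in how the keyword parameter is written: several words inside one keyword value behave as AND, while repeating the keyword parameter behaves as OR. The sketch below builds both URL forms with the standard library. The function name `keyword_search_url` and its `match_all` flag are hypothetical helpers for illustration, not part of the Datamart API, and the AND/OR behaviour is taken from the example outputs above rather than from server documentation.

```python
from urllib.parse import quote, urlencode

def keyword_search_url(base_url, keywords, match_all=False):
    """Build a /metadata/variables search URL.

    match_all=True  -> one keyword value with the words space-separated (AND)
    match_all=False -> the keyword parameter repeated once per word (OR)
    """
    if match_all:
        params = [("keyword", " ".join(keywords))]
    else:
        params = [("keyword", word) for word in keywords]
    # quote_via=quote encodes the space as %20 rather than +
    return f"{base_url}/metadata/variables?{urlencode(params, quote_via=quote)}"

base = "http://localhost:5000"
and_url = keyword_search_url(base, ["road", "fatalities"], match_all=True)
or_url = keyword_search_url(base, ["road", "fatalities"])
print(and_url)
print(or_url)
```

Either URL can then be passed to `get(...)` exactly as in the cells above.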
GET /datasets/{dataset_id}/variables/{variable_id}
response = get(f'{datamart_api_url}/datasets/WDI/variables/access_to_electricity_of_population')
df = pd.read_csv(StringIO(response.text))
display(HTML(df.fillna('').head(20).to_html(index=False)))
dataset_id | variable_id | variable | main_subject | main_subject_id | value | value_unit | time | time_precision | country | coordinate | stated_in | stated_in_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 73.600000 | | 2000-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 76.344460 | | 2001-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 77.307663 | | 2002-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 78.251656 | | 2003-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 79.171516 | | 2004-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.600000 | | 2005-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 80.943794 | | 2006-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.820259 | | 2007-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 82.708366 | | 2008-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 83.621689 | | 2009-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.389572 | | 2010-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 90.631691 | | 2011-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.300000 | | 2012-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 86.400000 | | 2013-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 88.803612 | | 2014-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.926506 | | 2015-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 91.058128 | | 2016-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.191200 | | 2017-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | The Gambia | Q1005 | 17.700000 | | 1993-01-01T00:00:00Z | year | The Gambia | POINT(-15.5, 13.5) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | The Gambia | Q1005 | 18.708818 | | 1994-01-01T00:00:00Z | year | The Gambia | POINT(-15.5, 13.5) | WDI | Q8035640 |
GET /datasets/{dataset_id}/variables/{variable_id}?country={country}
Get data for Gabon
response = get(f'{datamart_api_url}/datasets/WDI/variables/access_to_electricity_of_population?country=Gabon')
df = pd.read_csv(StringIO(response.text))
display(HTML(df.fillna('').to_html(index=False)))
dataset_id | variable_id | variable | main_subject | main_subject_id | value | value_unit | time | time_precision | country | coordinate | stated_in | stated_in_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 73.600000 | | 2000-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 76.344460 | | 2001-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 77.307663 | | 2002-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 78.251656 | | 2003-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 79.171516 | | 2004-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.600000 | | 2005-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 80.943794 | | 2006-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.820259 | | 2007-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 82.708366 | | 2008-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 83.621689 | | 2009-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.389572 | | 2010-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 90.631691 | | 2011-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.300000 | | 2012-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 86.400000 | | 2013-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 88.803612 | | 2014-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.926506 | | 2015-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 91.058128 | | 2016-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.191200 | | 2017-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
Get data for Gabon OR Guinea
response = get(f'{datamart_api_url}/datasets/WDI/variables/access_to_electricity_of_population?country=Gabon&country=Guinea')
df = pd.read_csv(StringIO(response.text))
display(HTML(df.fillna('').to_html(index=False)))
dataset_id | variable_id | variable | main_subject | main_subject_id | value | value_unit | time | time_precision | country | coordinate | stated_in | stated_in_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 73.600000 | | 2000-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 76.344460 | | 2001-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 77.307663 | | 2002-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 78.251656 | | 2003-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 79.171516 | | 2004-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.600000 | | 2005-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 80.943794 | | 2006-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 81.820259 | | 2007-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 82.708366 | | 2008-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 83.621689 | | 2009-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.389572 | | 2010-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 90.631691 | | 2011-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.300000 | | 2012-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 86.400000 | | 2013-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 88.803612 | | 2014-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 89.926506 | | 2015-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 91.058128 | | 2016-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Gabon | Q1000 | 92.191200 | | 2017-01-01T00:00:00Z | year | Gabon | POINT(11.5, -0.68333055555556) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 16.400000 | | 1999-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 16.503561 | | 2000-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 17.478863 | | 2001-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 18.439884 | | 2002-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 19.381701 | | 2003-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 20.299383 | | 2004-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 20.200000 | | 2005-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 22.067295 | | 2006-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 22.941578 | | 2007-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 23.827507 | | 2008-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 24.738651 | | 2009-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 25.688576 | | 2010-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 26.687136 | | 2011-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 26.200000 | | 2012-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 28.806410 | | 2013-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 29.909674 | | 2014-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 31.618164 | | 2015-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 33.500000 | | 2016-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
WDI | access_to_electricity_of_population | Access to electricity (% of population) | Guinea | Q1006 | 35.441216 | | 2017-01-01T00:00:00Z | year | Guinea | POINT(-11.0, 10.0) | WDI | Q8035640 |
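Once several countries are fetched in one request, the long CSV is often easier to compare when grouped by country and year. The following is a rough sketch using only the standard library. It assumes the CSV columns are named as in the table header above (country, time, value), the helper name `values_by_country` is made up for this example, and the sample text is a trimmed subset of columns and rows from the output above so that it runs without a server.

```python
import csv
from io import StringIO

def values_by_country(csv_text):
    """Group CSV rows into {country: {year: value}}."""
    series = {}
    for row in csv.DictReader(StringIO(csv_text)):
        year = int(row["time"][:4])  # time looks like 2000-01-01T00:00:00Z
        series.setdefault(row["country"], {})[year] = float(row["value"])
    return series

# Trimmed sample; in the notebook this would be response.text
sample_csv = (
    "main_subject,value,value_unit,time,time_precision,country\n"
    "Gabon,73.600000,,2000-01-01T00:00:00Z,year,Gabon\n"
    "Gabon,92.191200,,2017-01-01T00:00:00Z,year,Gabon\n"
    "Guinea,16.503561,,2000-01-01T00:00:00Z,year,Guinea\n"
    "Guinea,35.441216,,2017-01-01T00:00:00Z,year,Guinea\n"
)

series = values_by_country(sample_csv)
for country, values in series.items():
    print(country, values[2000], values[2017])
```

With pandas, a similar view could be obtained with `df.pivot(index='time', columns='country', values='value')` on the DataFrame built in the cell above.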
NOTE: If the following POST methods have already been run against the Datamart server, the server will respond with error messages.
POST /metadata/datasets
# Define a new dataset
test_dataset = {
    "name": "Test Dataset 01",
    "dataset_id": "TEST01",
    "description": "Test Dataset 01",
    "url": "http://test01.com/test"
}
# post it to the API
td_response = post(f'{datamart_api_url}/metadata/datasets', json=test_dataset)
print(json.dumps(td_response.json(), indent=2))
{ "name": "Test Dataset 01", "description": "Test Dataset 01", "url": "http://test01.com/test", "dataset_id": "TEST01" }
NOTE: If the above POST method has already been run against this Datamart server, the server will respond with:
{
"Error": "Dataset identifier TEST01 has already been used"
}
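One way to keep this cell safe to re-run is to look for the identifier in the /metadata/datasets listing before posting. The sketch below shows that check in isolation. The helper name `dataset_exists` is hypothetical, and the listing is a hand-written stand-in for the JSON returned by GET /metadata/datasets.

```python
def dataset_exists(datasets, dataset_id):
    """Return True if dataset_id appears in a /metadata/datasets listing."""
    return any(record.get("dataset_id") == dataset_id for record in datasets)

# Stand-in for get(f'{datamart_api_url}/metadata/datasets').json()
listing = [
    {"name": "WDI dataset", "dataset_id": "WDI"},
    {"name": "Test Dataset 01", "dataset_id": "TEST01"},
]

already_there = dataset_exists(listing, "TEST01")
print(already_there)

# In the notebook, the POST could then be guarded like this:
# if not dataset_exists(get(f'{datamart_api_url}/metadata/datasets').json(), 'TEST01'):
#     post(f'{datamart_api_url}/metadata/datasets', json=test_dataset)
```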
Retrieve all datasets
response = get(f'{datamart_api_url}/metadata/datasets')
print(json.dumps(response.json(), indent=2))
[ { "name": "FSI dataset", "description": "data downloaded from FSI", "url": "https://fragilestatesindex.org", "dataset_id": "FSI" }, { "name": "OECD dataset", "description": "data downloaded from OECD", "url": "https://data.oecd.org", "dataset_id": "OECD" }, { "name": "UAZ Indicators", "description": "Collection of indicators, including indicators from FAO, WDI, FEWSNET, CLiMIS, UNICEF, ieconomics.com, UNHCR, DSSAT, WHO, IMF, WHP, ACLDE, World Bank and IOM-DTM", "url": "https://github.com/ml4ai/delphi", "dataset_id": "UAZ" }, { "name": "WGI dataset", "description": "Worldwide Governance Indicators", "url": "https://databank.worldbank.org/source/worldwide-governance-indicators", "dataset_id": "WGI" }, { "name": "WDI dataset", "description": "World Development Indicators", "url": "https://databank.worldbank.org/source/world-development-indicators", "dataset_id": "WDI" }, { "name": "Corruption Perceptions Index", "description": "Transparency International Corruption Perceptions Index The CPI scores and ranks countries/territories based on how corrupt a country\u2019s public sector is perceived to be by experts and business executives. It is a composite index, a combination of 13 surveys and assessments of corruption, collected by a variety of reputable institutions. 
The CPI is the most widely used indicator of corruption worldwide.", "url": "https://www.transparency.org/", "dataset_id": "TICPI" }, { "name": "SIPRI Military Expenditure", "description": "Military expenditure by country, in millions of US$ at current prices and exchange rates, 1949-2018 - SIPRI 2019", "url": "https://sipri.org/databases/milex", "dataset_id": "SIPRI" }, { "name": "economic fitness dataset", "description": "EconomicFitness", "url": "https://databank.banquemondiale.org/source/economic-fitness", "dataset_id": "EconomicFitness" }, { "name": "Agricultural Market Information System (AMIS)", "description": "The Agricultural Market Information System (AMIS) is an inter-agency platform to enhance food market transparency and policy response for food security. It was launched in 2011 by the G20 Ministers of Agriculture following the global food price hikes in 2007/08 and 2010. Bringing together the principal trading countries of agricultural commodities, AMIS assesses global food supplies (focusing on wheat, maize, rice and soybeans) and provides a platform to coordinate policy action in times of market uncertainty.", "url": "http://www.amis-outlook.org", "dataset_id": "AMIS" }, { "name": "test test test", "description": "testy test", "url": "https://test.com", "dataset_id": "TEST000" }, { "name": "World Press Freedom Index", "description": "Published every year since 2002 by Reporters Without Borders (RSF), the World Press Freedom Index is an important advocacy tool based on the principle of emulation between states. The Index ranks 180 countries and regions\u00a0according to the level of freedom available to journalists.", "url": "https://rsf.org/en", "dataset_id": "WPFI" }, { "name": "Poverty Rate Global DP", "description": "Poverty Rate Global DP", "url": "http://url", "dataset_id": "DPPoverty" }, { "name": "Test Dataset 01", "description": "Test Dataset 01", "url": "http://test01.com/test", "dataset_id": "TEST01" } ]
The newly created dataset TEST01 is returned.
POST /metadata/datasets/{dataset_id}/variables
# define a new variable
test_variable = {
    "name": "test variable for test dataset",
    "variable_id": "TEST01-01"
}
tv_response = post(f'{datamart_api_url}/metadata/datasets/TEST01/variables', json=test_variable)
print(json.dumps(tv_response.json(), indent=2))
{ "name": "test variable for test dataset", "variable_id": "TEST01-01", "dataset_id": "TEST01", "corresponds_to_property": "PTEST01-TEST01-01" }
NOTE: If the above POST method has already been run against this Datamart server, the server will respond with:
{
"Error": "Variable TEST01-01 has already been defined in dataset TEST01"
}
Retrieve all variables for the dataset TEST01
response = get(f'{datamart_api_url}/metadata/datasets/TEST01/variables')
print(json.dumps(response.json(), indent=2))
[ { "name": "test variable for test dataset", "variable_id": "TEST01-01", "dataset_id": "TEST01" } ]
The variable TEST01-01 has been created in the dataset TEST01.
Let's upload some data to the dataset TEST01 and the variable TEST01-01.
PUT /datasets/{dataset_id}/variables/{variable_id}
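import os

def upload_data(file_path, url):
    # PUT the file to the given URL as a multipart upload and print the server's reply
    file_name = os.path.basename(file_path)
    # Open the file in a context manager so the handle is closed after the request
    with open(file_path, mode='rb') as f:
        files = {
            'file': (file_name, f, 'application/octet-stream')
        }
        response = put(url, files=files)
    if response.status_code == 400:
        # Validation errors come back as JSON; pretty-print them
        print(json.dumps(response.json(), indent=2))
    else:
        print(response.json())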
The upload data API validates the input file.
In the example below, the file test_sample_missing_header.csv is missing the required column main_subject.
All required columns are:
df = pd.read_csv('test/test_data/test_sample_missing_header.csv')
df
| | value | value_unit | time | time_precision | country |
|---|---|---|---|---|---|
| 0 | 1.8 | Annual growth % | 2021-01-01T00:00:00Z | year | belllgium |
| 1 | 1.9 | Annual growth % | 2022-01-01T00:00:00Z | year | bellgium |
Let's try to upload this file.
url = f'{datamart_api_url}/datasets/TEST01/variables/TEST01-01'
file_path = 'test/test_data/test_sample_missing_header.csv'
upload_data(file_path, url)
[ { "Error": "Missing required column: 'main_subject'", "Line Number": 1, "Column": "main_subject", "Description": "The uploaded file is missing a required column: main_subject. Please add the missing column and upload again." } ]
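A missing column can also be caught locally before the file is sent. The sketch below reads only the header row and reports which expected columns are absent. The helper name `missing_columns` is illustrative, and the list of required columns passed to it is an assumption made for this example: only main_subject is confirmed as required by the error above.

```python
import csv
from io import StringIO

def missing_columns(csv_text, required):
    """Return the names in `required` that are absent from the CSV header row."""
    header = next(csv.reader(StringIO(csv_text)), [])
    return [name for name in required if name not in header]

# Contents of test_sample_missing_header.csv as displayed above
sample = (
    "value,value_unit,time,time_precision,country\n"
    "1.8,Annual growth %,2021-01-01T00:00:00Z,year,belllgium\n"
    "1.9,Annual growth %,2022-01-01T00:00:00Z,year,bellgium\n"
)

# ASSUMPTION: illustrative list; the server's actual required columns may differ
assumed_required = ["main_subject", "value", "time", "time_precision", "country"]

missing = missing_columns(sample, assumed_required)
print(missing)
```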
As expected, the API returns an error about the missing column main_subject.
In the example below, the file test_sample_invalid.csv contains some invalid values in the required columns.
df = pd.read_csv('test/test_data/test_sample_invalid.csv')
df
| | main_subject | value | value_unit | time | time_precision | country | source | dataset_id | variable_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | shdjshduihskdj | fifty | Annual growth % | 20-01-01T00:00:00Z | blah | belllgium | OECD | FAO | fake_gdp_growth |
| 1 | bellgium | 1.9 | Annual growth % | 2022-01-01T00:00:00Z | year | shdjshduihskdj | OECD | OECD | real_gdp_growth |
Let's try to upload this file.
url = f'{datamart_api_url}/datasets/TEST01/variables/TEST01-01'
file_path = 'test/test_data/test_sample_invalid.csv'
upload_data(file_path, url)
[ { "Error": "Value Error: 'fifty'", "Line Number": 2, "Column": "value", "Description": "'fifty' is not a valid number" }, { "Error": "Illegal precision value: 'blah'", "Line Number": 2, "Column": "time_precision", "Description": "Legal precision values are: 'billion years,hundred million years,million years,hundred thousand years,ten thousand years,millennium,century,decade,year,month,day,hour,minute,second'" }, { "Error": "Could not wikify: 'shdjshduihskdj'", "Line Number": 2, "Column": "main_subject", "Description": "Could not find a Wikidata Qnode for the main subject: 'shdjshduihskdj.' Please check for spelling mistakes in the country name." }, { "Error": "Dataset ID in the file: 'FAO' is not same as Dataset ID in the url : 'TEST01'", "Line Number": 2, "Column": "dataset_id", "Description": "Dataset IDs in the input file should match the Dataset Id in the API url" }, { "Error": "Variable ID in the file: 'fake_gdp_growth' is not same as Variable ID in the url : 'TEST01-01'", "Line Number": 2, "Column": "variable_id", "Description": "Variable IDs in the input file should match the Variable Id in the API url" }, { "Error": "Invalid datetime format: '20-01-01T00:00:00Z'", "Line Number": 2, "Column": "time", "Description": "Invalid format to specify time. Valid format: '%Y-%m-%dT%H:%M:%SZ' Explanation: %Y - Year with century as a decimal number (2010, 2020 etc). %m - Month as a zero-padded decimal number(01, 02,..,12). %d - Day of the month as a zero-padded decimal number. (01,02,..,31). %H - Hour (24-hour clock) as a zero-padded decimal number. (00, 01,..,23). %M - Minute as a zero-padded decimal number.(00, 01,...,59). %S - Second as a zero-padded decimal number.(00, 01,...,59). A valid date: '2020-02-27T13:45:44Z'" }, { "Error": "Could not wikify: 'shdjshduihskdj'", "Line Number": 3, "Column": "country", "Description": "Could not find a Wikidata Qnode for the country: 'shdjshduihskdj'. Please check for spelling mistakes in the country name." } ]
The API lists all the errors in the file; they have to be fixed before the file can be uploaded.
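Some of these checks can be approximated locally before uploading. The sketch below covers three of them using the rules quoted in the error messages above: value must be a number, time_precision must be one of the listed legal values, and time must match '%Y-%m-%dT%H:%M:%SZ'. The function name `check_row` is hypothetical, and this is not the server's validation code; checks such as wikification of main_subject and country can only be done by the server.

```python
from datetime import datetime

# Legal values and time format as quoted in the API error messages above
LEGAL_PRECISIONS = {
    "billion years", "hundred million years", "million years",
    "hundred thousand years", "ten thousand years", "millennium",
    "century", "decade", "year", "month", "day", "hour", "minute", "second",
}
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def check_row(row):
    """Return the names of the columns in `row` that fail the local checks."""
    problems = []
    try:
        float(row["value"])
    except ValueError:
        problems.append("value")
    if row["time_precision"] not in LEGAL_PRECISIONS:
        problems.append("time_precision")
    try:
        datetime.strptime(row["time"], TIME_FORMAT)
    except ValueError:
        problems.append("time")
    return problems

# The two rows of test_sample_invalid.csv, reduced to the checked columns
bad_row = {"value": "fifty", "time": "20-01-01T00:00:00Z", "time_precision": "blah"}
good_row = {"value": "1.9", "time": "2022-01-01T00:00:00Z", "time_precision": "year"}

print(check_row(bad_row))
print(check_row(good_row))
```

A local pass like this only shortens the feedback loop; the server's response remains the authority on whether a file is valid.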
We will now upload the contents of the file test/test_data/test_sample.csv, which is a valid file.
df = pd.read_csv('test/test_data/test_sample.csv')
df
| | main_subject | value | value_unit | time | time_precision | country | source | dataset_id | variable_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | belllgium | 1.8 | Annual growth % | 2019-01-01T00:00:00Z | year | belllgium | OECD | TEST01 | TEST01-01 |
| 1 | bellgium | 1.9 | Annual growth % | 2020-01-01T00:00:00Z | year | bellgium | OECD | TEST01 | TEST01-01 |
url = f'{datamart_api_url}/datasets/TEST01/variables/TEST01-01'
file_path = 'test/test_data/test_sample.csv'
upload_data(file_path, url)
2 rows imported!
Get the data for the variable TEST01-01 to check that it was added.
response = get(f'{datamart_api_url}/datasets/TEST01/variables/TEST01-01')
df = pd.read_csv(StringIO(response.text))
display(HTML(df.to_html()))
| | dataset_id | variable_id | variable | main_subject | main_subject_id | value | value_unit | time | time_precision | country | coordinate | stated_in | stated_in_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TEST01 | TEST01-01 | test variable for test dataset | Belgium | Q31 | 1.8 | Annual growth % | 2019-01-01T00:00:00Z | year | Belgium | POINT(4.6680555555556, 50.641111111111) | OECD | QTEST01Source-0 |
| 1 | TEST01 | TEST01-01 | test variable for test dataset | Belgium | Q31 | 1.9 | Annual growth % | 2020-01-01T00:00:00Z | year | Belgium | POINT(4.6680555555556, 50.641111111111) | OECD | QTEST01Source-0 |
Success! The 2 rows from 2019 and 2020 were added.
Delete the rows added to the dataset so that this Jupyter notebook can be run again.
response = delete(f'{datamart_api_url}/datasets/TEST01/variables/TEST01-01')
The data has been deleted