This notebook provides a brief demonstration of how to access the World Bank Indicators data using pandas.
A wrapper for the API ships with the main pandas distribution as part of its Remote Data Access support.
#First of all we need to load in the pandas library...
import pandas as pd
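#...and the pandas remote data access support for calls to the World Bank Indicators API
#(in more recent pandas releases this module lives in the separate pandas-datareader package: from pandas_datareader import wb)
from pandas.io import wb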
The easiest way to identify an indicator is to search for it by name using a keyword or key phrase.
wb.search('fertility rate')
| | id | name | source | sourceNote | sourceOrganization | topics |
---|---|---|---|---|---|---|
6554 | SP.ADO.TFRT | Adolescent fertility rate (births per 1,000 wo... | World Development Indicators | Adolescent fertility rate is the number of bir... | b'United Nations Population Division, World Po... | Social Development ; Health ; Gender |
6594 | SP.DYN.TFRT.IN | Fertility rate, total (births per woman) | World Development Indicators | Total fertility rate represents the number of ... | b'(1) United Nations Population Division. Worl... | Gender ; Health |
6595 | SP.DYN.TFRT.Q1 | Total fertility rate (TFR) (births per woman)... | Health Nutrition and Population Statistics by ... | Total fertility rate (TFR): The number of chil... | b'Household Surveys (DHS, MICS)' | Health |
6596 | SP.DYN.TFRT.Q2 | Total fertility rate (TFR) (births per woman)... | Health Nutrition and Population Statistics by ... | Total fertility rate (TFR): The number of chil... | b'Household Surveys (DHS, MICS)' | Health |
6597 | SP.DYN.TFRT.Q3 | Total fertility rate (TFR) (births per woman)... | Health Nutrition and Population Statistics by ... | Total fertility rate (TFR): The number of chil... | b'Household Surveys (DHS, MICS)' | Health |
6598 | SP.DYN.TFRT.Q4 | Total fertility rate (TFR) (births per woman)... | Health Nutrition and Population Statistics by ... | Total fertility rate (TFR): The number of chil... | b'Household Surveys (DHS, MICS)' | Health |
6599 | SP.DYN.TFRT.Q5 | Total fertility rate (TFR) (births per woman)... | Health Nutrition and Population Statistics by ... | Total fertility rate (TFR): The number of chil... | b'Household Surveys (DHS, MICS)' | Health |
6602 | SP.DYN.WFRT | Wanted fertility rate (births per woman) | World Development Indicators | Wanted fertility rate is an estimate of what t... | b'Demographic and Health Surveys by ICF Intern... | Health ; Gender |
6603 | SP.DYN.WFRT.Q1 | Total wanted fertility rate (births per woman)... | Health Nutrition and Population Statistics by ... | Total wanted fertility rate: Total wanted fert... | b'Household Surveys (DHS, MICS)' | Health |
6604 | SP.DYN.WFRT.Q2 | Total wanted fertility rate (births per woman)... | Health Nutrition and Population Statistics by ... | Total wanted fertility rate: Total wanted fert... | b'Household Surveys (DHS, MICS)' | Health |
6605 | SP.DYN.WFRT.Q3 | Total wanted fertility rate (births per woman)... | Health Nutrition and Population Statistics by ... | Total wanted fertility rate: Total wanted fert... | b'Household Surveys (DHS, MICS)' | Health |
6606 | SP.DYN.WFRT.Q4 | Total wanted fertility rate (births per woman)... | Health Nutrition and Population Statistics by ... | Total wanted fertility rate: Total wanted fert... | b'Household Surveys (DHS, MICS)' | Health |
6607 | SP.DYN.WFRT.Q5 | Total wanted fertility rate (births per woman)... | Health Nutrition and Population Statistics by ... | Total wanted fertility rate: Total wanted fert... | b'Household Surveys (DHS, MICS)' | Health |
#We can also get a full list of indicators
indicators=wb.get_indicators()
#Preview first few rows of indicators list
indicators[:5]
| | id | name | source | sourceNote | sourceOrganization | topics |
---|---|---|---|---|---|---|
0 | 1.0.HCount.1.25usd | Poverty Headcount ($1.25 a day) | LAC Equity Lab | The poverty headcount index measures the propo... | b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... | Poverty |
1 | 1.0.HCount.10usd | Under Middle Class ($10 a day) Headcount | LAC Equity Lab | The poverty headcount index measures the propo... | b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... | Poverty |
2 | 1.0.HCount.2.5usd | Poverty Headcount ($2.50 a day) | LAC Equity Lab | The poverty headcount index measures the propo... | b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... | Poverty |
3 | 1.0.HCount.Mid10to50 | Middle Class ($10-50 a day) Headcount | LAC Equity Lab | The poverty headcount index measures the propo... | b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... | Poverty |
4 | 1.0.HCount.Ofcl | Official Moderate Poverty Rate-National | LAC Equity Lab | The poverty headcount index measures the propo... | b'LAC Equity Lab tabulations of data from Nati... | Poverty |
If you know only part of an indicator's name, you can search with a pattern. The search string is matched against the indicator name and is treated as a regular expression, so use .* (rather than a bare *) as the wildcard for any run of characters.
wb.search('gdp.*capita.*const')
| | id | name | source | sourceNote | sourceOrganization | topics |
---|---|---|---|---|---|---|
700 | 6.0.GDPpc_constant | GDP per capita, PPP (constant 2011 internation... | LAC Equity Lab | GDP per capita based on purchasing power parit... | b'NULWorld Development Indicators (World Bank)L' | Economy & Growth |
3496 | GDPPCKD | GDP per Capita, constant US$, millions | GEP Economic Prospects | GDP per capita is gross domestic product divid... | b'World Bank staff calculations based on World... | Economy & Growth |
5530 | NY.GDP.PCAP.KD | GDP per capita (constant 2005 US$) | World Development Indicators | GDP per capita is gross domestic product divid... | b'World Bank national accounts data, and OECD ... | Economy & Growth |
5532 | NY.GDP.PCAP.KN | GDP per capita (constant LCU) | World Development Indicators | GDP per capita is gross domestic product divid... | b'World Bank national accounts data, and OECD ... | Economy & Growth |
5534 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2011 internation... | World Development Indicators | GDP per capita based on purchasing power parit... | b'World Bank, International Comparison Program... | Economy & Growth |
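To see how such a pattern behaves, here is a minimal offline sketch that applies it with Python's re module to a few indicator names copied from the tables above. It does not call the World Bank API; the names dictionary and the matches list are introduced purely for illustration, and the case-insensitive flag is an assumption about how the search treats case.

```python
import re

# Indicator ids and names copied from the search results shown above
names = {
    'NY.GDP.PCAP.KD': 'GDP per capita (constant 2005 US$)',
    'NY.GDP.PCAP.KN': 'GDP per capita (constant LCU)',
    'SP.DYN.TFRT.IN': 'Fertility rate, total (births per woman)',
    'SP.DYN.WFRT': 'Wanted fertility rate (births per woman)',
}

# '.*' matches any run of characters, so this pattern looks for
# 'gdp', then 'capita', then 'const', in that order
# (ignoring case is an assumption made for this sketch)
pattern = re.compile('gdp.*capita.*const', re.IGNORECASE)

# Keep the ids whose name matches the pattern anywhere in the string
matches = [code for code, name in names.items() if pattern.search(name)]
print(matches)
```

Only the two GDP per capita indicators match; the fertility indicators contain none of the three fragments.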
When retrieving a dataset, we can specify which country, countries or regions we want the data for. Locations are identified by their two-letter ISO code (the iso2c column in the country list). To look up the codes, we can download the full country list.
#We can get a list of the countries and regions that indicator data may be available for
countries=wb.get_countries()
#Preview first few rows of countries list
countries[:5]
| | adminregion | capitalCity | iso3c | incomeLevel | iso2c | latitude | lendingType | longitude | name | region |
---|---|---|---|---|---|---|---|---|---|---|
0 | | Oranjestad | ABW | High income: nonOECD | AW | 12.5167 | Not classified | -70.0167 | Aruba | Latin America & Caribbean (all income levels) |
1 | South Asia | Kabul | AFG | Low income | AF | 34.5228 | IDA | 69.1761 | Afghanistan | South Asia |
2 | | | AFR | Aggregates | A9 | | Aggregates | | Africa | Aggregates |
3 | Sub-Saharan Africa (developing only) | Luanda | AGO | Upper middle income | AO | -8.81155 | IBRD | 13.242 | Angola | Sub-Saharan Africa (all income levels) |
4 | Europe & Central Asia (developing only) | Tirane | ALB | Upper middle income | AL | 41.3317 | IBRD | 19.8172 | Albania | Europe & Central Asia (all income levels) |
#pandas dataframes allow us to search within the country list for a particular country
countries[ countries['name'] == 'Angola' ]
| | adminregion | capitalCity | iso3c | incomeLevel | iso2c | latitude | lendingType | longitude | name | region |
---|---|---|---|---|---|---|---|---|---|---|
3 | Sub-Saharan Africa (developing only) | Luanda | AGO | Upper middle income | AO | -8.81155 | IBRD | 13.242 | Angola | Sub-Saharan Africa (all income levels) |
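The same kind of filter can turn a list of country names into the two-letter codes needed for a download. The sketch below works on a small hand-built frame (countries_sample, a hypothetical stand-in holding a few rows copied from the country list above) rather than on a live API result.

```python
import pandas as pd

# A few rows copied from the country list shown above (not a live API call)
countries_sample = pd.DataFrame({
    'name': ['Aruba', 'Afghanistan', 'Angola', 'Albania'],
    'iso2c': ['AW', 'AF', 'AO', 'AL'],
    'iso3c': ['ABW', 'AFG', 'AGO', 'ALB'],
})

# Select the rows whose name is in the list of interest...
wanted = ['Angola', 'Albania']
selected = countries_sample[countries_sample['name'].isin(wanted)]

# ...and pull out the two-letter codes as a plain Python list
codes = selected['iso2c'].tolist()
print(codes)
```

A list like codes is in the form expected for the country value when downloading data.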
Once you have identified one or more indicators for which you would like to download a dataset, you need to specify the year or range of years, and the country, countries or regions (identified by their two-letter ISO codes) for which you would like the data.
#Download data from the World Bank API into a dataframe
df = wb.download(
    #Use the indicator argument to identify which indicator or indicators to download
    indicator='NY.GDP.PCAP.KD',
    #Use the country argument to identify the countries you want data for
    country=['US', 'CA', 'MX'],
    #Identify the first year for which you want the data, as an integer or a string
    start='2008',
    #Identify the last year for which you want the data, as an integer or a string
    end=2010
)
#Show the dataframe
df
country | year | NY.GDP.PCAP.KD |
---|---|---|
Canada | 2010 | 36466.815112 |
Canada | 2009 | 35671.659294 |
Canada | 2008 | 37088.020368 |
Mexico | 2010 | 8084.629000 |
Mexico | 2009 | 7788.271761 |
Mexico | 2008 | 8275.809458 |
United States | 2010 | 43952.436548 |
United States | 2009 | 43234.451155 |
United States | 2008 | 44872.653626 |
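The downloaded dataframe is indexed by both country and year. As a rough sketch of how to work with that two-level index, the example below rebuilds the result by hand from the values shown above (so it runs without network access), then selects one country and pivots the countries into columns. The names gdp, canada and wide are illustrative, and storing the years as strings is an assumption about the real result.

```python
import pandas as pd

# Rebuild the result shown above by hand (values copied from the output);
# years are stored as strings here, which is an assumption
index = pd.MultiIndex.from_product(
    [['Canada', 'Mexico', 'United States'], ['2010', '2009', '2008']],
    names=['country', 'year'],
)
gdp = pd.DataFrame(
    {'NY.GDP.PCAP.KD': [36466.815112, 35671.659294, 37088.020368,
                        8084.629000, 7788.271761, 8275.809458,
                        43952.436548, 43234.451155, 44872.653626]},
    index=index,
)

# Select all the years for a single country from the country index level
canada = gdp.xs('Canada', level='country')

# Pivot the country level into columns: one row per year, one column per country
wide = gdp['NY.GDP.PCAP.KD'].unstack(level='country')
print(wide)
```

The wide layout is often more convenient for plotting or for comparing countries side by side.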
#To download data for multiple indicators, specify them as a list
wb.download( indicator=['SP.DYN.TFRT.IN','NY.GDP.PCAP.KD'], country=['US','GB'], start=2008, end=2010 )
country | year | SP.DYN.TFRT.IN | NY.GDP.PCAP.KD |
---|---|---|---|
United Kingdom | 2010 | 1.920 | 37600.293399 |
United Kingdom | 2009 | 1.890 | 37277.481537 |
United Kingdom | 2008 | 1.910 | 39608.431481 |
United States | 2010 | 1.931 | 43952.436548 |
United States | 2009 | 2.002 | 43234.451155 |
United States | 2008 | 2.072 | 44872.653626 |
#We can download data for a single year by setting the start and end dates to the same year
#To download data for a single country, you do not need to specify it as a list
df = wb.download( indicator='NY.GDP.PCAP.KD', country='US', start=2008, end=2008 )
#Show the dataframe
df
country | year | NY.GDP.PCAP.KD |
---|---|---|
United States | 2008 | 44872.653626 |
#To download the data for all countries, set the country argument to 'all'
df = wb.download( indicator='SP.DYN.TFRT.IN', country='all', start=2010, end=2010 )
#Show a preview of the first few rows of the dataframe
df[:10]
country | year | SP.DYN.TFRT.IN |
---|---|---|
Andean Region | 2010 | NaN |
Arab World | 2010 | 3.297409 |
Caribbean small states | 2010 | 2.224484 |
Central Europe and the Baltics | 2010 | 1.445135 |
East Asia & Pacific (all income levels) | 2010 | 1.817921 |
East Asia & Pacific (developing only) | 2010 | 1.856321 |
East Asia and the Pacific (IFC classification) | 2010 | NaN |
Euro area | 2010 | 1.573820 |
Europe & Central Asia (all income levels) | 2010 | 1.723861 |
Europe & Central Asia (developing only) | 2010 | 1.979588 |
Notice that selecting all countries also returns values for regional groupings, not just for individual countries.
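If only individual countries are wanted, one possible approach is to drop the rows whose name is flagged as an aggregate in the country list. The sketch below uses small hand-built stand-ins (fertility and countries_sample, both hypothetical names) with values copied from the outputs above. The region labels for the United Kingdom and the United States, and the assumption that the groupings shown carry the region value 'Aggregates', are illustrative and should be checked against the real country list.

```python
import pandas as pd

# Hand-built stand-in for a country='all' result (values copied from the outputs above)
index = pd.MultiIndex.from_tuples(
    [('Arab World', '2010'), ('Euro area', '2010'),
     ('United Kingdom', '2010'), ('United States', '2010')],
    names=['country', 'year'],
)
fertility = pd.DataFrame(
    {'SP.DYN.TFRT.IN': [3.297409, 1.573820, 1.920, 1.931]},
    index=index,
)

# Hand-built stand-in for the country list; the region labels are assumptions
countries_sample = pd.DataFrame({
    'name': ['Arab World', 'Euro area', 'United Kingdom', 'United States'],
    'region': ['Aggregates', 'Aggregates',
               'Europe & Central Asia', 'North America'],
})

# Names of the entries that are regional groupings rather than countries
aggregates = countries_sample.loc[countries_sample['region'] == 'Aggregates', 'name']

# Keep only the rows whose country level is not one of those groupings
is_aggregate = fertility.index.get_level_values('country').isin(aggregates)
countries_only = fertility[~is_aggregate]
print(countries_only)
```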
The pandas support for remote data access makes it easy for us to get data from the World Bank Indicators API into a pandas dataframe, where we can start to work with it.