The World Factbook provides basic intelligence on the history, people, government, economy, energy, geography, environment, communications, transportation, military, terrorism, and transnational issues for 266 world entities. The World Factbook, produced for US policymakers and coordinated throughout the US Intelligence Community, presents the basic realities about the world in which we live. Information in The Factbook is collected from – and coordinated with – a wide variety of US Government agencies, as well as from hundreds of published sources.
In this project, I'm using data from the CIA World Factbook, which contains demographic information like the following:
Population — the global population. Population growth — the annual population growth rate, as a percentage. Area — the total land and water area.
This dataset is available here
In this project, I'll use SQL in Jupyter Notebook to analyze data from this database
Install ipython-sql by running the following code:
!conda install -yc conda-forge ipython-sql
Collecting package metadata (repodata.json): ...working... done Solving environment: ...working... Warning: >10 possible package resolutions (only showing differing packages): - https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8, https://repo.continuum.io/pkgs/main/win-64/pandoc-1.19.2.1-hb2460c7_1.tar.bz2/win-64::pandoc-1.19.2.1-hb2460c7_1 - defaults/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8, https://repo.continuum.io/pkgs/main/win-64/pandoc-1.19.2.1-hb2460c7_1.tar.bz2/win-64::pandoc-1.19.2.1-hb2460c7_1 - defaults/win-64::intel-openmp-2018.0.0-hd92c6cd_8, https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/pandoc-1.19.2.1-hb2460c7_1.tar.bz2/win-64::pandoc-1.19.2.1-hb2460c7_1 - defaults/win-64::conda-env-2.6.0-h36134e3_1, defaults/win-64::intel-openmp-2018.0.0-hd92c6cd_8, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/pandoc-1.19.2.1-hb2460c7_1.tar.bz2/win-64::pandoc-1.19.2.1-hb2460c7_1 - defaults/win-64::intel-openmp-2018.0.0-hd92c6cd_8, defaults/win-64::pandoc-1.19.2.1-hb2460c7_1, https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3 - defaults/win-64::conda-env-2.6.0-h36134e3_1, defaults/win-64::intel-openmp-2018.0.0-hd92c6cd_8, defaults/win-64::pandoc-1.19.2.1-hb2460c7_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3 - defaults/win-64::pandoc-1.19.2.1-hb2460c7_1, https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8 - defaults/win-64::conda-env-2.6.0-h36134e3_1, defaults/win-64::pandoc-1.19.2.1-hb2460c7_1, https://repo.continuum.io/pkgs/main/win-64/console_shortcut-0.1.1-h6bb2dd7_3.tar.bz2/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8 - defaults/win-64::console_shortcut-0.1.1-h6bb2dd7_3, defaults/win-64::pandoc-1.19.2.1-hb2460c7_1, https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8 - defaults/win-64::console_shortcut-0.1.1-h6bb2dd7_3, https://repo.continuum.io/pkgs/main/win-64/conda-env-2.6.0-h36134e3_1.tar.bz2/win-64::conda-env-2.6.0-h36134e3_1, https://repo.continuum.io/pkgs/main/win-64/intel-openmp-2018.0.0-hd92c6cd_8.tar.bz2/win-64::intel-openmp-2018.0.0-hd92c6cd_8, https://repo.continuum.io/pkgs/main/win-64/pandoc-1.19.2.1-hb2460c7_1.tar.bz2/win-64::pandoc-1.19.2.1-hb2460c7_1 ... and othersdone ## Package Plan ## environment location: C:\Users\basun\Anaconda3 added / updated specs: - ipython-sql The following packages will be downloaded: package | build ---------------------------|----------------- _anaconda_depends-5.1.0 | py36_2 5 KB anaconda-custom | py36_1 2 KB ca-certificates-2021.10.8 | h5b45459_0 176 KB conda-forge certifi-2021.5.30 | py36ha15d459_0 142 KB conda-forge conda-4.10.3 | py36ha15d459_2 3.1 MB conda-forge importlib-metadata-4.8.1 | py36ha15d459_0 32 KB conda-forge ipython-sql-0.3.9 | pyhd8ed1ab_1004 18 KB conda-forge openssl-1.0.2u | hfa6e2cd_0 5.4 MB conda-forge prettytable-2.5.0 | pyhd8ed1ab_0 25 KB conda-forge python_abi-3.6 | 2_cp36m 4 KB conda-forge sqlparse-0.4.2 | pyhd8ed1ab_0 34 KB conda-forge typing_extensions-4.0.1 | pyha770c72_0 26 KB conda-forge zipp-3.7.0 | pyhd8ed1ab_0 12 KB conda-forge ------------------------------------------------------------ Total: 9.0 MB The following NEW packages will be INSTALLED: _anaconda_depends pkgs/main/win-64::_anaconda_depends-5.1.0-py36_2 importlib-metadata conda-forge/win-64::importlib-metadata-4.8.1-py36ha15d459_0 ipython-sql conda-forge/noarch::ipython-sql-0.3.9-pyhd8ed1ab_1004 prettytable conda-forge/noarch::prettytable-2.5.0-pyhd8ed1ab_0 python_abi conda-forge/win-64::python_abi-3.6-2_cp36m sqlparse conda-forge/noarch::sqlparse-0.4.2-pyhd8ed1ab_0 typing_extensions conda-forge/noarch::typing_extensions-4.0.1-pyha770c72_0 zipp conda-forge/noarch::zipp-3.7.0-pyhd8ed1ab_0 The following packages will be UPDATED: ca-certificates pkgs/main::ca-certificates-2021.7.5-h~ --> conda-forge::ca-certificates-2021.10.8-h5b45459_0 conda pkgs/main::conda-4.10.3-py36haa95532_0 --> conda-forge::conda-4.10.3-py36ha15d459_2 The following packages will be SUPERSEDED by a higher-priority channel: certifi pkgs/main::certifi-2021.5.30-py36haa9~ --> conda-forge::certifi-2021.5.30-py36ha15d459_0 openssl pkgs/main::openssl-1.0.2u-he774522_0 --> conda-forge::openssl-1.0.2u-hfa6e2cd_0 The following packages will be DOWNGRADED: anaconda 5.1.0-py36_2 --> custom-py36_1 Downloading and Extracting Packages _anaconda_depends-5. | 5 KB | | 0% _anaconda_depends-5. | 5 KB | ########## | 100% _anaconda_depends-5. | 5 KB | ########## | 100% python_abi-3.6 | 4 KB | | 0% python_abi-3.6 | 4 KB | ########## | 100% python_abi-3.6 | 4 KB | ########## | 100% certifi-2021.5.30 | 142 KB | | 0% certifi-2021.5.30 | 142 KB | ########## | 100% certifi-2021.5.30 | 142 KB | ########## | 100% anaconda-custom | 2 KB | | 0% anaconda-custom | 2 KB | ########## | 100% ipython-sql-0.3.9 | 18 KB | | 0% ipython-sql-0.3.9 | 18 KB | ########## | 100% ca-certificates-2021 | 176 KB | | 0% ca-certificates-2021 | 176 KB | ########## | 100% ca-certificates-2021 | 176 KB | ########## | 100% conda-4.10.3 | 3.1 MB | | 0% conda-4.10.3 | 3.1 MB | #####4 | 54% conda-4.10.3 | 3.1 MB | ########## | 100% openssl-1.0.2u | 5.4 MB | | 0% openssl-1.0.2u | 5.4 MB | ###3 | 33% openssl-1.0.2u | 5.4 MB | ########## | 100% openssl-1.0.2u | 5.4 MB | ########## | 100% sqlparse-0.4.2 | 34 KB | | 0% sqlparse-0.4.2 | 34 KB | ########## | 100% sqlparse-0.4.2 | 34 KB | ########## | 100% typing_extensions-4. | 26 KB | | 0% typing_extensions-4. | 26 KB | ########## | 100% zipp-3.7.0 | 12 KB | | 0% zipp-3.7.0 | 12 KB | ########## | 100% prettytable-2.5.0 | 25 KB | | 0% prettytable-2.5.0 | 25 KB | ######4 | 65% prettytable-2.5.0 | 25 KB | ########## | 100% importlib-metadata-4 | 32 KB | | 0% importlib-metadata-4 | 32 KB | ########## | 100% importlib-metadata-4 | 32 KB | ########## | 100% Preparing transaction: ...working... done Verifying transaction: ...working... done Executing transaction: ...working... done
The environment is inconsistent, please check the package plan carefully The following packages are causing the inconsistency: - https://repo.continuum.io/pkgs/main/win-64/anaconda-5.1.0-py36_2.tar.bz2/win-64::anaconda==5.1.0=py36_2 ==> WARNING: A newer version of conda exists. <== current version: 4.10.3 latest version: 4.11.0 Please update conda by running $ conda update -n base conda
%%capture
%load_ext sql
%sql sqlite:///factbook.db
Checking all the tables available in factbook db
%%sql
SELECT *
FROM sqlite_master
WHERE type='table';
* sqlite:///factbook.db Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
I'll be using facts table to retrieve the required data and analyze the dataset
%%sql
select * from facts
limit 5;
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 |
From the above result, I get the different columns in the dataset and for clarity, below is the description of each column
Calculating summary statistics to check for any outlier countries. Checking the minimum population, maximum population, minimum population_growth and maximum population_growth
%%sql
select min(population) as 'Minimum Population',
max(population) as 'Maximum Population',
min(population_growth) as 'Minimun Population Growth',
max(population_growth) as 'Maximum Population growth'
from facts;
* sqlite:///factbook.db Done.
Minimum Population | Maximum Population | Minimun Population Growth | Maximum Population growth |
---|---|---|---|
0 | 7256490011 | 0.0 | 4.02 |
The above query shows that there is a country which has zero population and the country with maximum populaton is same as whole world's population i.e 7.2 billion. This does not seem right so let's check the names of these countries
%%sql
select name, population from facts
where population in (SELECT MAX(population) FROM facts)
or population in (SELECT MIN(population) FROM facts);
* sqlite:///factbook.db Done.
name | population |
---|---|
Antarctica | 0 |
World | 7256490011 |
It seems like the table contains a row for the whole world, which explains the population of over 7.2 billion. It also seems like the table contains a row for Antarctica, which explains the population of 0. This seems to match the CIA Factbook page for Antarctica:
Calculating the minimum and maximum population and population growth exculding the World to get the maximum population amongst all countries
%%sql
select min(population) as 'Minimum Population',
max(population) as 'Maximum Population',
min(population_growth) as 'Minimum Population growth',
max(population_growth) as 'Maximum Population growth'
from facts
where name !='World';
* sqlite:///factbook.db Done.
Minimum Population | Maximum Population | Minimum Population growth | Maximum Population growth |
---|---|---|---|
0 | 1367485388 | 0.0 | 4.02 |
The country with highest population has 1.3 billion people. Now getting the Average population and area excludig World from name to find more accurate averages
%%sql
select round(avg(population),4) as 'Average Population',
round(avg(area),4) as 'Average Area'
from facts
where name != 'World';
* sqlite:///factbook.db Done.
Average Population | Average Area |
---|---|
32242666.5685 | 555093.5462 |
Now we know that average population is 32 million and average area is 555094 sq km . Average population at 32 million is pretty surprising. It forces me to think that most people are concentrated in certain parts of the world.
Checking the most populous country and largest country in the world.
%%sql
select name, max(population) as Population from facts
where name != 'World';
* sqlite:///factbook.db Done.
name | Population |
---|---|
China | 1367485388 |
%%sql
select name, max(area) as Area from facts
where name != 'World';
* sqlite:///factbook.db Done.
name | Area |
---|---|
Russia | 17098242 |
Now that we know, China is the most populous country and Russia has the largest area, I want to find answers to few more questions regarding population and area of all countries.
Finding all the countries that have above average population
%%sql
select name, population from facts
where (population > (select round(avg(population),2)
from facts where name != 'World'))
and name != 'World'
order by population desc;
* sqlite:///factbook.db Done.
name | population |
---|---|
China | 1367485388 |
India | 1251695584 |
European Union | 513949445 |
United States | 321368864 |
Indonesia | 255993674 |
Brazil | 204259812 |
Pakistan | 199085847 |
Nigeria | 181562056 |
Bangladesh | 168957745 |
Russia | 142423773 |
Japan | 126919659 |
Mexico | 121736809 |
Philippines | 100998376 |
Ethiopia | 99465819 |
Vietnam | 94348835 |
Egypt | 88487396 |
Iran | 81824270 |
Germany | 80854408 |
Turkey | 79414269 |
Congo, Democratic Republic of the | 79375136 |
Thailand | 67976405 |
France | 66553766 |
United Kingdom | 64088222 |
Italy | 61855120 |
Burma | 56320206 |
South Africa | 53675563 |
Tanzania | 51045882 |
Korea, South | 49115196 |
Spain | 48146134 |
Colombia | 46736728 |
Kenya | 45925301 |
Ukraine | 44429471 |
Argentina | 43431886 |
Algeria | 39542166 |
Poland | 38562189 |
Uganda | 37101745 |
Iraq | 37056169 |
Sudan | 36108853 |
Canada | 35099836 |
Morocco | 33322699 |
Afghanistan | 32564342 |
There are just a handfull of countries with population more than the average population which means most of the country in the world has below average population. This gives us a clear picture of where the most people in the world live.
I want to see countries with below average area
%%sql
select name, area from facts where area <
(select round(avg(area),3) from facts
where name != 'World')
order by area desc;
* sqlite:///factbook.db Done.
name | area |
---|---|
Yemen | 527968 |
Thailand | 513120 |
Spain | 505370 |
Turkmenistan | 488100 |
Cameroon | 475440 |
Papua New Guinea | 462840 |
Sweden | 450295 |
Uzbekistan | 447400 |
Morocco | 446550 |
Iraq | 438317 |
Paraguay | 406752 |
Zimbabwe | 390757 |
Japan | 377915 |
Germany | 357022 |
Congo, Republic of the | 342000 |
Finland | 338145 |
Vietnam | 331210 |
Malaysia | 329847 |
Norway | 323802 |
Cote d'Ivoire | 322463 |
Poland | 312685 |
Oman | 309500 |
Italy | 301340 |
Philippines | 300000 |
Ecuador | 283561 |
Burkina Faso | 274200 |
New Zealand | 267710 |
Gabon | 267667 |
Western Sahara | 266000 |
Guinea | 245857 |
United Kingdom | 243610 |
Uganda | 241038 |
Ghana | 238533 |
Romania | 238391 |
Laos | 236800 |
Guyana | 214969 |
Belarus | 207600 |
Kyrgyzstan | 199951 |
Senegal | 196722 |
Syria | 185180 |
Cambodia | 181035 |
Uruguay | 176215 |
Suriname | 163820 |
Tunisia | 163610 |
Bangladesh | 148460 |
Nepal | 147181 |
Tajikistan | 144100 |
Greece | 131957 |
Nicaragua | 130370 |
Korea, North | 120538 |
Malawi | 118484 |
Eritrea | 117600 |
Benin | 112622 |
Honduras | 112090 |
Liberia | 111369 |
Bulgaria | 110879 |
Cuba | 110860 |
Guatemala | 108889 |
Iceland | 103000 |
Korea, South | 99720 |
Hungary | 93028 |
Portugal | 92090 |
Jordan | 89342 |
Azerbaijan | 86600 |
Austria | 83871 |
United Arab Emirates | 83600 |
Czech Republic | 78867 |
Serbia | 77474 |
Panama | 75420 |
Sierra Leone | 71740 |
Ireland | 70273 |
Georgia | 69700 |
Sri Lanka | 65610 |
Lithuania | 65300 |
Latvia | 64589 |
Svalbard | 62045 |
Togo | 56785 |
Croatia | 56594 |
British Indian Ocean Territory | 54400 |
Bosnia and Herzegovina | 51197 |
Costa Rica | 51100 |
Slovakia | 49035 |
Dominican Republic | 48670 |
Estonia | 45228 |
Denmark | 43094 |
Netherlands | 41543 |
Switzerland | 41277 |
Bhutan | 38394 |
Guinea-Bissau | 36125 |
Taiwan | 35980 |
Moldova | 33851 |
Belgium | 30528 |
Lesotho | 30355 |
Armenia | 29743 |
Solomon Islands | 28896 |
Albania | 28748 |
Equatorial Guinea | 28051 |
Burundi | 27830 |
Haiti | 27750 |
Rwanda | 26338 |
Macedonia | 25713 |
Djibouti | 23200 |
Belize | 22966 |
El Salvador | 21041 |
Israel | 20770 |
Slovenia | 20273 |
New Caledonia | 18575 |
Fiji | 18274 |
Kuwait | 17818 |
Swaziland | 17364 |
Timor-Leste | 14874 |
Bahamas, The | 13880 |
Montenegro | 13812 |
Puerto Rico | 13791 |
Vanuatu | 12189 |
Falkland Islands (Islas Malvinas) | 12173 |
Qatar | 11586 |
Gambia, The | 11300 |
Jamaica | 10991 |
Kosovo | 10887 |
Lebanon | 10400 |
Cyprus | 9251 |
West Bank | 5860 |
Brunei | 5765 |
Trinidad and Tobago | 5128 |
French Polynesia | 4167 |
Cabo Verde | 4033 |
South Georgia and South Sandwich Islands | 3903 |
Samoa | 2831 |
Luxembourg | 2586 |
Comoros | 2235 |
Mauritius | 2040 |
Virgin Islands | 1910 |
Faroe Islands | 1393 |
Hong Kong | 1108 |
Sao Tome and Principe | 964 |
Turks and Caicos Islands | 948 |
Kiribati | 811 |
Bahrain | 760 |
Dominica | 751 |
Tonga | 747 |
Micronesia, Federated States of | 702 |
Singapore | 697 |
Saint Lucia | 616 |
Isle of Man | 572 |
Guam | 544 |
Andorra | 468 |
Northern Mariana Islands | 464 |
Palau | 459 |
Seychelles | 455 |
Curacao | 444 |
Antigua and Barbuda | 442 |
Barbados | 430 |
Heard Island and McDonald Islands | 412 |
Saint Vincent and the Grenadines | 389 |
Jan Mayen | 377 |
Gaza Strip | 360 |
Grenada | 344 |
Malta | 316 |
Saint Helena, Ascension, and Tristan da Cunha | 308 |
Maldives | 298 |
Cayman Islands | 264 |
Saint Kitts and Nevis | 261 |
Niue | 260 |
Saint Pierre and Miquelon | 242 |
Cook Islands | 236 |
American Samoa | 199 |
Marshall Islands | 181 |
Aruba | 180 |
Liechtenstein | 160 |
British Virgin Islands | 151 |
Wallis and Futuna | 142 |
Christmas Island | 135 |
Dhekelia | 130 |
Akrotiri | 123 |
Jersey | 116 |
Montserrat | 102 |
Anguilla | 91 |
Guernsey | 78 |
San Marino | 61 |
Saint Martin | 54 |
Bermuda | 54 |
Bouvet Island | 49 |
Pitcairn Islands | 47 |
Norfolk Island | 36 |
Sint Maarten | 34 |
Macau | 28 |
Tuvalu | 26 |
Nauru | 21 |
Cocos (Keeling) Islands | 14 |
Tokelau | 12 |
Paracel Islands | 7 |
Clipperton Island | 6 |
Gibraltar | 6 |
Wake Island | 6 |
Ashmore and Cartier Islands | 5 |
Navassa Island | 5 |
Spratly Islands | 5 |
Coral Sea Islands | 3 |
Monaco | 2 |
Holy See (Vatican City) | 0 |
It looks like most of the countries have below average area. Checking population of all the countries which have below average area
%%sql
select name, population, area from facts
where
area < (select avg(area) from facts)
order by population desc
Limit 30;
* sqlite:///factbook.db Done.
name | population | area |
---|---|---|
Bangladesh | 168957745 | 148460 |
Japan | 126919659 | 377915 |
Philippines | 100998376 | 300000 |
Vietnam | 94348835 | 331210 |
Germany | 80854408 | 357022 |
Thailand | 67976405 | 513120 |
United Kingdom | 64088222 | 243610 |
Italy | 61855120 | 301340 |
Korea, South | 49115196 | 99720 |
Spain | 48146134 | 505370 |
Poland | 38562189 | 312685 |
Uganda | 37101745 | 241038 |
Iraq | 37056169 | 438317 |
Morocco | 33322699 | 446550 |
Nepal | 31551305 | 147181 |
Malaysia | 30513848 | 329847 |
Uzbekistan | 29199942 | 447400 |
Yemen | 26737317 | 527968 |
Ghana | 26327649 | 238533 |
Korea, North | 24983205 | 120538 |
Cameroon | 23739218 | 475440 |
Taiwan | 23415126 | 35980 |
Cote d'Ivoire | 23295302 | 322463 |
Sri Lanka | 22053488 | 65610 |
Romania | 21666350 | 238391 |
Burkina Faso | 18931686 | 274200 |
Malawi | 17964697 | 118484 |
Syria | 17064854 | 185180 |
Netherlands | 16947904 | 41543 |
Ecuador | 15868396 | 283561 |
%%sql
select name, population, area from facts
where population > (select avg(population) from facts)
and
area < (select avg(area) from facts)
order by Population desc, area asc;
* sqlite:///factbook.db Done.
name | population | area |
---|---|---|
Bangladesh | 168957745 | 148460 |
Japan | 126919659 | 377915 |
Philippines | 100998376 | 300000 |
Vietnam | 94348835 | 331210 |
Germany | 80854408 | 357022 |
Thailand | 67976405 | 513120 |
United Kingdom | 64088222 | 243610 |
Bangladesh, Germany, Japan, Philippines, Thailand, United Kingdom and Vietnam have below average area and above average population which indicates that these countries have very high population density. Bangladesh is the smallest country with above average population.
Now let's check the countries with above average population and above average area
%%sql
select name, population, area from facts
where population > (select round(avg(population),3) from facts
where name !='World')
and area > (select round(avg(area),2) from facts)
order by population desc;
* sqlite:///factbook.db Done.
name | population | area |
---|---|---|
China | 1367485388 | 9596960 |
India | 1251695584 | 3287263 |
European Union | 513949445 | 4324782 |
United States | 321368864 | 9826675 |
Indonesia | 255993674 | 1904569 |
Brazil | 204259812 | 8515770 |
Pakistan | 199085847 | 796095 |
Nigeria | 181562056 | 923768 |
Russia | 142423773 | 17098242 |
Mexico | 121736809 | 1964375 |
Ethiopia | 99465819 | 1104300 |
Egypt | 88487396 | 1001450 |
Iran | 81824270 | 1648195 |
Turkey | 79414269 | 783562 |
Congo, Democratic Republic of the | 79375136 | 2344858 |
France | 66553766 | 643801 |
Burma | 56320206 | 676578 |
South Africa | 53675563 | 1219090 |
Tanzania | 51045882 | 947300 |
Colombia | 46736728 | 1138910 |
Kenya | 45925301 | 580367 |
Ukraine | 44429471 | 603550 |
Argentina | 43431886 | 2780400 |
Algeria | 39542166 | 2381741 |
Sudan | 36108853 | 1861484 |
Canada | 35099836 | 9984670 |
Afghanistan | 32564342 | 652230 |
China, India, European Union, United States, Indonesia, Brazil, Pakistan, Nigeria, Russia, Mexico, Ethiopia, Egypt, Iran, Turkey Democratic Republic of the Congo, France, Burma, South Africa, Tanzania, Colombia, Kenya, Ukraine, Argentina, Algeria, Sudan, Canada and Afghanistan have above average population as well as above average area. From the above list it looks like European Union is also considered as country.
The European Union is not a state, but a unique partnership between European countries, known as Member States. Together they cover much of the European continent. The EU is home to over 446 million people, which corresponds to around 6 % of the world's population. [Click](https://en.wikipedia.org/wiki/European_Union) to read more.
It's only fair to remove European Union to check the countries with above average population and above average area.
%%sql
select name, population, area from facts
where population > (select round(avg(population),3) from facts
where name !='World')
and area > (select round(avg(area),2) from facts)
and name != 'European Union'
order by population desc;
* sqlite:///factbook.db Done.
name | population | area |
---|---|---|
China | 1367485388 | 9596960 |
India | 1251695584 | 3287263 |
United States | 321368864 | 9826675 |
Indonesia | 255993674 | 1904569 |
Brazil | 204259812 | 8515770 |
Pakistan | 199085847 | 796095 |
Nigeria | 181562056 | 923768 |
Russia | 142423773 | 17098242 |
Mexico | 121736809 | 1964375 |
Ethiopia | 99465819 | 1104300 |
Egypt | 88487396 | 1001450 |
Iran | 81824270 | 1648195 |
Turkey | 79414269 | 783562 |
Congo, Democratic Republic of the | 79375136 | 2344858 |
France | 66553766 | 643801 |
Burma | 56320206 | 676578 |
South Africa | 53675563 | 1219090 |
Tanzania | 51045882 | 947300 |
Colombia | 46736728 | 1138910 |
Kenya | 45925301 | 580367 |
Ukraine | 44429471 | 603550 |
Argentina | 43431886 | 2780400 |
Algeria | 39542166 | 2381741 |
Sudan | 36108853 | 1861484 |
Canada | 35099836 | 9984670 |
Afghanistan | 32564342 | 652230 |
I think it will be interesting to check all the countries with below average population and above average area and I will exclude World from population and European Union from area for fair results
%%sql
select name, population, area from facts
where population < (select avg(population) from facts)
and area > (select avg(area) from facts where name NOT IN ('World','European Union'))
order by population asc, area desc;
* sqlite:///factbook.db Done.
name | population | area |
---|---|---|
Greenland | 57733 | 2166086 |
Botswana | 2182719 | 581730 |
Namibia | 2212307 | 824292 |
Mongolia | 2992908 | 1564116 |
Mauritania | 3596702 | 1030700 |
Central African Republic | 5391539 | 622984 |
Libya | 6411776 | 1759540 |
Somalia | 10616380 | 637657 |
Bolivia | 10800882 | 1098581 |
South Sudan | 12042910 | 644329 |
Zambia | 15066266 | 752618 |
Mali | 16955536 | 1240192 |
Chile | 17508260 | 756102 |
Kazakhstan | 18157122 | 2724900 |
Angola | 19625353 | 1246700 |
Australia | 22751014 | 7741220 |
Madagascar | 23812681 | 587041 |
Mozambique | 25303113 | 799380 |
Saudi Arabia | 27752316 | 2149690 |
Venezuela | 29275460 | 912050 |
Peru | 30444999 | 1285216 |
Afghanistan | 32564342 | 652230 |
Canada | 35099836 | 9984670 |
Sudan | 36108853 | 1861484 |
Algeria | 39542166 | 2381741 |
Argentina | 43431886 | 2780400 |
Ukraine | 44429471 | 603550 |
Kenya | 45925301 | 580367 |
Colombia | 46736728 | 1138910 |
Tanzania | 51045882 | 947300 |
South Africa | 53675563 | 1219090 |
Burma | 56320206 | 676578 |
Burma, South Africa, Tanzania, Colombia, Kenya, Ukraine, Argentina, Algeria, Sudan, Canada, Afghanistan, Peru, Venezuela, Saudi Arabia, Mozambique, Madagascar, Australia, Angola, Kazakhstan, Chile, Mali, Zambia, South Sudan, Bolivia, Somalia, Libya, Central African Republic, Mauritania, Mongolia, Namibia, Botswana, Greenland have above average area and below average population. Amongst the above countries, Greenland is the largest and has the least population.
We have already checked the most populous country in the begining. China is the most populated country. Checking country with most Population growth
%%sql
select name, population from facts
order by population asc
limit 20;
* sqlite:///factbook.db Done.
name | population |
---|---|
Ashmore and Cartier Islands | None |
Coral Sea Islands | None |
Heard Island and McDonald Islands | None |
Clipperton Island | None |
French Southern and Antarctic Lands | None |
Bouvet Island | None |
Jan Mayen | None |
British Indian Ocean Territory | None |
South Georgia and South Sandwich Islands | None |
Navassa Island | None |
Wake Island | None |
United States Pacific Island Wildlife Refuges | None |
Paracel Islands | None |
Spratly Islands | None |
Arctic Ocean | None |
Atlantic Ocean | None |
Indian Ocean | None |
Pacific Ocean | None |
Southern Ocean | None |
Antarctica | 0 |
There are many countries that show population as 'None' and it can be due to various reasons. On closer look names listed in above list are mostly Oceans or small islands. For better analysis, we will not consider Oceans or any island with population value as none.
%%sql
select name, population, area_land
from facts where
name Not Like '%ocean%'
and population != 'None'
order by population asc, area asc
Limit 20;
* sqlite:///factbook.db Done.
name | population | area_land |
---|---|---|
Antarctica | 0 | 280000 |
Pitcairn Islands | 48 | 47 |
Cocos (Keeling) Islands | 596 | 14 |
Holy See (Vatican City) | 842 | 0 |
Niue | 1190 | 260 |
Tokelau | 1337 | 12 |
Christmas Island | 1530 | 135 |
Svalbard | 1872 | 62045 |
Norfolk Island | 2210 | 36 |
Falkland Islands (Islas Malvinas) | 3361 | 12173 |
Montserrat | 5241 | 102 |
Saint Pierre and Miquelon | 5657 | 242 |
Saint Barthelemy | 7237 | None |
Saint Helena, Ascension, and Tristan da Cunha | 7795 | 122 |
Nauru | 9540 | 21 |
Cook Islands | 9838 | 236 |
Tuvalu | 10869 | 26 |
Wallis and Futuna | 15613 | 142 |
Akrotiri | 15700 | None |
Dhekelia | 15700 | None |
The above list sheds light on countries with lowest population. Pitcairn Islands has population of only 48. Few other interesting observations:
Checking Average Population growth
%%sql
select round(avg(population_growth),3)
from facts
where name != 'World';
* sqlite:///factbook.db Done.
round(avg(population_growth),3) |
---|
1.201 |
The average population growth is 1.201
%%sql
select name,max(population_growth) as 'Population growth' from facts;
* sqlite:///factbook.db Done.
name | Population growth |
---|---|
South Sudan | 4.02 |
%%sql
select name, population_growth
from facts
where name !='World'
order by population_growth desc
limit 20;
* sqlite:///factbook.db Done.
name | population_growth |
---|---|
South Sudan | 4.02 |
Malawi | 3.32 |
Burundi | 3.28 |
Niger | 3.25 |
Uganda | 3.24 |
Qatar | 3.07 |
Burkina Faso | 3.03 |
Mali | 2.98 |
Cook Islands | 2.95 |
Iraq | 2.93 |
Ethiopia | 2.89 |
Zambia | 2.88 |
Western Sahara | 2.82 |
Gaza Strip | 2.81 |
Tanzania | 2.79 |
Angola | 2.78 |
Benin | 2.78 |
Togo | 2.69 |
Guinea | 2.63 |
Cameroon | 2.59 |
It is very surprising to see South Sudan has highest population growth. We will try to check few ratios first being Water area to land area ratio.
%%sql
select name, area_land, area_water,
Round(cast(area_water as float)/cast(area_land as float),3) as 'Ratio'
from facts
where Ratio >= 1;
* sqlite:///factbook.db Done.
name | area_land | area_water | Ratio |
---|---|---|---|
British Indian Ocean Territory | 60 | 54340 | 905.667 |
Virgin Islands | 346 | 1564 | 4.52 |
British Indian Ocean Territory and Virgin Islands are the only two countries with water to land ratio more than 1 . British Indian Ocean Territory stands out with staggering 905.7 water to land ratio.
Now lets move on to checking countries with high death rate.
%%sql
select name, death_rate, birth_rate
from facts where
death_rate > birth_rate
and name != 'World'
order by death_rate desc;
* sqlite:///factbook.db Done.
name | death_rate | birth_rate |
---|---|---|
Ukraine | 14.46 | 10.72 |
Bulgaria | 14.44 | 8.92 |
Latvia | 14.31 | 10.0 |
Lithuania | 14.27 | 10.1 |
Russia | 13.69 | 11.6 |
Serbia | 13.66 | 9.08 |
Belarus | 13.36 | 10.7 |
Hungary | 12.73 | 9.16 |
Moldova | 12.59 | 12.0 |
Estonia | 12.4 | 10.51 |
Croatia | 12.18 | 9.45 |
Romania | 11.9 | 9.14 |
Germany | 11.42 | 8.47 |
Slovenia | 11.37 | 8.42 |
Greece | 11.09 | 8.66 |
Portugal | 11.02 | 9.27 |
Czech Republic | 10.34 | 9.63 |
Italy | 10.19 | 8.74 |
Poland | 10.19 | 9.74 |
Bosnia and Herzegovina | 9.75 | 8.87 |
Saint Pierre and Miquelon | 9.72 | 7.42 |
Japan | 9.51 | 7.93 |
Austria | 9.42 | 9.41 |
Monaco | 9.24 | 6.65 |
Wow!! it's surprising to see this list dominated by European countries. Only two "countries" are NOT part of Europe: "Saint Pierre and Miquelon" and Japan. Amazed! Read more about High death rate in Europe
Checking Average birth rate Average death rate
%%sql
select round(avg(birth_rate),3) as 'Average Birth rate',
round(avg(death_rate),3) as 'Average Death rate'
from facts
where name != 'World';
* sqlite:///factbook.db Done.
Average Birth rate | Average Death rate |
---|---|
19.332 | 7.821 |
%%sql
select name as country, birth_rate from facts
ORDER BY birth_rate DESC
Limit 20;
* sqlite:///factbook.db Done.
country | birth_rate |
---|---|
Niger | 45.45 |
Mali | 44.99 |
Uganda | 43.79 |
Zambia | 42.13 |
Burkina Faso | 42.03 |
Burundi | 42.01 |
Malawi | 41.56 |
Somalia | 40.45 |
Angola | 38.78 |
Mozambique | 38.58 |
Afghanistan | 38.57 |
Nigeria | 37.64 |
Ethiopia | 37.27 |
Sierra Leone | 37.03 |
South Sudan | 36.91 |
Chad | 36.6 |
Tanzania | 36.39 |
Cameroon | 36.17 |
Benin | 36.02 |
Congo, Republic of the | 35.85 |
%%sql
select name, round(population/area,3) as 'Population_Ratio'
from facts where name != 'World'
order by Population_Ratio desc
Limit 25;
* sqlite:///factbook.db Done.
name | Population_Ratio |
---|---|
Macau | 21168.0 |
Monaco | 15267.0 |
Singapore | 8141.0 |
Hong Kong | 6445.0 |
Gaza Strip | 5191.0 |
Gibraltar | 4876.0 |
Bahrain | 1771.0 |
Maldives | 1319.0 |
Malta | 1310.0 |
Bermuda | 1299.0 |
Sint Maarten | 1167.0 |
Bangladesh | 1138.0 |
Guernsey | 847.0 |
Jersey | 838.0 |
Barbados | 675.0 |
Mauritius | 656.0 |
Taiwan | 650.0 |
Aruba | 623.0 |
Lebanon | 594.0 |
Saint Martin | 588.0 |
San Marino | 541.0 |
Korea, South | 492.0 |
Rwanda | 480.0 |
West Bank | 475.0 |
Nauru | 454.0 |
Macau and Monaco are most densly populated countries.
The CIA World Factbook is a collection of demographic, geographical and other related facts of countries around the world. The data analysis led to many insights. Eventhough China and India are the most populous countries but they are not in the list of top 20 countries with highest population growth. There are many countries with huge land area but are sparsely populated.
Many African countries are projected to have a larger population over the coming years due to high birth rate. On the other hand, many European countries are suffering from a high death rate.