For this project, we'll analyze data from CIA World Factbook, a website that provides information on the history, people and society, government, economy, energy, geography, communications, transportation, military, and transnational issues.
Connecting the notebook to the database file:
%%capture %load_ext sql %sql sqlite:///factbook.db
To return information about the tables in the database:
%%sql SELECT * FROM sqlite_master WHERE type='table';
|table||sqlite_sequence||sqlite_sequence||3||CREATE TABLE sqlite_sequence(name,seq)|
|table||facts||facts||47||CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float)|
We'll analyze the facts table, which has some of the following columns:
The first 5 rows from the facts table:
%%sql SELECT * FROM facts LIMIT 5;
The first few rows of the facts table give some geographical information of countries; like area_land, population, population growth,migration rate, etc.
Here, we'll draw some simple descriptions from the data, like:
%%sql SELECT MIN(population) AS FROM facts;
(sqlite3.OperationalError) near "FROM": syntax error [SQL: SELECT MIN(population) AS FROM facts;] (Background on this error at: http://sqlalche.me/e/13/e3q8)
To establish the countries with the least and highest population and population growths:
%%sql SELECT name AS Country_with_smallest_population, MIN(population) AS population FROM facts;
%%sql SELECT name AS Country_with_largest_population, MAX(population) AS population FROM facts;
To view the records of 'World' from the table:
%%sql SELECT * FROM facts WHERE name = "World";
This record gives the information of the whole world, and since 'World' in itself is not a country, we'll exclude the record from the preceeding analyses
Finishing off by running a single query that returns the minimum and maximum population and population growth(excluding world):
%%sql SELECT MIN(population) AS Smallest_Population, MAX(population) AS Largest_Population, MIN(population_growth) AS Least_Population_growth, MAX(population_growth) AS Highest_Population_growth FROM facts WHERE name NOT IN ("World");
Country with the largest population
To view the country with the largest population from the table:
%%sql SELECT name AS Country, MAX(population) AS Population, population_growth FROM facts WHERE name NOT IN ('World');
China has the largest population but also has a shockingly low population growth.
China's high population started after the World War II; fmilies were encouraged to have as many children as possible in hopes of bringing more money to the country, building a better army, and producing more food. However, the one-child policy was later instituted to minimise the population.
Country with highest population growth
%%sql SELECT name AS Country, population, MAX(population_growth) AS Population_growth FROM facts;
Countries with the smallest population growth
%%sql SELECT name AS Country, population, population_growth FROM facts WHERE population_growth = 0.0 ORDER BY population DESC;
|Holy See (Vatican City)||842||0.0|
|Cocos (Keeling) Islands||596||0.0|
We'll first get the average birth rate of all the countries, rounded off to 4 decimal places.
%%sql SELECT ROUND(AVG(birth_rate), 4) AS Average_Birth_Rate FROM facts;
Country with the highest birth rate
%%sql SELECT name AS Country, MAX(birth_rate) AS Birth_rate FROM facts;
Niger has a very high birth rate, which is even twice higher than the average birth rate of all the countries.
This high birth rate is attributed to by the people's desire to have large families, and partly by gender inequality which includes lack of educational opportunities for women, leading to early marriages and childbirth.
This information can be found here
Country with the lowest birth rate
%%sql SELECT name AS Country, MIN(birth_rate) AS Birth_Rate FROM facts;
The minimum , maximum and average death rates of the countries rounded off to 4 decimal places:
%%sql SELECT MIN(death_rate) AS Minimum_death_rate, MAX(death_rate) AS Maximum_death_rate, ROUND(AVG(death_rate), 4) AS Average_death_rate FROM facts;
Here, we'll find the country with the highest migration rate
%%sql SELECT name AS Country, MAX(migration_rate) AS Migration_rate FROM facts;
Most of Qatar's immigrants are people in search of employment.
Over 80% of Qatar's workforce is made up of immigrants.
Country with the most area under water
%%sql SELECT name Country, MAX(area_water) AS area_water FROM facts;
Average value for population and area
%%sql SELECT AVG(population), AVG(area) FROM facts;
To identify countries that are densely populated, we'll check the countries that have population above the average and area below the average:
%%sql SELECT name AS Country FROM facts WHERE population > 62094928.32231405 AND area < 555093.546184739;
Country with population density
To obtain the population density of a country, we divide the population by area.
We'll therefore check the top 10 countries with the highest population density.
%%sql SELECT name AS country, population/area AS population_density FROM facts WHERE name NOT IN ('World') ORDER BY population/area DESC LIMIT 10;
SQL makes it easy to interact with databases.
We've been able to draw some useful insights from the facts table, like the maximum and minimum population, population growth and birth rate, among others.
We've also managed to perform some simple calculations like getting the population densities of the countries and several averages with ease.