The term population is often used to refer to the total number of humans currently living on Earth. Several factors affect population growth in different regions of the world, hence the huge differences in population between countries. The primary factors are birth rate, death rate, and migration. Together they account for how much a population is increasing or decreasing.
My aim is to provide the bureau with the required insights.
To achieve this goal we are going to use Python libraries such as pandas, numpy, and matplotlib.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# read_csv already returns a DataFrame, so no extra conversion is needed
world_pop_data = pd.read_csv("world_population.xls")
world_pop_data.head(10)
index | id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | af | Afghanistan | 652230.0 | 652230.0 | 0.0 | 32564342.0 | 2.32 | 38.57 | 13.89 | 1.51 |
1 | 2 | al | Albania | 28748.0 | 27398.0 | 1350.0 | 3029278.0 | 0.30 | 12.92 | 6.58 | 3.30 |
2 | 3 | ag | Algeria | 2381741.0 | 2381741.0 | 0.0 | 39542166.0 | 1.84 | 23.67 | 4.31 | 0.92 |
3 | 4 | an | Andorra | 468.0 | 468.0 | 0.0 | 85580.0 | 0.12 | 8.13 | 6.96 | 0.00 |
4 | 5 | ao | Angola | 1246700.0 | 1246700.0 | 0.0 | 19625353.0 | 2.78 | 38.78 | 11.49 | 0.46 |
5 | 6 | ac | Antigua and Barbuda | 442.0 | 442.0 | 0.0 | 92436.0 | 1.24 | 15.85 | 5.69 | 2.21 |
6 | 7 | ar | Argentina | 2780400.0 | 2736690.0 | 43710.0 | 43431886.0 | 0.93 | 16.64 | 7.33 | 0.00 |
7 | 8 | am | Armenia | 29743.0 | 28203.0 | 1540.0 | 3056382.0 | 0.15 | 13.61 | 9.34 | 5.80 |
8 | 9 | as | Australia | 7741220.0 | 7682300.0 | 58920.0 | 22751014.0 | 1.07 | 12.15 | 7.14 | 5.65 |
9 | 10 | au | Austria | 83871.0 | 82445.0 | 1426.0 | 8665550.0 | 0.55 | 9.41 | 9.42 | 5.56 |
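The introduction names birth rate, death rate, and migration as the drivers of population change. As a minimal sketch of how the first two relate to the `population_growth` column, the block below computes the natural increase (births minus deaths per 1,000 people, expressed as a percentage) for three rows copied from the table above. The column names `natural_increase_pct` and `gap_pct` are hypothetical, and reading the gap as the migration effect is an assumption about how the dataset was built, not something the file states.

```python
import pandas as pd

# Three rows copied from the table above (rates are per 1,000 people per year).
sample = pd.DataFrame({
    "name": ["Afghanistan", "Albania", "Algeria"],
    "birth_rate": [38.57, 12.92, 23.67],
    "death_rate": [13.89, 6.58, 4.31],
    "population_growth": [2.32, 0.30, 1.84],
})

# Natural increase in percent: (births - deaths) per 1,000, divided by 10.
# natural_increase_pct is a hypothetical column name, not part of the dataset.
sample["natural_increase_pct"] = (sample["birth_rate"] - sample["death_rate"]) / 10

# Difference between natural increase and the reported growth figure.
# Assumption: this gap is roughly the part explained by migration.
sample["gap_pct"] = sample["natural_increase_pct"] - sample["population_growth"]

print(sample[["name", "natural_increase_pct", "population_growth", "gap_pct"]].round(3))
```

For these three rows the gap is close to the `migration_rate` value divided by 10, which suggests the migration column stores a magnitude without a sign.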
Dropping a few rows and columns which are not helpful in my analysis
# Finding the original number of rows
print("Original number of rows",world_pop_data.shape[0])
# finding the original number of columns
print("Original number of columns:",world_pop_data.shape[1])
Original number of rows 261
Original number of columns: 11
#dropping the id column
del world_pop_data["id"]
world_pop_data
#confirming deletion of the id column
print("New number of columns:",world_pop_data.shape[1])
New number of columns: 10
# keeping only rows with a positive land area and a positive population
world_pop_data = world_pop_data[(world_pop_data["area_land"] > 0.0) & (world_pop_data["population"] > 0.0)]
# dropping rows where the population is null (already excluded by the comparison above; kept as an explicit check)
world_pop_data = world_pop_data[world_pop_data["population"].notnull()]
world_pop_data
index | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
0 | af | Afghanistan | 652230.0 | 652230.0 | 0.0 | 32564342.0 | 2.32 | 38.57 | 13.89 | 1.51 |
1 | al | Albania | 28748.0 | 27398.0 | 1350.0 | 3029278.0 | 0.30 | 12.92 | 6.58 | 3.30 |
2 | ag | Algeria | 2381741.0 | 2381741.0 | 0.0 | 39542166.0 | 1.84 | 23.67 | 4.31 | 0.92 |
3 | an | Andorra | 468.0 | 468.0 | 0.0 | 85580.0 | 0.12 | 8.13 | 6.96 | 0.00 |
4 | ao | Angola | 1246700.0 | 1246700.0 | 0.0 | 19625353.0 | 2.78 | 38.78 | 11.49 | 0.46 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
245 | rq | Puerto Rico | 13791.0 | 8870.0 | 4921.0 | 3598357.0 | 0.60 | 10.86 | 8.67 | 8.15 |
246 | vq | Virgin Islands | 1910.0 | 346.0 | 1564.0 | 103574.0 | 0.59 | 10.31 | 8.54 | 7.67 |
250 | gz | Gaza Strip | 360.0 | 360.0 | 0.0 | 1869055.0 | 2.81 | 31.11 | 3.04 | 0.00 |
253 | we | West Bank | 5860.0 | 5640.0 | 220.0 | 2785366.0 | 1.95 | 22.99 | 3.50 | 0.00 |
254 | wi | Western Sahara | 266000.0 | 266000.0 | 0.0 | 570866.0 | 2.82 | 30.24 | 8.34 | NaN |
232 rows × 10 columns
#dropping countries with more than two billion people
world_pop_data = world_pop_data[world_pop_data["population"] < 2000000000]
#confirming the new number of rows
world_pop_data.shape[0]
232
From the output above we can confirm that 21 countries had a zero entry in the population column. This may be the result of errors during data entry.
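One way to check how many rows a filter like the one above removes, and why, is to count zero and missing entries separately before filtering. The following is a minimal sketch on made-up rows (the `demo` frame and its values are hypothetical, not taken from the dataset).

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; these are not values from the dataset.
demo = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "population": [1200.0, 0.0, np.nan, 0.0],
    "area_land": [50.0, 10.0, 30.0, 0.0],
})

# Count zero and missing populations separately.
zero_population = int((demo["population"] == 0).sum())
missing_population = int(demo["population"].isna().sum())

# Same filter as in the notebook: positive land area and positive population.
# Comparisons with NaN are False, so missing values are dropped as well.
kept = demo[(demo["area_land"] > 0) & (demo["population"] > 0)]

print("zero:", zero_population, "missing:", missing_population, "kept:", len(kept))
```

Running the same two counts on `world_pop_data` before filtering would show how many of the dropped rows had a zero population and how many had a missing one.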
# finding the countries with the highest population
world_pop_data.sort_values("population",ascending = False,inplace = True)
country_pop = world_pop_data[["name","population"]].head(10)
country_pop
index | name | population |
---|---|---|
36 | China | 1.367485e+09 |
76 | India | 1.251696e+09 |
185 | United States | 3.213689e+08 |
77 | Indonesia | 2.559937e+08 |
23 | Brazil | 2.042598e+08 |
131 | Pakistan | 1.990858e+08 |
128 | Nigeria | 1.815621e+08 |
13 | Bangladesh | 1.689577e+08 |
142 | Russia | 1.424238e+08 |
84 | Japan | 1.269197e+08 |
# visualize the highest population
plt.bar(country_pop["name"],country_pop["population"])
plt.xticks(rotation=90)
plt.xlabel("countries")
plt.ylabel("population")
plt.title("bar graph of countries with the highest population")
plt.show()
From the data above, China is the most populated country in the world. China's population has continued to grow due to its large territory and continued modernisation, which has brought higher living standards and immigration as well as lower infant mortality rates.
The Pitcairn Islands have the lowest population in the world, with 48 people. The Pitcairn Islands group is a British Overseas Territory comprising the islands of Pitcairn, Henderson, Ducie and Oeno. Pitcairn, the only inhabited island, is a small volcanic outcrop in the South Pacific at latitude 25.04° south and longitude 130.06° west. Few people can live in this area because of its unfavourable climate and living conditions. The island also attracts many tourists thanks to its uniqueness and special features, but not many would consider residing there.
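The highest and lowest population lookups discussed above can also be done without sorting the whole frame in place. The sketch below uses pandas `nlargest` and `nsmallest` on a handful of name and population pairs copied from the outputs in this notebook; the frame name `pop` is hypothetical.

```python
import pandas as pd

# A few name/population pairs taken from the outputs in this notebook.
pop = pd.DataFrame({
    "name": ["India", "Pitcairn Islands", "China", "Japan", "United States"],
    "population": [1.251696e9, 48.0, 1.367485e9, 1.269197e8, 3.213689e8],
})

# nlargest/nsmallest return new frames and leave `pop` in its original order.
top3 = pop.nlargest(3, "population")
bottom1 = pop.nsmallest(1, "population")

print(top3)
print(bottom1)
```

Because the original frame is left untouched, later cells do not depend on the order in which earlier sorting cells were run.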
# calculating population density (people per square kilometer of total area)
world_pop_data["population_density"] = world_pop_data["population"] / world_pop_data["area"]
# selecting countries with a population of more than 45 million; .copy() avoids SettingWithCopyWarning
pop_over_45 = world_pop_data[world_pop_data["population"] > 45000000].copy()
# sorting by population density, lowest first
pop_over_45.sort_values("population_density", inplace=True)
pop_over_45[["name","population","population_density","area_land"]]
index | name | population | population_density | area_land |
---|---|---|---|---|
142 | Russia | 1.424238e+08 | 8.329732 | 16377742.0 |
23 | Brazil | 2.042598e+08 | 23.986065 | 8358140.0 |
185 | United States | 3.213689e+08 | 32.703724 | 9161966.0 |
39 | Congo, Democratic Republic of the | 7.937514e+07 | 33.850722 | 2267048.0 |
37 | Colombia | 4.673673e+07 | 41.036366 | 1038700.0 |
160 | South Africa | 5.367556e+07 | 44.029205 | 1214470.0 |
78 | Iran | 8.182427e+07 | 49.644775 | 1531595.0 |
171 | Tanzania | 5.104588e+07 | 53.885656 | 885800.0 |
113 | Mexico | 1.217368e+08 | 61.972286 | 1943945.0 |
87 | Kenya | 4.592530e+07 | 79.131482 | 569140.0 |
27 | Burma | 5.632021e+07 | 83.242739 | 653508.0 |
52 | Egypt | 8.848740e+07 | 88.359275 | 995450.0 |
162 | Spain | 4.814613e+07 | 95.269078 | 498980.0 |
178 | Turkey | 7.941427e+07 | 101.350332 | 769632.0 |
60 | France | 6.655377e+07 | 103.376301 | 640427.0 |
172 | Thailand | 6.797640e+07 | 132.476623 | 510890.0 |
77 | Indonesia | 2.559937e+08 | 134.410291 | 1811569.0 |
36 | China | 1.367485e+09 | 142.491517 | 9326410.0 |
128 | Nigeria | 1.815621e+08 | 196.545081 | 910768.0 |
82 | Italy | 6.185512e+07 | 205.266875 | 294140.0 |
64 | Germany | 8.085441e+07 | 226.468980 | 348672.0 |
131 | Pakistan | 1.990858e+08 | 250.078002 | 770875.0 |
184 | United Kingdom | 6.408822e+07 | 263.077140 | 241930.0 |
191 | Vietnam | 9.434884e+07 | 284.861070 | 310070.0 |
84 | Japan | 1.269197e+08 | 335.841814 | 364485.0 |
137 | Philippines | 1.009984e+08 | 336.661253 | 298170.0 |
76 | India | 1.251696e+09 | 380.771354 | 2973193.0 |
90 | Korea, South | 4.911520e+07 | 492.531047 | 96920.0 |
13 | Bangladesh | 1.689577e+08 | 1138.069143 | 130170.0 |
Bangladesh has the highest population density among countries with more than 45 million people, at 1,138 people per square kilometer of total area. Bangladesh lies in the lower part of the Indo-Gangetic belt. One of the main reasons for its high population is that it is a very fertile region. Secondly, it has one of the highest population growth rates. South Korea is the second most densely populated country in this group, followed by India and the Philippines.
#calculating most densely populated country
world_pop_data.sort_values("population_density",ascending = False,inplace = True)
country_pop = world_pop_data[["name","population","population_density","area_land"]].dropna()
country_pop.head()
index | name | population | population_density | area_land |
---|---|---|---|---|
204 | Macau | 592731.0 | 21168.964286 | 28.0 |
116 | Monaco | 30535.0 | 15267.500000 | 2.0 |
155 | Singapore | 5674472.0 | 8141.279770 | 687.0 |
203 | Hong Kong | 7141106.0 | 6445.041516 | 1073.0 |
250 | Gaza Strip | 1869055.0 | 5191.819444 | 360.0 |
Macau has the highest population density in the world, with 21,168 people per square kilometer, followed after a large gap by Monaco and then, after another large gap, by Singapore. In general, as we can see, these high values mostly belong to small countries and islands whose land area is below the average (553,017.2 km²). Their populations are also below the average (about 30.64 million people).
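The density column in this notebook divides population by the `area` column rather than by `area_land`, so the two can differ wherever a territory has inland water. The sketch below shows the difference for Macau and Singapore using figures from the notebook outputs. It assumes that `area` equals `area_land + area_water`, which matches the densities reported above, and `density_land` is a hypothetical extra column.

```python
import pandas as pd

# Figures taken from the notebook outputs.
places = pd.DataFrame({
    "name": ["Macau", "Singapore"],
    "population": [592731.0, 5674472.0],
    "area_land": [28.0, 687.0],
    "area_water": [0.0, 10.0],
})

# Assumption: total area is land plus water.
places["area"] = places["area_land"] + places["area_water"]

# Density as computed in the notebook (total area) versus land only.
places["density_total"] = places["population"] / places["area"]
places["density_land"] = places["population"] / places["area_land"]  # hypothetical column

print(places[["name", "density_total", "density_land"]].round(2))
```

For Macau the two figures coincide because no water area is recorded, while for Singapore the land-only density is slightly higher.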
grace=world_pop_data.sort_values('population_density',ascending=False)
most_dense_coun=world_pop_data.sort_values('population_density',ascending=False)
grace.describe()
statistic | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate | population_density |
---|---|---|---|---|---|---|---|---|---|
count | 2.300000e+02 | 2.320000e+02 | 229.000000 | 2.320000e+02 | 230.000000 | 223.000000 | 223.000000 | 219.000000 | 230.000000 |
mean | 5.661912e+05 | 5.530172e+05 | 19540.545852 | 3.064171e+07 | 1.189000 | 19.169238 | 7.808161 | 3.412283 | 428.456288 |
std | 1.782348e+06 | 1.698552e+06 | 91960.199798 | 1.265954e+08 | 0.880425 | 9.377402 | 2.906321 | 4.407241 | 1889.850615 |
min | 2.000000e+00 | 2.000000e+00 | 0.000000 | 4.800000e+01 | 0.000000 | 6.650000 | 1.530000 | 0.000000 | 0.026653 |
25% | 2.322750e+03 | 2.498250e+03 | 0.000000 | 3.435062e+05 | 0.430000 | 11.575000 | 5.875000 | 0.355000 | 32.990473 |
50% | 6.998650e+04 | 7.066000e+04 | 620.000000 | 5.219556e+06 | 1.040000 | 16.460000 | 7.420000 | 1.880000 | 83.892339 |
75% | 3.532665e+05 | 3.700755e+05 | 7200.000000 | 1.807358e+07 | 1.862500 | 24.260000 | 9.440000 | 4.945000 | 208.664451 |
max | 1.709824e+07 | 1.637774e+07 | 891163.000000 | 1.367485e+09 | 3.320000 | 45.450000 | 14.890000 | 22.390000 | 21168.964286 |
#finding the least densely populated countries
country_pop.tail()
index | name | population | population_density | area_land |
---|---|---|---|---|
117 | Mongolia | 2992908.0 | 1.913482 | 1553556.0 |
237 | Pitcairn Islands | 48.0 | 1.021277 | 47.0 |
231 | Falkland Islands (Islas Malvinas) | 3361.0 | 0.276103 | 12173.0 |
223 | Svalbard | 1872.0 | 0.030172 | 62045.0 |
206 | Greenland | 57733.0 | 0.026653 | 2166086.0 |
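# ratio of land area to water area; rows with zero water area give inf
grace['land_water'] = grace['area_land']/grace['area_water']
# rows with more land than water (ratio above 1)
more_land = grace[grace['land_water'] > 1]
more_land[['land_water','population','name','area_water','area_land']]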
index | land_water | population | name | area_water | area_land |
---|---|---|---|---|---|
204 | inf | 592731.0 | Macau | 0.0 | 28.0 |
116 | inf | 30535.0 | Monaco | 0.0 | 2.0 |
155 | 68.700000 | 5674472.0 | Singapore | 10.0 | 687.0 |
203 | 30.657143 | 7141106.0 | Hong Kong | 35.0 | 1073.0 |
250 | inf | 1869055.0 | Gaza Strip | 0.0 | 360.0 |
... | ... | ... | ... | ... | ... |
237 | inf | 48.0 | Pitcairn Islands | 0.0 | 47.0 |
231 | inf | 3361.0 | Falkland Islands (Islas Malvinas) | 0.0 | 12173.0 |
223 | inf | 1872.0 | Svalbard | 0.0 | 62045.0 |
127 | 4222.333333 | 18045729.0 | Niger | 300.0 | 1266700.0 |
34 | 50.774194 | 11631456.0 | Chad | 24800.0 | 1259200.0 |
228 rows × 5 columns
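The `inf` entries in the output above come from dividing by a water area of zero. If a missing value is preferred over `inf`, one possible approach (a sketch, not the method used in this notebook) is to replace zero denominators with NaN before dividing. The rows below are copied from the output above; the frame name `areas` is hypothetical.

```python
import numpy as np
import pandas as pd

# Rows copied from the output above.
areas = pd.DataFrame({
    "name": ["Macau", "Singapore", "Niger"],
    "area_land": [28.0, 687.0, 1266700.0],
    "area_water": [0.0, 10.0, 300.0],
})

# Replace a zero denominator with NaN so the ratio is missing rather than inf.
water = areas["area_water"].replace(0, np.nan)
areas["land_water"] = areas["area_land"] / water

# Keep only rows where the ratio is defined.
has_ratio = areas.dropna(subset=["land_water"])

print(has_ratio[["name", "land_water"]])
```

With this variant, a filter such as `land_water > 1` no longer selects territories that simply have no recorded water area.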
Birth rate is one of the primary factors affecting the population of countries across the world.
#sorting the birth rate of the world's population
world_pop_data.sort_values("birth_rate",inplace=True)
birth_df=world_pop_data[["name","birth_rate"]].head(10)
birth_df
index | name | birth_rate |
---|---|---|
116 | Monaco | 6.65 |
213 | Saint Pierre and Miquelon | 7.42 |
84 | Japan | 7.93 |
3 | Andorra | 8.13 |
90 | Korea, South | 8.19 |
155 | Singapore | 8.27 |
157 | Slovenia | 8.42 |
195 | Taiwan | 8.47 |
64 | Germany | 8.47 |
148 | San Marino | 8.63 |
# visualize countries birth rate
plt.bar(birth_df["name"],birth_df["birth_rate"])
plt.xticks(rotation=90)
plt.xlabel("countries")
plt.ylabel("birth_rate")
plt.title("bar graph of countries with the lowest birth rate")
plt.show()
Monaco has the lowest birth rate in the world, at 6.65 births per 1,000 people. This may be because the large majority of Monaco's population is urban and children have access to primary health care and education. Low birth rates are also observed in Asian countries such as Japan, South Korea, Singapore and Taiwan, at around 8 births per 1,000 people.
#determining the country with the highest birth rate
world_pop_data.sort_values("birth_rate",ascending=False,inplace=True)
birth_top=world_pop_data[["name","population","birth_rate"]].head(10)
birth_top
index | name | population | birth_rate |
---|---|---|---|
127 | Niger | 18045729.0 | 45.45 |
108 | Mali | 16955536.0 | 44.99 |
181 | Uganda | 37101745.0 | 43.79 |
193 | Zambia | 15066266.0 | 42.13 |
26 | Burkina Faso | 18931686.0 | 42.03 |
28 | Burundi | 10742276.0 | 42.01 |
105 | Malawi | 17964697.0 | 41.56 |
159 | Somalia | 10616380.0 | 40.45 |
4 | Angola | 19625353.0 | 38.78 |
120 | Mozambique | 25303113.0 | 38.58 |
# visualize countries birth rate
plt.bar(birth_top["name"],birth_top["birth_rate"])
plt.xticks(rotation=90)
plt.xlabel("countries")
plt.ylabel("birth_rate")
plt.title("bar graph of countries with the highest birth rate")
plt.show()
Niger has the highest birth rate, at about 45 births per 1,000 people. Most interestingly, 19 of the 20 countries with the highest birth rates are in Africa. The main cause of high birth rates in African countries is high fertility, which is driven by factors such as a high desired family size, low use of modern contraceptives and high levels of adolescent childbearing. Outside Africa, a high birth rate is also seen in Afghanistan, at about 39 births per 1,000 people, which is associated with complicated pregnancies, inaccessibility of primary health care services, an insufficient number of health workers, early marriages, insecurity, poverty and unemployment. The high birth rate in Afghanistan is counted as a major factor in the country's child and maternal mortality rates.
#determining the countries with the highest death rates
world_pop_data.sort_values("death_rate",ascending=False,inplace=True)
death_df=world_pop_data[["name","population","death_rate"]].head(10)
death_df
index | name | population | death_rate |
---|---|---|---|
97 | Lesotho | 1947701.0 | 14.89 |
182 | Ukraine | 44429471.0 | 14.46 |
25 | Bulgaria | 7186893.0 | 14.44 |
70 | Guinea-Bissau | 1726170.0 | 14.33 |
95 | Latvia | 1986705.0 | 14.31 |
34 | Chad | 11631456.0 | 14.28 |
101 | Lithuania | 2884433.0 | 14.27 |
121 | Namibia | 2212307.0 | 13.91 |
0 | Afghanistan | 32564342.0 | 13.89 |
33 | Central African Republic | 5391539.0 | 13.80 |
# visualize countries death rate
plt.bar(death_df["name"],death_df["death_rate"])
plt.xticks(rotation=90)
plt.xlabel("countries")
plt.ylabel("death_rate")
plt.title("bar graph of countries with the highest death rate")
plt.show()
Lesotho has the highest death rate, at about 14.9 deaths per 1,000 people every year. The high rate in Lesotho may result from several factors, including the effect of the AIDS epidemic and the tendency of parents to underreport child deaths. High death rates are also reported in Ukraine, where a contributing factor is high mortality among working-age males from preventable causes such as alcohol poisoning and smoking.
#country with the lowest death rate
world_pop_data.sort_values("death_rate",inplace=True)
death_low=world_pop_data[["name","population","death_rate"]].head(20)
death_low
index | name | population | death_rate |
---|---|---|---|
140 | Qatar | 2194817.0 | 1.53 |
183 | United Arab Emirates | 5779760.0 | 1.97 |
92 | Kuwait | 2788534.0 | 2.18 |
12 | Bahrain | 1346613.0 | 2.69 |
250 | Gaza Strip | 1869055.0 | 3.04 |
240 | Turks and Caicos Islands | 50280.0 | 3.10 |
150 | Saudi Arabia | 27752316.0 | 3.33 |
130 | Oman | 3286936.0 | 3.36 |
155 | Singapore | 5674472.0 | 3.43 |
253 | West Bank | 2785366.0 | 3.50 |
24 | Brunei | 429646.0 | 3.52 |
99 | Libya | 6411776.0 | 3.58 |
244 | Northern Mariana Islands | 52344.0 | 3.71 |
79 | Iraq | 37056169.0 | 3.77 |
85 | Jordan | 8117564.0 | 3.79 |
158 | Solomon Islands | 622469.0 | 3.85 |
107 | Maldives | 393253.0 | 3.89 |
169 | Syria | 17064854.0 | 4.00 |
188 | Vanuatu | 272264.0 | 4.09 |
110 | Marshall Islands | 72191.0 | 4.21 |
# visualize countries death rate
plt.bar(death_low["name"],death_low["death_rate"])
plt.xticks(rotation=90)
plt.xlabel("countries")
plt.ylabel("death_rate")
plt.title("bar graph of countries with the lowest death rate")
plt.show()
Qatar has the lowest death rate in the world, at about 1.5 deaths per 1,000 people every year. The lowest death rates are mostly found in Middle Eastern countries, which are characterized by a very high standard of living.
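The bar charts in this notebook all repeat the same steps: draw the bars, rotate the tick labels, and set the axis labels and title. As a sketch of how that could be factored out, the helper below (`plot_ranked_bar` is a hypothetical name, not part of the notebook) takes the axis label from the plotted column so the label cannot drift from the data. It selects the non-interactive `Agg` backend so it runs without a display; inside a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_ranked_bar(df, value_col, title, name_col="name"):
    """Draw one bar per row of df and return the Axes (hypothetical helper)."""
    fig, ax = plt.subplots()
    ax.bar(df[name_col], df[value_col])
    ax.tick_params(axis="x", labelrotation=90)  # vertical country names
    ax.set_xlabel("countries")
    ax.set_ylabel(value_col)  # the y label always matches the plotted column
    ax.set_title(title)
    fig.tight_layout()
    return ax


# Three rows copied from the lowest-death-rate table above.
death_low_demo = pd.DataFrame({
    "name": ["Qatar", "United Arab Emirates", "Kuwait"],
    "death_rate": [1.53, 1.97, 2.18],
})
ax = plot_ranked_bar(death_low_demo, "death_rate",
                     "bar graph of countries with the lowest death rate")
```

The same call would work for the birth rate and population frames by changing `value_col` and the title.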
In this project we have analysed various demographic and geographic statistics for all countries in the world. Below are our findings:
increase.