This course is assessed through four components, each with a different weight.
Coursework
Students are encouraged to contribute to the online discussion forum set up for the module. This contribution is assessed as an all-or-nothing 5% of the mark, which can be obtained by contributing meaningfully to the forum before the end of the first month of the course. Meaningful contributions include both questions and answers that demonstrate the student is committed to making the forum a more useful resource for the rest of the group.
Information provided on labs.
Information provided on labs.
(computational_essay)=
Here's the premise. You will take the role of a real-world data scientist tasked with exploring a dataset on the city of Toronto (Canada) and finding useful insights for a variety of decision-makers. It does not matter if you have never been to Toronto. In fact, this will help you focus on what you can learn about the city through the data, without the influence of prior knowledge. Furthermore, the assessment will not be marked on how much you know about Toronto but on how much you can show you have learned through analysing data.
A computational essay is an essay whose narrative is supported by code and computational results that are included in the essay itself. This piece of assessment is equivalent to 2,500 words. However, this is the overall weight. Since you will need to create not only English narrative but also code and figures, here are the requirements:
The assignment relies on two datasets provided below, and has two parts. Each of these pieces is explained in more detail below.
To complete the assignment, the following two datasets are provided. Below we show how you can download them and what they contain.
import geopandas, pandas
This dataset contains a set of polygons representing the official neighbourhoods, as well as socio-economic information attached to each neighbourhood.
You can read the main file by running:
neis = geopandas.read_file("https://darribas.org/gds_course/_downloads/a2bdb4c2a088e602c3bd6490ab1d26fa/toronto_socio-economic.gpkg")
neis.info()
    <class 'geopandas.geodataframe.GeoDataFrame'>
    RangeIndex: 140 entries, 0 to 139
    Data columns (total 24 columns):
     #   Column                Non-Null Count  Dtype
    ---  ------                --------------  -----
     0   _id                   140 non-null    int64
     1   AREA_NAME             140 non-null    object
     2   Shape__Area           140 non-null    float64
     3   neighbourhood_name    140 non-null    object
     4   population2016        140 non-null    float64
     5   population_sqkm       140 non-null    float64
     6   pop_0-14_yearsold     140 non-null    float64
     7   pop_15-24_yearsold    140 non-null    float64
     8   pop_25-54_yearsold    140 non-null    float64
     9   pop_55-64_yearsold    140 non-null    float64
     10  pop_65+_yearsold      140 non-null    float64
     11  pop_85+_yearsold      140 non-null    float64
     12  hh_median_income2015  140 non-null    float64
     13  canadian_citizens     140 non-null    float64
     14  deg_bachelor          140 non-null    float64
     15  deg_medics            140 non-null    float64
     16  deg_phd               140 non-null    float64
     17  employed              140 non-null    float64
     18  bedrooms_0            140 non-null    float64
     19  bedrooms_1            140 non-null    float64
     20  bedrooms_2            140 non-null    float64
     21  bedrooms_3            140 non-null    float64
     22  bedrooms_4+           140 non-null    float64
     23  geometry              140 non-null    geometry
    dtypes: float64(20), geometry(1), int64(1), object(2)
    memory usage: 26.4+ KB
You can find more information on each of the socio-economic variables in the variable list file:
pandas.read_csv("https://darribas.org/gds_course/_downloads/8944151f1b7df7b1f38b79b7a73eb2d0/toronto_socio-economic_vars.csv")
| | _id | name | Category | Topic | Data Source | Characteristic |
---|---|---|---|---|---|---|
0 | 3 | population2016 | Population | Population and dwellings | Census Profile 98-316-X2016001 | Population, 2016 |
1 | 8 | population_sqkm | Population | Population and dwellings | Census Profile 98-316-X2016001 | Population density per square kilometre |
2 | 10 | pop_0-14_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Children (0-14 years) |
3 | 11 | pop_15-24_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Youth (15-24 years) |
4 | 12 | pop_25-54_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Working Age (25-54 years) |
5 | 13 | pop_55-64_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Pre-retirement (55-64 years) |
6 | 14 | pop_65+_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Seniors (65+ years) |
7 | 15 | pop_85+_yearsold | Population | Age characteristics | Census Profile 98-316-X2016001 | Older Seniors (85+ years) |
8 | 1018 | hh_median_income2015 | Income | Income of households in 2015 | Census Profile 98-316-X2016001 | Total - Income statistics in 2015 for private ... |
9 | 1149 | canadian_citizens | Immigration and citizenship | Citizenship | Census Profile 98-316-X2016001 | Canadian citizens aged 18 and over |
10 | 1711 | deg_bachelor | Education | Highest certificate, diploma or degree | Census Profile 98-316-X2016001 | Bachelor's degree |
11 | 1713 | deg_medics | Education | Highest certificate, diploma or degree | Census Profile 98-316-X2016001 | Degree in medicine, dentistry, veterinar... |
12 | 1714 | deg_phd | Education | Highest certificate, diploma or degree | Census Profile 98-316-X2016001 | Earned doctorate |
13 | 1887 | employed | Labour | Labour force status | Census Profile 98-316-X2016001 | Employed |
14 | 1636 | bedrooms_0 | Housing | Household characteristics | Census Profile 98-316-X2016001 | No bedrooms |
15 | 1637 | bedrooms_1 | Housing | Household characteristics | Census Profile 98-316-X2016001 | 1 bedroom |
16 | 1638 | bedrooms_2 | Housing | Household characteristics | Census Profile 98-316-X2016001 | 2 bedrooms |
17 | 1639 | bedrooms_3 | Housing | Household characteristics | Census Profile 98-316-X2016001 | 3 bedrooms |
18 | 1641 | bedrooms_4+ | Housing | Household characteristics | Census Profile 98-316-X2016001 | 4 or more bedrooms |
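Several of the variables listed above are easier to compare across neighbourhoods once they are expressed relative to the total population. The sketch below illustrates the idea on a small made-up table that borrows the column names from `neis`; the three rows are invented for illustration and are not real Toronto figures, and it assumes both columns hold head counts rather than percentages.

```python
import pandas

# Toy stand-in for `neis`: three hypothetical neighbourhoods, NOT real data.
toy = pandas.DataFrame(
    {
        "neighbourhood_name": ["A", "B", "C"],
        "population2016": [10000.0, 20000.0, 5000.0],
        "pop_65+_yearsold": [1500.0, 2000.0, 1000.0],
    }
)

# Share of residents aged 65 or over (assumes both columns are head counts).
toy["share_65plus"] = toy["pop_65+_yearsold"] / toy["population2016"]

# Order neighbourhoods from the largest to the smallest share of seniors.
ranked = toy.sort_values("share_65plus", ascending=False)
print(ranked[["neighbourhood_name", "share_65plus"]])
```

The same two lines of arithmetic and sorting should carry over to `neis` itself, provided the count interpretation holds for the columns you pick.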
This is a similar dataset to the Tokyo photographs we use in Block H but for the city of Toronto. It is a subsample of the 100 million Yahoo dataset that contains the location of photographs contributed to the Flickr service by its users. You can read it with:
photos = pandas.read_csv("https://darribas.org/gds_course/_downloads/fc771c3b1b9e0ee00e875bb2d293adcd/toronto_flickr_subset.csv")
photos.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 2000 entries, 0 to 1999
    Data columns (total 11 columns):
     #   Column                Non-Null Count  Dtype
    ---  ------                --------------  -----
     0   id                    2000 non-null   int64
     1   user_id               2000 non-null   object
     2   user_nickname         2000 non-null   object
     3   date_taken            2000 non-null   object
     4   date_uploaded         2000 non-null   int64
     5   title                 1932 non-null   object
     6   longitude             2000 non-null   float64
     7   latitude              2000 non-null   float64
     8   accuracy_coordinates  2000 non-null   float64
     9   page_url              2000 non-null   object
     10  video_url             2000 non-null   object
    dtypes: float64(3), int64(2), object(6)
    memory usage: 172.0+ KB
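The summary above shows that `date_taken` is stored as text (`object`), `date_uploaded` as an integer, and that `title` has 1932 non-null values out of 2000 rows. The sketch below shows one way to handle columns like these before any temporal analysis. It runs on a made-up three-row table rather than the real file, and it rests on two assumptions that are worth checking against the actual data: that `date_taken` uses a `YYYY-MM-DD HH:MM:SS` style string, and that `date_uploaded` is a Unix timestamp in seconds.

```python
import pandas

# Toy stand-in for `photos`: values and date formats are assumptions.
toy_photos = pandas.DataFrame(
    {
        "date_taken": [
            "2012-05-01 10:30:00",
            "2013-07-15 18:05:00",
            "2012-05-20 09:00:00",
        ],
        "date_uploaded": [1335900000, 1373950000, 1337600000],
        "title": ["CN Tower", None, "Harbourfront"],
    }
)

# Parse the text column into proper timestamps.
taken = pandas.to_datetime(toy_photos["date_taken"])

# Assumption: the integer column counts seconds since 1970-01-01 (UTC).
uploaded = pandas.to_datetime(toy_photos["date_uploaded"], unit="s")

# Number of photographs taken in each calendar year.
photos_per_year = taken.dt.year.value_counts().sort_index()

# Number of rows with no title recorded.
missing_titles = toy_photos["title"].isna().sum()

print(photos_per_year)
print("Missing titles:", missing_titles)
```

If the converted upload dates look implausible on the real data (for example, all in 1970), the unit assumption is probably wrong and should be revisited.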
IMPORTANT - Students of ENVS563 will need to source at least two additional datasets relating to Toronto. You can use any dataset that will help you complete the tasks below but, if you need some inspiration, have a look at the Toronto Open Data Portal.
This is the one everyone has to do in the same way. Complete the following tasks:
For this one, you need to pick one of the following three options. Only one, and make the most of it.
Create a geodemographic classification and interpret the results. In the process, answer the following questions:
Create a regionalisation and interpret the results. In the process, answer at least the following questions:
Using the photographs, complete the following tasks:
This course follows the standard marking criteria (the general ones and those relating to GIS assignments in particular) set by the School of Environmental Sciences. In addition to these generic criteria, the following specific criteria relating to the code provided will be used: