As explained in the Before week 1 notebook, each week of this class is a Jupyter notebook like this one. In order to follow the class, you simply start reading from the top, following the instructions.
Hint: And you can ask me or your amazing TA for help at any point if you get stuck!
This first lecture will go over a few different topics to get you started:
Part 1: You picked this course in Computational Social Science but... What does that even mean?? The first thing we will do today is to learn a bit more about it by reading some chapters of the book and listening to a short lecture by me.
Part 2: In the second part of this class, I will introduce you to a topic we will work on for the rest of this course and I will ask you to reflect upon it.
Part 3: In the final part of this class, we will start working on something hands-on. We will use web scraping to gather some data.
What is Computational Social Science? In the video below, I will give you some answers. There will be a little bit of history, examples of topics and datasets, an overview of the methods, and some reflections on the challenges faced by researchers, including in relation to Ethics and Privacy.
Video lecture: Watch the video below about Computational Social Science
from IPython.display import YouTubeVideo
YouTubeVideo("3dA1GYdSg-A",width=600, height=337.5)
In this course, we are going to read some parts of the amazing book by Matthew Salganik "Bit By Bit: Social Research in the Digital Age". Salganik is a professor in Sociology at Princeton, and an active researcher in Computational Social Science. You can read the book online, but I encourage you to buy it.
Reading: Bit by Bit, chapter 1. Start by reading the Introduction of the book, where you will get an understanding of the history of the field and the general framework.
Reading 2: Bit by Bit, chapter 6. Read the Ethics chapter of the book. Here, I don't expect you to read all the details. However, I want to make sure you get an overall understanding of the ethical challenges and some of the approaches that are used in the field to deal with these complex issues. You can focus on sections 6.4 and 6.6.
Optional Reading: Computational Social Science. This is a crucial article written by some of the pioneers in Computational Social Science. The paper came out in 2009 in the prestigious scientific journal Science. People had already been working on using large-scale data and new computational tools to study society and behaviour, but the article is the first to acknowledge and describe this emerging field. I encourage you to read it or skim through it!
Exercise 1 : Topics in Computational Social Science
By now, you should have a grasp of what Computational Social Science is about.
- Work in pairs. Based on what you know so far, come up with three social science topics that you think would be interesting and feasible to study using computational social science methods.
- Do you have an idea of the data that you could use for your research? Come up with one dataset for each of the topics you have identified (if you have any doubt you can talk to me or your TA).
- One person per pair: go to DTU Learn and fill in the Survey "Topics in Computational Social Science"
All right, as I promised, this course will be very hands-on. We will learn some of the methods and modelling approaches used in Computational Social Science and we will put this learning into practice. We will do so by applying the methods we learn to one specific topic throughout the rest of this course.
In this video, I will explain to you what the whole project will be about. Ready? Watch the video below!
Video lecture: Watch the video below
from IPython.display import YouTubeVideo
YouTubeVideo("uEJltY5Pv1U",width=600, height=337.5)
Exercise 2 : Understanding the field of Computational Social Science in a data-driven way
In the rest of this course, we will do a meta-study on the field of Computational Social Science. We will start by gathering data on Computational Social Science researchers, their interactions, and their scientific production.
- Work in pairs. Discuss possible data sources that you could use to address our research question.
- Which dataset(s) would you collect?
- How would you practically collect it?
- What are some limitations of your data?
- One person per pair: go to DTU Learn and fill in the Survey "Data for Computational Social Science"
Remember that in this exercise, there is not a single correct answer. There could be multiple ways to gather data, and different data sources could shed light on different aspects of scientists' work and interactions.
In the final part of this class, we will talk about web-scraping. For web-scraping, you need a little bit of knowledge about the structure of web-pages. The standard way to write web-pages is to use a language called HTML.
Useful tutorial: If you are not familiar with HTML, I recommend reading this tutorial.
Useful resource: HTML pages are built in a hierarchical structure and are composed of elements such as tables, titles, paragraphs, sections, etc. A complete list of HTML elements can be found here.
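To see what "hierarchical structure" means in practice, here is a minimal sketch that prints the nesting of the elements in a tiny page. The page content is made up for illustration, and the `OutlineParser` class is a hypothetical helper built on Python's standard `html.parser` module. It assumes every tag is explicitly closed, which is not always true for real pages.

```python
from html.parser import HTMLParser

# A tiny, made-up page used only for illustration.
SAMPLE_HTML = """
<html>
  <body>
    <h1>Programme</h1>
    <ul>
      <li>First talk</li>
      <li>Second talk</li>
    </ul>
  </body>
</html>
"""


class OutlineParser(HTMLParser):
    """Record each opening tag together with its nesting depth.

    Assumption: every element has an explicit closing tag.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag) pairs

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1  # children of this element sit one level deeper

    def handle_endtag(self, tag):
        self.depth -= 1  # leaving the element, go back up one level


parser = OutlineParser()
parser.feed(SAMPLE_HTML)

# Print the elements as an indented tree.
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

The indentation in the printed output mirrors the tree you see when you inspect a page in your browser: the `li` elements sit inside `ul`, which sits inside `body`, which sits inside `html`.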
All right, so now it's really time to start working on something a bit hands-on. The first thing we need to do is indeed to get DATA. As I said a few times by now, in this class we will do things from scratch. One of the ways to gather data from the web is to use web-scraping, which basically means getting information directly from web-pages. In the video below, I will give a brief overview of how to scrape web pages.
Video lecture: Watch the video below about web scraping (you can find the notebook that I use in the video here).
from IPython.display import YouTubeVideo
YouTubeVideo("nK_d0UQp4cE",width=600, height=337.5)
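As a complement to the video, here is a minimal sketch of the core idea of web-scraping: take the HTML of a page and pull out the text of the elements you care about. It is not the approach from the video, which may well use different tools. This version relies only on Python's standard library and parses an inline, made-up HTML snippet instead of downloading a live page. The `ListItemParser` class and all names in the sample are hypothetical.

```python
from html.parser import HTMLParser

# Made-up snippet standing in for a downloaded page (placeholder names and titles).
SAMPLE_HTML = """
<ul>
  <li>Jane Doe, John Smith. <em>A made-up talk title</em></li>
  <li>Alex Example. <em>Another made-up title</em></li>
</ul>
"""


class ListItemParser(HTMLParser):
    """Collect the text inside every <li> element (nested lists are not handled)."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []    # one cleaned-up string per <li>
        self._chunks = []  # text pieces of the <li> currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_item = True
            self._chunks = []

    def handle_data(self, data):
        if self.in_item:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self.in_item:
            # Join the pieces and collapse repeated whitespace.
            text = " ".join("".join(self._chunks).split())
            self.items.append(text)
            self.in_item = False


parser = ListItemParser()
parser.feed(SAMPLE_HTML)

for item in parser.items:
    print(item)
```

For a real page you would first download the HTML (for example with `urllib.request.urlopen(url).read().decode()`) and then feed it to the parser. Real pages are messier than this snippet, so expect to adapt the logic after inspecting the page.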
Exercise 3 : Web-scraping the list of participants in the International Conference on Computational Social Science
It's time to put things into practice. Remember that our goal will be to gather a dataset describing Computational Social Scientists and their work. As we have discussed, the field of Computational Social Science is loosely defined. To gather data, we will start from the list of researchers who took part in the most important scientific conference in Computational Social Science in 2019, 2020 and 2021. The conference is called the International Conference on Computational Social Science (IC2S2 for short). The assumption here is that the scientists who contribute to this conference are working in Computational Social Science. You can find the programmes of the 2019, 2020 and 2021 editions of the conference at the links below.
- Inspect the HTML of the pages linked below and use web-scraping to get the set of participants in 2019, 2020, and 2021.
- How many unique researchers do you get in 2019? And in 2020? And in 2021? How many unique researchers in total?
- Create the set of unique researchers that joined at least one of these three conferences and store it into a file.
- Go to DTU Learn and fill in the Survey "Web Scraping"
- Optional: Also add authors from the 2022 edition (link below).
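Once you have one set of names per year, counting and storing the unique researchers could look roughly like the sketch below. The names are placeholders, not real scraping results, and the helper functions and the file name `ic2s2_researchers.txt` are hypothetical choices made for illustration.

```python
import tempfile
from pathlib import Path

# Placeholder data standing in for the sets you obtain by scraping.
participants_by_year = {
    2019: {"Jane Doe", "John Smith"},
    2020: {"John Smith", "Alex Example"},
    2021: {"Jane Doe", "Sam Sample"},
}


def unique_researchers(by_year):
    """Return the union of all per-year sets."""
    return set().union(*by_year.values())


def save_names(names, path):
    """Write one name per line, sorted so the file is reproducible."""
    Path(path).write_text("\n".join(sorted(names)) + "\n", encoding="utf-8")


def load_names(path):
    """Read the file back into a set of names."""
    return set(Path(path).read_text(encoding="utf-8").splitlines())


all_names = unique_researchers(participants_by_year)

# Number of unique researchers per year, then overall.
for year, names in sorted(participants_by_year.items()):
    print(year, len(names))
print("total", len(all_names))

# Hypothetical file name; written to the temp directory to keep the working folder clean.
out_path = Path(tempfile.gettempdir()) / "ic2s2_researchers.txt"
save_names(all_names, out_path)
```

Using sets makes duplicates within a year disappear automatically, but only exact duplicates: the same person written as "J. Doe" and "Jane Doe" would still count twice, so you may want to clean the names before building the sets.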
2019 edition
Oral presentations: https://2019.ic2s2.org/oral-presentations/
Poster presentations: https://2019.ic2s2.org/posters/
Hint: elements that represent items in lists are defined by the <li> tag.
2020 edition
Oral and Poster presentations: https://ic2s2.mit.edu/program
Hint: Here you should find an easy way to get the data by inspecting the page (right click on the table and click Inspect).
2021 edition
All presentations: https://easychair.org/smart-program/IC2S2-2021/talk_author_index.html
2022 edition
All presentations: https://boothuchicagocaai.wixsite.com/website-2/program
Here the data is in a PDF.
Hint: There are Python packages to read data from PDFs.
I hope you enjoyed today's class. It would be awesome if you could spend a few minutes to share your feedback.
Go to DTU Learn and fill in the Survey "Week 1 - Feedback".