Week 6, November 6: Data, part III – Getting/Making Your Own Datasets
Pre-class reading
What we’ll cover
- Joining datasets with spreadsheets and database software.
- Introduction to web scraping.
- Freedom of Information requests, FOIA Machine, and MuckRock.
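As a rough illustration of the first topic, here is a minimal sketch of a join in database software, using Python's built-in sqlite3 module. The table names, column names, country names, and numbers are all made up for the example; they are not taken from any real dataset. A LEFT JOIN is used so that rows with no match in the second table stay visible instead of silently dropping out.

```python
import sqlite3

# Made-up illustrative rows: placeholder countries and invented numbers.
requests = [("Atlantis", 120), ("Freedonia", 30), ("Ruritania", 45)]
# Ruritania is deliberately left out to show what missing data looks like.
users = [("Atlantis", 4_000_000), ("Freedonia", 500_000)]


def join_requests_to_users(requests, users):
    """Join the two lists on country and compute a per-million-users rate."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE requests (country TEXT PRIMARY KEY, accounts_requested INTEGER)"
    )
    conn.execute("CREATE TABLE users (country TEXT PRIMARY KEY, user_count INTEGER)")
    conn.executemany("INSERT INTO requests VALUES (?, ?)", requests)
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)

    # LEFT JOIN keeps every row of `requests`; unmatched rows get NULL (None)
    # for user_count and for the computed rate.
    rows = conn.execute(
        """
        SELECT r.country,
               r.accounts_requested,
               u.user_count,
               1000000.0 * r.accounts_requested / u.user_count AS per_million_users
        FROM requests AS r
        LEFT JOIN users AS u ON u.country = r.country
        ORDER BY per_million_users DESC
        """
    ).fetchall()
    conn.close()
    return rows


for row in join_requests_to_users(requests, users):
    print(row)
```

In this sketch the country with no user count comes back with None in the last two columns, which is one way to spot gaps after a join. In a spreadsheet, a lookup function such as VLOOKUP plays a similar role.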
Post-class assignments
- Using either spreadsheets or database software, combine the Facebook government requests and users-by-country datasets. Which governments requested the most user accounts per Facebook user? Which countries are missing data? Are they important?
- What government agencies might have unpublished data relevant to your long-term project that you could FOIA? Pick one, and write a FOIA letter. (You don’t have to send it.)
- Find a website or online database that might be worth scraping for your long-term project. If you feel comfortable scraping it, do so. If not, write a “pseudocode” program that describes the steps you’d hypothetically take to scrape it.
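For the scraping assignment, one possible shape for such a program is sketched below, using only Python's standard library. It parses an HTML table into rows of text. The HTML is a hard-coded, made-up sample (the agency names and numbers are placeholders), and the class and function names are hypothetical. A real scraper would first download the page, for example with urllib.request.urlopen, and would need adjusting to that page's actual structure.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


# Made-up sample page standing in for a downloaded one.
SAMPLE_HTML = """
<table>
  <tr><th>Agency</th><th>Records</th></tr>
  <tr><td>Example Agency A</td><td>12</td></tr>
  <tr><td>Example Agency B</td><td>7</td></tr>
</table>
"""


def scrape_table(html):
    """Return the table in `html` as a list of rows of cell text."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


for row in scrape_table(SAMPLE_HTML):
    print(row)
```

The steps are the same ones a pseudocode version would list: get the page, find the repeating element that holds each record, pull out the fields, and save the rows. Before scraping a real site, it is worth checking its terms of use and whether the data is already offered as a download.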