Schedule and Assignments (Spring 2020)

This schedule is subject to change. Please check back frequently.

Week Date Topics Instructor Due Dates Readings and Assignments
1 4/7 Course Overview and Introduction to the Process of Data Science Blase and Raul --
4/9 Data Acquisition and Standardization (descriptive statistics, assumptions about data formats, dirty data, missing data) Raul Reading Response 1 (4/9)
2 4/14 Data Analysis (inferential statistics, sampling, confidence intervals, hypothesis testing, pitfalls) Raul Reading Response 2 (4/13), Assignment 1 (4/15)
  • Programming Assignment 1 introduces basic data exploration, analysis, and recoding in Python.
4/16 Introduction to Machine Learning and Generalization (linear/multiple regression, goodness of fit, extrapolation, decisions) Raul --
3 4/21 Additional Methods for Machine Learning and Feature Engineering (feature engineering/model selection, classifiers, pitfalls) Raul Reading Response 3 (4/20), Assignment 2 (4/22)
  • Programming Assignment 2 introduces the difficulties of cleaning, joining, and generalizing data.
4/23 Fairness in Machine Learning and Artificial Intelligence Blase --
4 4/28 Experimental Design in Data Science (hypothesis testing, experimental design, research questions, multiple comparisons) Blase Reading Response 4 (4/27), Assignment 3 (4/29)
  • Programming Assignment 3 covers fairness issues in applied machine learning.
4/30 The Design of Respectful User Interfaces for Data (dark patterns, nudging, manipulative interfaces, respectful interfaces) Blase --
5 5/5 Responsible and Ethical Data Collection (ethics of human subjects, data subjects, sampling/recruitment, scraping) Blase Reading Response 5 (5/4), Assignment 4 (5/6)
  • Programming Assignment 4 requires students to create an interface exhibiting a dark pattern and compare its efficacy against a control interface in an online experiment.
5/7 Responsibly Visualizing and Communicating Data Raul --
6 5/12 Privacy Philosophy, Regulation, and Policy Blase Reading Response 6 (5/11), Assignment 5 (5/13)
  • Programming Assignment 5 explores how to visualize data in responsible and irresponsible ways.
5/14 Statistical Approaches to Privacy (k-anonymity, differential privacy) Raul --
7 5/19 Anonymization and Anonymity Blase Reading Response 7 (5/18), Assignment 6 (5/20)
  • Programming Assignment 6 explores the implementation and limits of differential privacy.
5/21 Conducting and Explaining Data Analysis Using Neural Networks Blase --
8 5/26 Responsible Data Lifecycles (the right of erasure, access rights, data portability, biomedical and genetic data, web personalization) Blase Reading Response 8 (5/25), Assignment 7 (5/27)
  • Programming Assignment 7 provides hands-on experience scraping data from the web and then using that information to deanonymize a data set.
5/28 Emerging Uses and Misuses of Data (data brokers, data sharing, future uses) Raul --
9 6/2 Should You Ask This Question? Should You Collect This Data? Raul Project Report (6/1), Reading Response 9 (6/1), Assignment 8 (6/3)
  • No new readings, though there is still a course-synthesis-style reading response.
  • Programming Assignment 8 requires students to simulate a database-backed application that collects data robustly while also complying with GDPR-style rights to data access and erasure.