Course website for DATA 23700 1 (Winter 2025) Visualization for Data Science

kalealex/data237-wi25

DATA 23700

This course, the data visualization offering for students in the Data Science major, helps students build core competencies for communicating with data, including generating visualizations programmatically, writing about design and analysis choices, and learning principles and procedures for rigorous work in data science. DATA 23700 introduces students to visualization design, including theoretical frameworks for reasoning about chart construction, perceptual principles, and considerations for use of color, mapping, making data interactive, and conveying uncertainty. DATA 23700 also requires students to engage various other skill sets important in data science such as technical reading and writing, data wrangling, statistical modeling, storytelling, and producing shareable, reproducible analysis notebooks.

Completion of or placement out of DATA 11900 or CMSC 14100 is a prerequisite for taking this course. Additionally, DATA 12000 and DATA 21300 should either be completed prior to registration or taken concurrently.

Students are expected to enter the course with solid foundations in both programming and statistical modeling. At a minimum, this means that students should be comfortable writing basic algorithms and wrangling data in Python, and students should understand the mathematics behind basic regression models. Additionally, students should be comfortable picking up new programming languages as needed for coursework (e.g., installing, setting up, and self-teaching basic syntax in R or JavaScript). Students who feel unprepared to meet these requirements may find DATA 22700 (i.e., the data visualization offering for Data Science minors) a better fit. Regardless of previous preparation, both creating graphics and working with data involve many difficulties that must be confronted with patience and perseverance, but both skills will benefit students tremendously in a wide variety of endeavors. Students are encouraged to apply themselves and grow!

Course objectives

Upon completion of the course, students should be able to:

  1. Generate visualizations programmatically.
  2. Apply principles of perception and statistics to visualization design.
  3. Make data visualizations interactive.
  4. Recognize, redesign, and avoid creating ill-formed, ineffective, misleading, or deceptive graphics.
  5. Use computational notebooks to write cogent, reproducible analyses.
  6. Produce high-quality written analysis reports.

Communication

The primary method of communication between students and instructional staff will be email.

Instructor: Alex Kale - [email protected]

TA: Krisha Mehta - [email protected]

TA: Danni Liu - [email protected]

TA: Song Oh - [email protected]

Grader: Gio Maya - [email protected]

Grader: Robert Ukrainsky - [email protected]

We will not be relying on EdTech systems like EdStem. We will use Canvas only to link to course content like this GitHub repo.

Class sessions

We will hold class Tuesday and Thursday 12:30 PM - 1:50 PM in Henry Hinds Laboratory for the Geophysical Sciences (HGS) 101.

Office hours

Students seeking help with concepts or coursework should plan to attend office hours. We will hold the following office hours each week unless posted otherwise.

Day of week | Time of day         | Person hosting | Where
Tuesday     | 10:00 AM - 11:30 AM | Krisha         | JCL 207
Tuesday     | 11:30 AM - 12:30 PM | Danni          | JCL 205
Tuesday     | 2:00 PM - 3:30 PM   | Alex           | JCL 263
Wednesday   | 10:00 AM - 11:30 AM | Krisha         | Zoom
Thursday    | 11:30 AM - 12:30 PM | Danni          | JCL 205
Thursday    | 2:00 PM - 3:30 PM   | Alex           | JCL 263
Friday      | 11:30 AM - 1:00 PM  | Song           | JCL 236
Friday      | 1:00 PM - 2:00 PM   | Danni          | Zoom
Friday      | 3:30 PM - 5:00 PM   | Song           | JCL 236

Office hours begin on January 9 and end on March 8.

Due to conference travel, Alex will be unavailable January 6-8. Because of this, we will not hold the class session scheduled for January 7.

Materials, turn-in, and gradebook

All course materials can be found in the course GitHub repo (this page).

Student work will be turned in via Gradescope, where students will be able to view grades and feedback on coursework.

Schedule

Week 1: Introduction

Tuesday - No class (Alex away at conference)

Thursday - Value of data visualization

  • Assignment 1 and Project released

Week 2: Fundamentals of chart construction and working with data

Tuesday - Grammar of graphics

  • Exercise 1

Thursday - Data models, Literate programming

  • Exercise 2
  • Students will need to set up VSCode or Google Colab. Colab is recommended for students who are unfamiliar with their file system or who don't want to deal with Python installation.

Friday, Jan 17 - Exercise 1 due!

Week 3: Principles for visualization design

Monday, Jan 20 - Assignment 1 (Design) due!

Tuesday - Design process and critique

  • Exercise 3

Thursday - Perception

Friday, Jan 24 - Exercise 2 due!

Week 4: Color and cartography

Tuesday - Color

  • Assignment 2 released

Thursday - Visualizing data in maps

  • Exercise 4

Friday, Jan 31 - Exercise 3 due!

Week 5: Data interaction

Monday, Feb 3 - Exercise 5 (project check-ins) due!

Tuesday - Interaction and Animation (two-part lecture)

Thursday - Making data interactive

  • Exercise 6
  • Assignment 3 released

Friday, Feb 7 - Exercise 4 due!

Week 6: Data communication

Monday, Feb 10 - Assignment 2 (Color and cartography) due!

Tuesday - Storytelling

  • Assignment 4 released

Thursday - Accessibility

Week 7: Rhetorical visualization

Monday, Feb 17 - Exercise 6 due!

Tuesday - Persuasive visualization

  • Exercise 7

Wednesday, Feb 19 - Assignment 3 (Interactive Data) due!

Thursday - Deceptive visualization

Week 8: Uncertainty visualization

Monday, Feb 24 - Assignment 4 (Narrative) due!

Tuesday - Uncertainty visualization

Thursday - Visualizing regression model outputs

  • Exercise 8

Friday, Feb 28 - Exercise 7 due!

Week 9: Visualization for model interpretability

Tuesday - Visualizations as model checks

Thursday - Visualization for machine learning interpretability

  • Exercise 9
  • Final day for resubmissions!

Friday, Mar 7 - Exercise 8 due!

Finals week

Tuesday, Mar 11 - Project due!

Friday, Mar 14 - Exercise 9 due!

Coursework

Deliverables for this course include 4 assignments, 9 exercises, and 1 project.

Assignments

Assignments are summative assessments of core learning objectives in the course. Each assignment involves analyzing and visualizing a specific dataset (often provided by the instructor). Students are expected to create original work; collaboration or copying from any source is not allowed. Assignments are evaluated based on both the quality of visualizations produced and the quality of write-ups explaining the analysis and design rationale.

Exercises

Exercises are opportunities for skill-building, practice, and collaboration. Exercises begin during class time, but students may need to spend time completing them at home. Exercises involve a few specific tasks, including reading technical specifications, writing code, and documenting work. Exercises are evaluated only on completeness.

Project

The project serves the purpose of a final in DATA 23700. The project involves choosing and analyzing a dataset and producing a written technical report about the analysis. Students are expected to create original work; collaboration or copying from any source is not allowed. The project is evaluated on the following criteria:

  • Choice of dataset
  • Quality of analysis
  • Quality of visualizations
  • Quality of write-up

Evaluation

We use a form of grading known as specifications grading in this course. The goal of specifications grading is to help students focus on their mastery of the material and identify areas for improvement as the quarter progresses. Students are encouraged to focus on skills, not on scores.

Final grades will be determined based on assignments, exercises, and the project.

Assignments and the project

Assignments and the project will be evaluated using an S/N/U scale:

  • Satisfactory (S): The student demonstrates sufficient mastery of the material. The standard for earning a score of S is high, reflecting the instructor's expectations for student work.
  • Needs Improvement (N): The student has put in a good-faith effort to complete the work, but revealed a lack of mastery of the material that can be addressed via concrete feedback.
  • Ungradable (U): The student did not submit any work, did not follow directions, or did not complete a sufficient portion of the work assigned (e.g., completed less than half the work).

When reviewing assignments, we evaluate both:

  • Quality of visualizations including but not limited to design choices such as encodings, transformations, and scales.
  • Quality of write-ups including but not limited to clear communication, logical arguments, and cogent rationales for design and analysis choices.

When reviewing the project, we evaluate for:

  • Choice of dataset including but not limited to whether the dataset can answer the questions the student poses, support a narrative, and enable visualizations that demonstrate the skills learned throughout the course.
  • Quality of analysis including but not limited to whether analysis choices are statistically valid, whether the analysis is robust to arbitrary analysis choices, and whether analysis choices are documented, justified, and reproducible.
  • Quality of visualizations including but not limited to design choices such as encodings, transformations, and scales.
  • Quality of write-up including but not limited to the rhetorical cohesion of the write-up with the analysis, clear communication, logical arguments, and cogent rationales for design and analysis choices.

The specifications for each assignment, each exercise, and the project include more precise descriptions of what is expected for a score of Satisfactory.

There are a total of 8 S/N/U scores for assignments. Each of the 4 assignments receives two S/N/U scores: one for quality of visualizations and one for quality of write-up.

There are a total of 4 S/N/U scores for the project, assigned for choice of dataset, quality of analysis, quality of visualizations, and quality of write-up.

Exercises

Exercises provide a participation grade. Exercises will be graded only on completion and will receive a score of either Satisfactory (S) or Ungradable (U). Students will not have the option of earning a Needs Improvement (N) on exercises.

There are a total of 9 S/U scores for exercises, one per exercise.

Final grades

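In total, students receive 21 scores: 8 for assignments, 4 for the project, and 9 for exercises. Only the 12 assignment and project scores can be an N, since exercises are scored S or U.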

Final grades are based on the following table. The number of Satisfactory scores determines a student's letter grade. The number of Needs Improvement scores, plus any additional Satisfactory scores beyond those needed for a given letter grade, determines plus and minus within each letter grade.

Minimum S Required | Minimum N (or additional S) Required | Final Grade
21                 | 0                                    | A
19                 | 2                                    | A-
17                 | 4                                    | B+
17                 | 2                                    | B
17                 | 0                                    | B-
15                 | 4                                    | C+
15                 | 2                                    | C
15                 | 0                                    | C-
13                 | 2                                    | D+
13                 | 0                                    | D
12 and below       | NA                                   | F

Consider some examples:

  • A student with 21 S, 0 N, and 0 U would get an A
  • A student with 20 S, 1 N, and 0 U would get an A-
  • A student with 18 S, 3 N, and 0 U would get a B+
  • A student with 19 S, 1 N, and 1 U would get a B
  • A student with 17 S, 1 N, and 3 U would get a B-
  • A student with 15 S, 1 N, and 5 U would get a C-
  • A student with 14 S, 5 N, and 2 U would get a D+
  • A student with 11 S, 9 N, and 1 U would get an F
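
For students who want to check their own standing, the table and examples above can be sketched as a short Python function. This is one reading of the table, not an official grade calculator: it assumes that a student receives the highest grade whose row they satisfy, and that "additional S" means Satisfactory scores beyond that row's minimum. The names `GRADE_TABLE` and `final_grade` are made up for this illustration.

```python
# Rows of the final-grade table, best grade first:
# (minimum S, minimum N or additional S, letter grade).
GRADE_TABLE = [
    (21, 0, "A"),
    (19, 2, "A-"),
    (17, 4, "B+"),
    (17, 2, "B"),
    (17, 0, "B-"),
    (15, 4, "C+"),
    (15, 2, "C"),
    (15, 0, "C-"),
    (13, 2, "D+"),
    (13, 0, "D"),
]

TOTAL_SCORES = 21  # 8 assignment + 4 project + 9 exercise scores
MAX_N_SCORES = 12  # only assignment and project scores can be N


def final_grade(s: int, n: int, u: int) -> str:
    """Return the letter grade for counts of S, N, and U scores.

    Assumption (not stated verbatim in the syllabus): the student gets
    the first row, reading top to bottom, whose requirements are met.
    """
    if min(s, n, u) < 0 or s + n + u != TOTAL_SCORES:
        raise ValueError("S, N, and U counts must be non-negative and sum to 21")
    if n > MAX_N_SCORES:
        raise ValueError("at most 12 scores can be N")
    for min_s, min_bonus, letter in GRADE_TABLE:
        # N scores and any S scores beyond the row's minimum both count
        # toward the plus/minus requirement.
        bonus = n + (s - min_s)
        if s >= min_s and bonus >= min_bonus:
            return letter
    return "F"  # 12 or fewer S scores


print(final_grade(19, 1, 1))  # the fourth example above
```

Under these assumptions the function reproduces all eight examples listed above; for instance, 19 S, 1 N, and 1 U misses the A- row (only 1 of the 2 required N or additional S) and the B+ row (3 of 4) but satisfies the B row.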

Advice for success: The grading scheme in this class emphasizes completing all coursework. It is much better to turn in work that is imperfect than to turn in nothing at all. Students can earn a healthy buffer of Satisfactory scores with relative ease by turning in all exercises on time.

Resubmissions

Assignments only (not exercises or the project) may be resubmitted one time each. If a student receives a score of N or U on an assignment, they may resubmit that assignment one time for a score bump of up to one increment. In other words, a resubmission can raise an N to an S or raise a U to an N, but a resubmission cannot raise a U to an S. Thus, students should always make an earnest attempt to submit something rather than missing deadlines.

How to resubmit? Students wishing to resubmit an assignment should email their instructor. Students resubmitting an assignment that at first earned a U should send their instructor a new submission, in the required file format(s), within the allotted timeline (see below). Students resubmitting an assignment that at first earned an N should visit their instructor's office hours prepared to present their revised work and explain how they responded to feedback. Assignments that at first earned an N may not be resubmitted without visiting office hours.

Conditions on resubmitting:

  • Students wishing to resubmit an assignment must do so within three weeks of the assignment's due date.
  • The last day for resubmissions is Thursday, March 6, and no resubmissions will be accepted after this date.
  • Students may not resubmit the same assignment more than one time.
  • A score bump upon resubmission is subject to the same standards as the original round of grading, e.g., the expectation for an S is still high. Not all resubmissions are guaranteed a score bump.
  • Resubmissions must go through the instructor, not the TAs, unless the instructor grants explicit permission otherwise.
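
As a planning aid, the timing and score-cap rules above can be sketched in Python. This is an illustration under stated assumptions, not an official tool: the year 2025 is inferred from the Winter 2025 term, "within three weeks" is read as 21 calendar days after the due date, and a resubmission is assumed never to lower a score (the syllabus does not say either way). The function names are made up for this sketch.

```python
from datetime import date, timedelta

# Assignment due dates from the schedule (year assumed to be 2025).
ASSIGNMENT_DUE = {
    1: date(2025, 1, 20),
    2: date(2025, 2, 10),
    3: date(2025, 2, 19),
    4: date(2025, 2, 24),
}
LAST_RESUBMISSION_DAY = date(2025, 3, 6)
SCALE = ["U", "N", "S"]  # ordered from lowest to highest


def resubmission_deadline(assignment: int) -> date:
    """Earlier of (due date + 3 weeks) and the final resubmission day."""
    three_weeks_out = ASSIGNMENT_DUE[assignment] + timedelta(weeks=3)
    return min(three_weeks_out, LAST_RESUBMISSION_DAY)


def score_after_resubmission(original: str, regraded: str) -> str:
    """Cap the score bump at one increment (U to N, or N to S)."""
    if original == "S":
        raise ValueError("only N or U scores can be resubmitted")
    cap = SCALE.index(original) + 1
    new = min(SCALE.index(regraded), cap)
    # Assumption: a resubmission never lowers the original score.
    return SCALE[max(new, SCALE.index(original))]


for number in ASSIGNMENT_DUE:
    print(number, resubmission_deadline(number))
```

Read this way, the three-week window sets the deadline for Assignments 1 and 2, while the March 6 cutoff is the binding limit for Assignments 3 and 4.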

Late policy

Late submissions are not accepted in this class, except under specific circumstances (see below). Assignments, exercises, and the project have hard due dates, which students must adhere to. These deadlines are in place to pace the workload of DATA 23700 appropriately and to provide a fair system of accountability. Please bear in mind that the grading scheme for this class (with resubmissions) is set up to absorb a reasonable amount of sub-par work. Turning in something unpolished is much better than not turning in anything at all.

Exceptions to the no late work policy:

  • Resubmissions: Students may resubmit each assignment once, within three weeks of that assignment's due date. To resubmit an assignment, students should email the instructor (see above). Resubmitted assignments can receive a score change of only one increment, i.e., U to N or N to S. Resubmission cannot be used for exercises or the project.
  • Emergencies: If you have an emergency and feel it warrants an exception to the no late work policy, you should first be in contact with your College advisor, as the College should be aware of the emergency and ensure that any proper university or department policies are followed if needed (e.g., an injury might require Student Disability Services accommodations). Emergencies entail conditions beyond the student's control that make it infeasible for them to complete coursework on time. This policy is not intended to provide relief for failures of time management. Once students have contacted their College advisor, they should contact their instructor via email with a CC to their College advisor. Your instructor does not need to know the details of your emergency, but they do need your College advisor to confirm that your situation qualifies as an emergency. Contacting us as early as is practical given the emergency makes the process of accommodating your situation work more smoothly for everyone. We care about your well-being and success in the class, and have put these policies in place to be fair and give students agency.

No other exceptions will be made.

Grade disputes

Except in very specific cases (described below), you cannot dispute the score assigned to you on a piece of work. The score you receive on a piece of work is meant to convey feedback on your level of mastery, and you should take it as an opportunity to understand the areas for improvement in your work. You are welcome to ask us for concrete advice about how to improve your work; we are always happy to have those kinds of conversations with students, including going over your code or writing. On the other hand, we will not entertain requests to change your score just because you feel your work deserved a higher score.

There is one exception to the no grade disputes policy: if a grader made a factual mistake in grading your work. Please note that this only includes cases where a grader makes an erroneous statement about your code or writing in their feedback. It does not include cases where you simply disagree with whether something deserves to be flagged as incorrect.

For example, suppose you receive a piece of feedback that says “Poor choice of encoding for data type: Student used a part-to-whole representation for non-proportion data.” If the encoded data in question was actually proportion data, and the grader missed this fact (and erroneously gave you that feedback), you can ask us to review this decision. Please note that, even if the feedback is amended, it may not affect your actual S/N/U score depending on how many other issues were identified in your work.

We ask that you keep these requests brief and to the point: no more than a few sentences identifying the exact statement that the grader made and the reasons you believe the statement was mistaken, including references to specific parts of your code or writing (e.g., “I said that these are proportion data in paragraph 2 of the submitted report.”). Focus on laying out the facts, and nothing else. Regrade requests should be submitted through Gradescope, not via email.

Regrade requests are distinct from assignment resubmissions (see above).

Finally, it is also your responsibility to make these requests in a timely manner. Requests to review grading mistakes must be submitted no later than one week after a graded piece of work is returned to you. After that time, we will not consider any such requests, regardless of whether the request is reasonable and justified.

We will not accept any request to review grading after Thursday, March 13, because grades are due soon after; this may limit grade disputes for the last two exercises and the project.

Software tools

Students in DATA 23700 are expected to take responsibility for filling in gaps in their understanding of any software they use to complete their coursework. This goes especially for students who are new to a given programming language or software tool.

Most work in this class will be conducted using computational notebooks, other text files, and graphics editing tools. We will demonstrate how to work with computational notebooks in VS Code and RStudio, but students may choose to use a different programming environment or text editor if they want (e.g., Google Colab). Similarly, we will cover a handful of APIs in Python, R, and potentially JavaScript, but students may choose which of these to use for their coursework (unless instructed to use a specific tool). We may also briefly introduce graphics editing software such as Figma or Adobe Illustrator. Students are responsible for learning to use and troubleshoot whatever software tools they adopt to complete their coursework.

In particular, course staff will not help you install software or troubleshoot environment setup issues since these problems are tangential to our learning objectives, and students who are adequately prepared for the course should be able to navigate these kinds of problems independently.

Code of conduct

Diversity and Inclusion: Students are expected to treat each other with respect and to give due consideration to each other's stances and positions. Discrimination of any kind will not be tolerated, along the lines of gender identity, sexuality, disability, generational status, socioeconomic status, ethnicity, race, religion, national origin, culture, or otherwise. Please see the UChicago Commitment on Diversity. Additionally, we are committed to providing equitable access to education at UChicago. Students who have been approved for academic accommodations through Student Disability Services (SDS) should follow the procedures established by SDS for using accommodations. Regardless of identity or status with SDS, students with concerns or questions about issues of diversity and inclusion should email the instructor.

Academic Integrity: We take academic honesty very seriously in this class. This means students must not collaborate or copy from outside sources on non-collaborative work such as assignments and the project. Collaboration is allowed on exercises only, but you must turn in your own copy of the work. Students are not permitted to use automated assistants such as ChatGPT or GitHub Copilot for coursework; use of these technologies in this class is considered an academic honesty violation. Additionally, students should be aware of the UChicago policy on Academic Honesty and Plagiarism. The gist of academic integrity is that you should always do and submit your own work.

Sexual Misconduct: Title IX prohibits discrimination on the basis of sex, including sexual assault, sexual abuse, sexual harassment, dating violence, domestic violence, and stalking. Sexual misconduct is completely unacceptable at UChicago (and anywhere else), including any interactions that occur related to this course. For related resources, please see the UChicago website about Title IX and Sexual Misconduct. Students seeking help or guidance related to an incident of sexual misconduct will be supported. Students should be aware that, in certain situations, the University may have an institutional obligation to respond to a report of sexual misconduct and that, as a faculty member, your instructor is required by Title IX and the University of Chicago to report incidents of sexual misconduct, even if students request to keep the information confidential. If you would like to speak to someone confidentially about an incident of sexual misconduct, please see the University's Confidential Resources.

Student Health and Wellness: Students' mental and physical health are of primary importance both for reasons of common human dignity and to create a suitable learning environment at UChicago. If you or someone you know needs or might benefit from mental health services, please consider reaching out to UChicago Student Wellness, whose services do not come at any additional cost to students.

If you are sick, please do not come to class or in-person office hours. If you need to miss class because of an illness, please email your instructor. Students who have been exposed to or who are experiencing symptoms of COVID-19 should contact UChicago Student Wellness to be tested. If you were potentially exposed to COVID-19 or your COVID-19 test results come back positive, please reach out immediately to [email protected]. Other public health concerns should be directed to UCAIR. If there is an emergency, please call 773-702-8181 or dial 123 on any campus phone, or call 911 for emergency response.

Attendance and Participation: DATA 23700 is 100% in-person. If students need to miss class for some reason, they should ask their peers to share notes from the missed class. We will not record lectures; however, we are happy to discuss what was missed during office hours. Please do not email the instructional staff with requests for a summary of missed lectures; we will tell you to ask a peer for notes and/or visit us at office hours. Students who miss an exercise are still expected to turn that exercise in for credit.
