Course Description

Scientists, engineers, and other technical professionals require skills in computing and data analysis to do their jobs. We refer to these as data science skills.

Examples of data science skills abound. Biologists search thousands of genomes for DNA sequences with special characteristics, such as genes that transcribe non-coding RNA that is “anti-sense” to messenger RNAs. Astronomers search, integrate, and visualize data from many instruments that produce terabytes of complex data. Social scientists do text analytics on massive repositories of social media data to distill patterns in topics and trends in sentiment.

This course teaches graduate students the software engineering skills to do research in data science fields and to be successful technical professionals in the 21st Century. In particular, this course teaches how to approach computational research with reproducibility in mind: to create sharable and reusable research projects that incorporate both computation and data.

Students will learn the following skills:

  • Developing software so that it can be used by others, including documentation, installing packages, automating setup, and running computational studies.
  • Creating technical specifications for what a program should do (its use cases) and how this is accomplished (software design).
  • Creating, updating, and sharing a project using version control (specifically git and GitHub) and collaborating effectively with teammates.
  • Asking good questions and solving technical problems independently using the internet, peers, and mentors.
  • Programming in Python using the Python scientific stack, including numpy, pandas, and matplotlib.
  • Developing tests that validate important aspects of the project implementation, and, more broadly, using test-driven development to build software.
  • Researching, evaluating, and integrating externally developed Python packages into a project, as well as creating your own Python packages.
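
As a rough illustration of the testing skill listed above, the sketch below pairs a small function with tests that validate its behavior. It is only an example of the general idea, not course material: the function name `column_mean`, the column names, and the sample data are all hypothetical, and it uses only the Python standard library so that it runs without any installed packages.

```python
import csv
import io
from statistics import mean


def column_mean(csv_text, column):
    """Return the mean of a numeric column in CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    if not values:
        raise ValueError(f"no data rows found for column {column!r}")
    return mean(values)


def test_column_mean():
    # Hypothetical sample data, made up for this illustration.
    sample = "time,temperature\n0,10.0\n1,12.0\n2,14.0\n"
    assert column_mean(sample, "temperature") == 12.0


def test_column_mean_rejects_empty_data():
    # A CSV with a header but no rows should raise an error, not return a value.
    try:
        column_mean("time,temperature\n", "temperature")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a CSV with no data rows")


if __name__ == "__main__":
    test_column_mean()
    test_column_mean_rejects_empty_data()
    print("all tests passed")
```

In a test-driven workflow, the two test functions would typically be written first and the function body filled in until they pass. A test runner such as pytest can discover functions named `test_*` automatically, though running the file directly also works.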

The course emphasizes a hands-on learning approach in which class time is often used for problem solving in small groups. The first part of the class teaches the skills described above. The second part is devoted to the class project, in which students create a computational research project of their choosing.

Some prior computing experience is desirable. For example, we expect that given a CSV file you can open it and plot the data in a language like MATLAB, IDL, R, or Python. A Software Carpentry bootcamp, Codecademy, or a similar MOOC would be an appropriate venue to learn these skills. Lessons include, e.g.:

Access and Accommodations

Your experience in this class is important to us. If you have already established accommodations with Disability Resources for Students (DRS), please communicate your approved accommodations to me at your earliest convenience so we can discuss your needs in this course.

If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include but are not limited to mental health, attention-related, learning, vision, hearing, physical or health impacts), you are welcome to contact DRS at 206-543-8924 or uwdrs@uw.edu or disability.uw.edu. DRS offers resources and coordinates reasonable accommodations for students with disabilities and/or temporary health conditions. Reasonable accommodations are established through an interactive process between you, your instructor(s) and DRS. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law.

Religious Accommodations

Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).

Classroom Climate

The UW Master of Science in Data Science seeks to ensure all students are fully included in each course. We strive to create an environment that reflects community and mutual caring. We encourage students with concerns about classroom climate to talk to Melissa or contact MSDS staff at uwmsds@uw.edu.

Remote Online Access

Participation in this course requires students to access Internet resources. Specifically, students in this course will need to access UW resources including Canvas and UW Libraries, which require users to log in with a UW NetID, and some external resources such as Zoom, Google Docs, YouTube, and/or eBook websites. For students who are off-campus and cannot access these required resources directly, UW IT recommends using the official UW VPN, called Husky OnNet VPN (see instructions below). However, students who are outside the US while taking this course should be aware that they may be subject to laws, policies, and/or technological systems which restrict the use of any VPNs. Students are responsible for their own compliance with all laws regarding the use of Husky OnNet and all other UW resources.

UW-IT provides the Husky OnNet VPN free for UW students via this link and advises students to use it with the “All Internet Traffic” option enabled (see the UW Libraries instructions and UW-IT’s FAQs regarding the Husky OnNet VPN). Doing so will route all incoming and outgoing Internet traffic through UW servers while it is enabled.

Land Acknowledgment

The University of Washington acknowledges the Coast Salish people of this land, the land which touches the shared waters of all tribes and bands within the Duwamish, Suquamish, Tulalip and Muckleshoot nations.

Academic Integrity and Misconduct

We will follow the UW College of Engineering policies on academic integrity and misconduct. You can view them here.