MS-DSPP Resources for Incoming Students
Resources for Learning R
Data Camp:
- Introduction to R – Free – useful if you’re new to data structures in R.
- Intermediate R – Paid – interactive lessons on conditional statements, loops, and vector-based operations in R.
- Introduction to the Tidyverse – Paid – Covers the basics of data manipulation and visualization using the dominant package ecosystem in R.
- Data Manipulation with dplyr – Paid – delves further into data manipulation methodologies using the tidyverse package ecosystem.
- Joining Data with dplyr – Paid – delves into the basics of joining disparate data objects.
Wickham, Hadley, and Garrett Grolemund. (2017). “R for Data Science”. O’Reilly. Free online version available at: https://r4ds.had.co.nz/. This is a fantastic introduction to data science methodologies in R.
Swirl: learn R, in R. [Free] swirl teaches you R programming and data science interactively, at your own pace, and right in the R console. This package is useful for getting a feel for working with data structures in R.
Data Carpentry: Introduction to R. [Free] A tutorial-based walkthrough designed to get you started with object assignment, functional operations, and vector manipulations in R.
Resources for Learning Python
Data Camp:
- Get started with Python and Anaconda – Free – a great place to start if you’re completely new to Python.
- Introduction to Data Science in Python – Paid – a survey of basic data science functionality in Python, with an initial emphasis on importing and visualizing data.
- Data Types for Data Science in Python – Paid – learn about the fundamental data types in Python.
- Data manipulation with pandas – Paid – delve deeper into using the pandas module to read and process data in Python.
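As a rough preview of what the "fundamental data types" in the courses above look like in practice, here is a minimal sketch using only built-in Python. The record, its field names, and its values are made up for illustration and do not come from any of the listed courses.

```python
# A hypothetical record; names and values are invented for illustration only.
record = {"name": "Ada", "scores": [88, 92, 79], "enrolled": True}  # dict

scores = record["scores"]             # list: ordered and mutable
average = sum(scores) / len(scores)   # dividing two ints yields a float
unique_scores = set(scores)           # set: unordered, no duplicates
point = (3, 4)                        # tuple: ordered and immutable

# Print the type name of each object to see how Python classifies it.
for value in (record, scores, average, unique_scores, point):
    print(type(value).__name__)
```

If each line makes sense to you, the introductory courses should feel comfortable; if not, they are a good place to start.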
Sweigart, Al. (2020). “Automate the Boring Stuff with Python”. No Starch Press. Free version online: https://automatetheboringstuff.com/2e/chapter0/. Provides practical programming exercises that go beyond the fairly bland tutorials on other free websites.
Python Exercises, Practice, and Solutions offered by w3resource: https://www.w3resource.com/python-exercises/
Corey Schafer’s YouTube Series on Object-Oriented Programming in Python: https://www.youtube.com/watch?v=ZDa-Z5JzLYM&t=40s
Resources for Learning the Command Line
Data Camp:
- Introduction to Shell – Free – the basics of using the UNIX command line to manage programs and software on your machine. Getting comfortable with the command line is essential to data science work, and this introductory course will help ease you in.
Codecademy:
- Learn the Command Line – Paid – an interactive beginner’s course on using the command line for programming.
Resources for Learning Math
3Blue1Brown: The most important part of linear algebra and calculus, especially in data science, is having an intuition for how they work. These YouTube videos made by Grant Sanderson have been central to making mathematical concepts “click” for students in the past. We suggest watching these videos, which are more conceptual, and then delving into the actual mathematical notation and derivations offered through Khan Academy.
Khan Academy: If you’re weak on any of these mathematical concepts, it’s worth taking the time to go through the Khan Academy sequence. The videos are very detailed and incorporate practice exercises to test comprehension.
- Trigonometry – this sequence covers all the concepts you’d need to know from trigonometry. Trig functions factor into a lot of data science, especially when dealing with vectors and other linear algebra concepts.
- Linear Algebra – A solid understanding of linear algebra is key to being a good data scientist. Linear algebra factors into statistics, machine learning, network science, and computer programming. If you’ve never learned linear algebra, or are weak at it, this is worth your time to go over.
- Differential Calculus & Multivariable Calculus – Calculus factors into statistics and machine learning. Understanding concepts like derivatives, gradients, and optimization will greatly help you throughout your coursework next year.
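To give a sense of how derivatives, gradients, and optimization fit together, here is a minimal sketch of gradient descent in plain Python. The function f, the starting point, the learning rate, and the step count are all assumptions chosen for illustration; they are not taken from the Khan Academy material.

```python
def f(x, y):
    # An illustrative bowl-shaped function with its minimum at (1, -2).
    return (x - 1) ** 2 + (y + 2) ** 2

def gradient(x, y):
    # Partial derivatives of f with respect to x and y.
    return 2 * (x - 1), 2 * (y + 2)

def gradient_descent(x, y, learning_rate=0.1, steps=200):
    # Repeatedly step against the gradient, i.e. downhill on f.
    for _ in range(steps):
        dx, dy = gradient(x, y)
        x -= learning_rate * dx
        y -= learning_rate * dy
    return x, y

x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min, f(x_min, y_min))
```

The gradient points in the direction of steepest increase, so moving a small step in the opposite direction lowers the value of f. Many machine learning models are fit with variations of this idea.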
Sahota, Harpreet. “The Math You Absolutely Need for Data Science”. The Artists of Data Science.