Data Science Resources (All in one)

Book list:

  • Speech and Language Processing (3rd ed. draft)
  • Computer Age Statistical Inference
  • The Elements of Statistical Learning
  • Reinforcement Learning: An Introduction
  • Applied Predictive Modeling
  • Pattern Recognition And Machine Learning
  • Mining of Massive Datasets
  • Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
  • Bayesian Data Analysis
  • Machine Learning for Hackers
  • Python for Data Analysis

Course list:

  • Udacity software engineering: 1, 2, 3 -Ongoing-
  • Stanford 224n
  • Topics in Mathematics with Applications in Finance (MIT): Youtube, Course
  • FAST.ai part2: http://course.fast.ai/part2.html
  • CS 294: Deep Reinforcement Learning: http://rll.berkeley.edu/deeprlcourse/
  • CMU 701 by Tom Mitchell: http://www.cs.cmu.edu/~tom/10701_sp11/lectures.shtml
  • Cryptography: https://www.coursera.org/learn/crypto 
  • Statistical Rethinking: http://xcelab.net/rm/statistical-rethinking/
  • Probabilistic Graphical Models: https://www.coursera.org/specializations/probabilistic-graphical-models
  • Bitcoin and Cryptocurrency Technologies: https://www.coursera.org/learn/cryptocurrency
  • Compilers: https://lagunita.stanford.edu/courses/Engineering/Compilers/Fall2014/about
  • NTU - Machine Learning (Fall 2017): http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML17_2.html

Table of Contents:

  • One Month Plan
  • Machine Learning
  • Natural Language Processing
  • Deep Learning
  • Systems
  • Analytics
  • Reinforcement Learning
  • Other Courses
  • Interviews
  • Bayesian
  • Time series
  • Quant
  • More Lists

One Month Plan:

You may find the list overwhelming. Here is my suggestion if you want to have some basic understanding in one month:

Machine Learning:

- Videos:

- Textbooks:

  • Introduction to Statistical Learning: pdf
  • Computer Age Statistical Inference: Algorithms, Evidence, and Data Science: pdf
  • The Elements of Statistical Learning: pdf

- Comments:

Statistical Learning is the introductory course. The certificate is free to earn. It follows the Introduction to Statistical Learning book closely. Andrew Ng’s Stanford course on Coursera is another popular introductory course. Taking either of them is enough for most data science positions. People who want to go deeper can take 229 or 701 and read the ESL book.


Natural Language Processing:

- Videos:

- Books:

  • Speech and Language Processing (3rd ed. draft): Book
  • An Introduction to Information Retrieval: pdf
  • Deep Learning (Some chapters or sections): Book
  • A Primer on Neural Network Models for Natural Language Processing: Paper. Goldberg also published a new book this year

- Packages:

  • NLTK: http://www.nltk.org/
  • Stanford packages: https://nlp.stanford.edu/software/

- Comments:

The basic NLP course by Stanford is the fundamental one. SLP 3rd ed. follows this course. After this, feel free to take one of the three NLP+DL courses. They cover basically the same topics. The Stanford one has homework available online. The CMU one follows Goldberg’s book. The DeepMind one is much shorter.

- More:

Some other people’s collections: NLP, DL-NLP, Speech and NLP, Speech, RNN


Deep Learning:

- Videos:

  • Ng’s deep learning courses: Coursera. This specialization is very popular. Prof. Ng covers a lot of details and is a really good teacher.
  • Tensorflow. Stanford CS20SI: Youtube
  • Stanford 231n: Convolutional Neural Networks for Visual Recognition (Spring 2017): Youtube, Course page
  • Stanford 224n: Natural Language Processing with Deep Learning (Winter 2017): Youtube, Course page
  • Self-driving cars have been a really hot topic recently. Take a look at this short course to see how they work. MIT 6.S094: Deep Learning for Self-Driving Cars: Youtube, Course page
  • Neural Networks for Machine Learning by Hinton: Coursera. This course is very hard for me, but it covers almost everything about neural networks. Prof. Hinton is a hero.
  • FAST.ai: Course

- Books:

  • Deep learning book by Ian Goodfellow: http://www.deeplearningbook.org/. Very detailed reference book.
  • ArXiv for research updates: https://arxiv.org/. I found the mobile version of Feedly useful for following ArXiv. Also, try https://deeplearn.org/ or http://www.arxiv-sanity.com/top.

- Other:

- Comments:

Ng’s courses are already good enough. Reading Part 2 of Goodfellow’s book can also be helpful. Learning one DL package, such as Keras, TF, or PyTorch, is important. People may choose a focus, either CV or NLP. People who want a deeper understanding of DL can take Hinton’s course and read Part 3 of Goodfellow’s book. Fast.ai has very practical courses.


Systems:

  • Docker Mastery: Udemy
  • The Ultimate Hands-On Hadoop: Udemy
  • Spark and Python for Big Data with PySpark: Udemy

Analytics:


Reinforcement Learning:

- Videos:

  • Udacity: Course
  • UCL Course on RL by David Silver: Course page
  • CS 294: Deep Reinforcement Learning by UC Berkeley, Fall 2017: Course page

- Books:

  • Reinforcement Learning: An Introduction (2nd): pdf

Others:


Interviews:

- Lists with Solutions:

  • 111 Data Science Interview Questions & Detailed Answers: Link
  • 40 Interview Questions asked at Startups in Machine Learning / Data Science: Link
  • 100 Data Science Interview Questions and Answers (General) for 2017: Link
  • 21 Must-Know Data Science Interview Questions and Answers: Link
  • 45 Questions to test a data scientist on basics of Deep Learning (along with solution): Link
  • 30 Questions to test a data scientist on Natural Language Processing: Link
  • Questions on Stack Overflow: Link
  • Compare two models: My collection

- Without Solutions:

  • Over 100 Data Science Interview Questions: Link
  • 20 questions to detect fake data scientists: Link
  • Questions on Glassdoor: Link

Topics to Learn ->


Bayesian:

- Courses:

  • Bayesian Statistics: From Concept to Data Analysis: Coursera
  • Bayesian Methods for Machine Learning: Coursera
  • Statistical Rethinking: Course Page (Recorded Lectures: Winter 2015, Fall 2017)

- Books:

  • Bayesian Data Analysis, Third Edition
  • Applied Predictive Modeling

Time series:

- Courses:

- Books:

  • Time Series Analysis and Its Applications: Springer

- With LSTM:

  • https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
  • https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/
  • More: https://machinelearningmastery.com/?s=Time+Series&submit=Search

Quant:

- Books:

  • Heard on the Street: Quantitative Questions from Wall Street Job Interviews by Timothy Falcon Crack: Amazon
  • A Practical Guide To Quantitative Finance Interviews by Xinfeng Zhou: Amazon

- Courses:

- Other:

  • A Collection of Dice Problems: pdf

More:

  • Computer Science courses with video lectures: https://github.com/Developer-Y/cs-video-courses
  • The Open Source Data Science Masters: http://datasciencemasters.org