A compilation of online video courses for learning about probability and statistics.




Learning Statistics through Online Video Courses

I have a strong interest in some mathematical topics, Statistics being one of them. I find it fascinating how we can learn about purely random processes through purely rational mathematical analysis. Unfortunately, the one course I had in college about probability and statistics was only an introductory course (and it was fading from my memory besides). So I set myself the goal of learning more about this topic. To my delight, unlike when I was going through college, today you have access to thousands of free online courses, often from some of the best universities in the world. If you are motivated and persistent enough, you can learn just about anything you set out to. It's a wonderful world we live in today when it comes to education; don't let anyone tell you otherwise! As a side note, I'm reading an Isaac Newton biography that explains how difficult it was to get hold of blank sheets of paper in his time. When he obtained some blank paper "through inheritance", he worked out his theories on it diligently, often writing in very small characters to save space and thus paper. Think of this before complaining about any lack of resources you might have. Often motivation is the resource that is really lacking, not anything material (at least in the developed world, of course).

But I digress; I was able to find some wonderful resources online. Some are better than others, some more theoretical, others more practical. I decided to compile a list in this article so that others with a similar interest can (hopefully) save some time.

But first, a few words about my selection criteria. I have a background in Electronics and Telecommunications Engineering, so naturally I tend to gravitate towards the more practical aspects of statistics. If they also have applications in my particular line of work, so much the better. I do not generally work with statistical problems in the realm of the humanities, biology or medicine, nor am I much interested in them. Though a lot of the courses below cover statistics applicable to these fields, that was not a primary concern in the selection, so keep that in mind. As for the difficulty level, you will find everything from very basic courses to some more advanced (graduate level) ones. In all cases I stayed away from purely mathematical or very "formal" approaches. I prefer moderately rigorous coverage (proofs of some essential theorems, etc.) followed by practical application examples. With that in mind, here's the list:

1) Khan Academy – Probabilities and Statistics - Salman Khan

If you haven’t heard about Khan Academy, this is the non-profit educational website started by Salman Khan. A great service to humanity it is. You can learn about many topics on this website, but his video lessons on statistics are some of the best (and earliest) content on the site. Salman Khan has a background in Engineering and Finance, and that comes through in his lessons. This is probably the easiest set of courses to follow on this list, and the one with the most “practical” examples. Here you can get a reasonably good understanding of statistics, basic distributions, concepts like mean, variance and Expected Value, etc. He also covers some topics in inferential statistics like hypothesis testing and ANOVA (Analysis of Variance). I often started watching a video on some other website and then turned to Khan Academy for a more practical approach. It worked well. Salman Khan is a very good teacher and doesn’t hide behind formalisms. However, if you really want a more in-depth understanding of some of the concepts used here, you should complement Khan Academy with other courses. Salman Khan doesn’t prove too many theorems in his lessons and tries to keep mathematical rigor to the minimum necessary for clarity.
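To give a flavor of the basic concepts mentioned above (mean or Expected Value, and variance), here is a minimal sketch in Python. It is my own illustration, not something taken from the Khan Academy lessons, and the fair six-sided die and the helper names expected_value and variance are assumptions chosen purely for the example.

```python
from fractions import Fraction

def expected_value(outcomes, probs):
    """E[X] = sum of x * P(X = x) over all outcomes."""
    return sum(x * p for x, p in zip(outcomes, probs))

def variance(outcomes, probs):
    """Var(X) = sum of P(X = x) * (x - E[X])^2 over all outcomes."""
    mu = expected_value(outcomes, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs))

# Illustrative assumption: a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = expected_value(faces, probs)
var = variance(faces, probs)
print(mean, var)  # prints: 7/2 35/12
```

Exact fractions are used instead of floating point so the results match what you would get working the problem by hand.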

2) Harvard - Stats 110 – Prof. Joseph K. Blitzstein

This is one of my favorite introductory Statistics courses. Professor Joseph K. Blitzstein is an excellent instructor, probably the best I’ve seen on this topic. He provides clear and in-depth explanations, interesting examples, historical perspective and even some humor.
If you are looking for the fundamentals of probability and statistics, including proofs for most theorems, this would be the place to start. His course does require a good foundation in calculus, but should be within reach of most people with an undergraduate education in Mathematics. Be sure to also access the Harvard course website. Professor Blitzstein provides excellent exercises with solutions. He even published a summary/refresher of the mathematics concepts needed for the class. All in all, this is an excellent resource that I highly recommend.

3) MIT - Probabilistic Systems Analysis and Applied Probability – Prof. John Tsitsiklis

This is another good introductory course on probability and statistics. Prof. John Tsitsiklis is from the Electrical Engineering department at MIT and that shows in some of his examples, though most of the material is generic and applicable to other fields. His presentation style is clear and concise. Unlike Prof. Joseph K. Blitzstein in the course above, who uses the blackboard almost exclusively in his lectures, Prof. John Tsitsiklis uses a mix of projected transparencies and blackboard. The content of this course is not as in-depth as Harvard - Stats 110 but, on the other hand, it covers a few more topics. For example, the final lectures in this MIT course cover estimation and hypothesis testing, which are absent from Harvard - Stats 110. Therefore, MIT - Probabilistic Systems Analysis and Applied Probability is also an excellent resource that complements the other courses in this list in some regards.

4) University of Texas at Austin - Opinionated Lessons in Statistics – Prof. William H. Press

“Opinionated Lessons in Statistics” is a more advanced course than the previous ones in this list, and therefore should not be your first course in Statistics. Though it does cover a number of basic topics, it also goes into more advanced methods like Bootstrapping, modeling and more advanced computer methods. The delivery style is also quite different from the other courses. You never see the lecturer here; instead you see a sequence of well thought-out PowerPoint slides narrated by the professor. His explanations are very clear and he finds relevant and interesting examples for each of the topics discussed. Prof. William H. Press works at the University of Texas in the fields of Computer Science and Integrative Biology, so many of his examples tend to come from these fields as well. I have not gone through the whole course yet, but the topics I did explore were excellent, and therefore I highly recommend this course for more advanced topics.
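For readers who have not met Bootstrapping before, the idea can be sketched in a few lines of Python. This is my own minimal illustration of a percentile bootstrap interval for the mean, not code from Prof. Press's course; the function name bootstrap_mean_ci, the sample values, the number of resamples and the fixed seed are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `data`.

    Repeatedly resamples the data with replacement, computes the mean of
    each resample, and reads the interval off the sorted resampled means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Made-up measurements, purely for illustration.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"approx. 95% bootstrap interval: ({low:.2f}, {high:.2f})")
```

The appeal of the method is that it needs no formula for the sampling distribution of the statistic: you could swap the mean for a median or any other function of the data and the rest of the code would stay the same.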

5) Other Courses

The following also seem to be very good resources, though I don’t feel I have watched enough of the videos to offer a qualified opinion. The videos I did watch were very informative, though:

mathematicalmonk - Probability Primer

A series of videos giving an introduction to some of the basic definitions, notation, and concepts one would encounter in a first-year graduate probability course.

Math Holt - Mathematical Statistics for High School Teachers

A more basic set of lectures but with some good explanations.

Machine learning courses

Machine learning is really another name for "statistical inference" but with a computer science angle to it. This has recently become a very popular field of study since the practical applications in the real world are enormous. I've watched two online courses that really stand out in my mind. Beware that these are graduate level courses, so some background in statistics (see the courses above) and linear algebra will help you follow the material.

Caltech - Machine Learning Course - CS 156

This should probably be your first course in machine learning. It does not cover as many topics as the Stanford course below, but it covers the fundamentals very well. The first couple of classes cover the foundations (learning theory, VC dimension, generalization, etc.) and may be a bit theoretical for some, but the instructor then proceeds to more practical concerns and techniques (linear models, neural networks, support vector machines, just to name a few). I particularly liked the way he covered SVMs (Support Vector Machines); it was the best coverage of the topic I found anywhere. To top it off, his explanation of how Radial Basis Functions relate to SVMs was very elegant and intuitive. The course has an excellent teacher (Professor Yaser Abu-Mostafa) who is a clear-spoken, likeable fellow, and really makes an effort to explain the material verbally along with excellent visual aids. I also enjoyed how he dedicates the last part of each lecture to Q&A. This helps you understand material that might not have been clear in the lecture portion. This is an excellent course, and I feel very privileged to have been able to access it for free on YouTube.

Stanford - Machine Learning Course 

Professor Andrew Ng's course is the most comprehensive free YouTube course I found on machine learning. He tries to cover a huge number of topics in this class, so the material can be difficult to follow at times. However, if you are an advanced student looking for something beyond the basics, you will be very pleased with the course. It covers topics such as unsupervised learning algorithms that are not found in most other courses (like the one above). Professor Andrew Ng is also a really good instructor, and you get a sense he cares about the subject matter and whether his students are learning or not (no pun intended).

Comments, questions, suggestions? You can reach me at: contact (at sign) paulorenato (dot) com