MIT - 15.071x: The Analytics Edge
"Analytics" is a fancy name for "data mining", or perhaps it is the other way around? Anyway, it's the science of applying basic statistics to make sense of data, and/or to make better-informed guesses. Examples include Amazon's smart suggestions, medical prevention (based on a number of variables, estimate the probability that a given person develops a serious condition), and risk assessment (what insurance premium should an 81-year-old male who drives 6000 km per year in an Audi but has had no accidents in ten years pay?). It includes such trendy sub-fields as machine learning and data visualization.
This course is a hands-on introduction to the field and to R, a widely used open-source programming language oriented toward data analysis (sort of the open-source alternative to SAS). As such it's pretty good, although at times I am slightly annoyed with the dumbing-down (they don't have to tell us every time to "Hit Enter" after typing a command − after 8 weeks, you'd think the students who hung on had got the idea).
One of the better ideas they had was to hold a competition during week 7: based on a training set, students are supposed to submit predictions for a testing set; they are graded on the "Area Under the Curve" (AUC), roughly a measure of how well the predictions rank the positive cases above the negative ones. That week has probably been the one in which I've learned the most; basically we were thrown into the sea and had to learn to swim by ourselves. However, running this as a competition (students are ranked) is not such a great idea: with only a handful of tools at our disposal, all students are within a few percent of each other, which makes random fluctuations (the estimates do include some randomness) disproportionately important. With a score of 0.727 I am currently ranked 642nd (out of about a thousand students who have submitted predictions), but the best student only has a score of 0.75 or so. While I have no doubt that he has a slightly better model than mine, I find it slightly discouraging to be ranked according to a metric that includes randomness.
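For readers who have not met the "Area Under the Curve" before, here is a minimal sketch of one common way to compute it, written in Python rather than the R used in the course. The function name `auc` and the toy labels and scores are made up for illustration; this is not necessarily how the competition itself computes the score. The idea is that the AUC equals the fraction of (positive, negative) pairs in which the positive case received the higher predicted score, with ties counting as half.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by comparing every
    positive example with every negative example.

    labels: 0/1 outcomes; scores: predicted scores (higher = more likely 1).
    Hypothetical helper for illustration only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (made up): two negatives, two positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 3 of the 4 pairs are ranked correctly: 0.75
```

A score of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so a gap such as 0.727 versus 0.75 means only slightly more pairs are ranked correctly.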
Anyway, I fully intend to follow the course through to the certificate.
ANU − ASTRO1x Greatest Unsolved Mysteries of the Universe
An introduction to astronomy and astrophysics by two very high-level researchers from the Australian National University, one of whom has won a Nobel Prize, no less.
This course is very introductory, aimed squarely at students with little math or physics background, and that's fine for me (while I can do the math involved here with my eyes closed, I have never really studied quantum physics beyond reading pop-science books; I dimly remember a chemistry professor writing down the Schrödinger equation on the blackboard, but it may have been only to scare us away from the subject).
This course I am taking for fun, and for bragging rights (I've been taught by a Nobel winner, can you say the same?) I'm doing the homework while it's easy and not too time-consuming, but I guess I'll stop at some point, and doubt I'll be taking the final exam.
University of Copenhagen − Diabetes, a Global Challenge
My 4-year-old son has Type 1 diabetes mellitus, so I am naturally interested in the topic. This course however mainly discusses Type 2, which is an altogether different (and more poorly understood, although much more prevalent) disease. The approach is "panoramic": each week, a different professor gives a lecture on a particular aspect of the diabetes challenge; so far we've had an epidemiological overview, a biochemical rundown of the metabolic processes involved, a lecture about physical exercise and diabetes prevention, and one about "the clinical manifestations of diabetes".
Obviously, some weeks will be more interesting than others. While the first three were very engaging, the fourth was a lot harder to follow. Still, I find it worthwhile to stick with it. I don't, however, attach special importance to getting a certificate for this course, so I won't do anything beyond the lectures (homework, peer-reviewed essays, etc.) that takes me more than, say, half an hour.
Harvard − PH525x Data Analysis for Genomics
Another R / data mining course. It's deeper (and more focused) than the MIT one, so potentially more interesting (but harder). Besides, the instructor is the author of Bioconductor, the most widely used biological data analysis package for R, so it's nice to have him talk about this stuff (he's a good teacher too).
However, I don't really have the mental bandwidth to keep up with the course (on top of the other ones plus work). It's extremely unlikely I'll get a certificate or even listen to all the lectures; I haven't decided whether to drop the course altogether or stay registered and go back to the archived course whenever I feel ready for it.