Tuesday, November 11, 2014

MOOC status, November 2014

Hi.

Yeah, I know, I haven't kept this very well-updated. But let's see... The last post was just about two months ago. What's happened since then?

I've (successfully) finished the following courses:

  • Introduction to Systems Biology - Mount Sinai
  • Introductory Human Physiology - Duke
  • Statistical Inference - Johns Hopkins
  • Dinosaur paleobiology - U. Alberta (well, the course is still running, but I've done all the activities, so I'm done with it)
I've dropped the following courses:


  • Fundamentals of Neuroscience part 2 - Harvard
  • Musculoskeletal Anatomy - Harvard
  • Data Analysis and Statistical Inference - Duke
I have started the following courses (some of which weren't quite planned for):
  • Immunology part 2 - Rice
  • Astrophysics part 3 - ANU
  • Exploring Neural Data - Brown
  • Experimental methods in systems biology - Mount Sinai
  • Functional Programming - Delft
By and large, I won't be starting any other "Big MOOCs" this year. I am considering signing up for The Neuroscience of Vision from MIT, as that's a short, 4-week course. If it's too heavy-going, I can always drop it - I find I am increasingly doing that kind of thing: sign up to a course to give it a go, then drop it if it doesn't quite fit what I want, or if it really doesn't fit in my schedule.

All in all, that's 29 courses I've finished. 11 are biology / life sciences, 9 are statistics / data science, 5 are regular computer science, the rest are a smattering of economics, physics, humanities, etc. By the end of the year, barring disasters, I should have at least five more.

I'll do (maybe) detailed writeups of the courses I finished, so let's just do a quick recap of the ones I dropped:
  • Neuroscience from Harvard: this course is actually pretty good and has some stunning graphics. Unfortunately, the focus is very much on visuals, animations, etc. It certainly works for some. As for myself, I really can't find it in me to watch cartoons about house parties as a metaphor for the synapse.
  • Musculoskeletal Anatomy: I don't know what to think of this one. Either the course runners are incompetent and/or have lost interest, or something terrible has happened (like a disease, an accident, something). The first couple of weeks were pretty sleek, with professional-looking videos. It's gone downhill since then: the syllabus has been truncated (each week was initially supposed to finish with a wrap-up about the "case" under examination), the content is released late and only consists of pages of text, the quizzes have glaring mistakes that are not corrected, and all the professors and TAs have fled the forums (not a single post from a member of staff in over three weeks). It looks like they're scrambling to put up some content every week but are improvising with very limited resources. It's the first time I've dropped a course when it's almost over, but I can't find the motivation to keep going. It feels like standing on a sinking ship.
  • Data Analysis and Statistical Inference - I've dropped this one purely for scheduling reasons. It's a very good course, perhaps too introductory at times for me, but it broaches many subjects like ANOVA and such. The course is offered again next March, so I'll be taking it then.
As for the current ones:
  • Immunology is hard on memorization, but very interesting and the professor is great. We're onto T-cells now, pretty complicated stuff.
  • Astrophysics is a lot of fun. Dabbling in relativity and quantum mechanics without the hard-core maths. It's actually pretty relaxing.
  • Exploring Neural Data is a pretext for doing scientific computing in Python (instead of R, for instance). The lectures are engaging and the assignments are pretty thorough. Unfortunately, it's rather short: there's only a unit every other week, to accommodate students without a programming background, and so there are only 5 units altogether, each with an assignment that takes me, I don't know, a few hours to complete. So it stays pretty basic.
  • Experimental Methods in Systems Biology - a follow-up to the Introduction to Systems Biology class. It's the part I am the least interested in of all the Sys Bio courses, but it's understandably a requirement to take them all. Anyway, it's a description of the major technologies used in major biology labs today: Illumina sequencing, mass spectrometry, etc.
  • Functional Programming - I didn't plan on taking this one - I mean, functional programming is fun but I've already dabbled in it (and still do in a limited way, thanks to Java 8 streams). I simply chanced on a video of the professor, who is kind of a heavyweight in the field (used to be a principal scientist or something at Microsoft Research, author of a ton of papers, etc. - still an open source fan, as far as I can tell, despite having worked at MS) and decided to take it just for kicks.

Tuesday, September 16, 2014

MOOC status, September edition

It's been a while since I've blogged; basically I've been busy with a lot of MOOCs on top of a lot of work. So, what's up?

First, let's do the numbers thing. I've racked up something like 25 course certificates in just about a year. Of these, 7 are verified. If we break them down by general topic, we have:

  • statistics / data science: 9
  • biology / medicine / life sciences: 7
  • computer science / programming: 5
  • astrophysics: 2
  • economics: 2

(Of course, the categories are somewhat arbitrary: much of the data science thing could be classified as computer science; Introduction to Bioinformatics counts as CS but Quantitative Biology − which was about using Python and Matlab and R to analyse biological data − counts as biology. So don't take the count as an absolute, more as an indication of where I'm going.)

I'm currently embarked (with paid certificates and all) on two Coursera specializations: the Johns Hopkins Data Science one (which I'm halfway through already) and the Mount Sinai Systems Biology one (which I'm only three weeks into, out of about a year, all told). I guess I'll blog some more about these; generally speaking I'm finding the Data Science one pretty good once one gets to the meat of it (the introductory courses about R are… introductory, but the projects are okay; the more mathematical "statistical inference" course is quite good) and while the Systems Biology course has a very tough start, it gets easier once we've hit stride. It's quite advanced, which is what I'm looking for, and that's pretty satisfactory.

Anyway, I'm currently enrolled in the following courses:

  • Systems Biology
  • Introductory Human Physiology (a great backgrounder for sys.bio.)
  • Data analysis and statistical inference (to keep doing stats, but I'm really auditing only)
  • Dinosaur paleobiology
  • Fundamentals of Neuroscience part 2: Neurons and Networks
Also, I'm enrolled in JH's Statistical Inference, but it's a repeat from last month: then I didn't know if I could afford the time to complete this course so I did it "for play", without certification. This month I'm only redoing it with certification. So I did spend a few hours polishing up my course project, but that's really all.

And the future brings the following:

  • Immunology part 2
  • Astrophysics part 3: the violent universe
  • Anatomy
  • Neural data analysis
  • Another 4 modules of Data Science (out of 9)
  • Another 4 modules of Systems Biology (out of 5)
So… planning things out, we see a very hard last week of October (when some courses haven't quite finished and others have just started) with 10 courses altogether. But I guess I'll survive.

Thursday, August 28, 2014

Back to school (sort of)

So the rainiest month of the year (in Paris − not that that's usual, mind, and not that I'm complaining: I prefer wet Augusts to sweltering ones, to be sure) is drawing to a close. The academic year is starting, my former local sub-mayor is now education minister, and MOOCs are starting left and right. Not that they've stopped, really… Anyway, time for a sum-up:

Courses that are ending

Astro2, Exoplanets (Australian National University)

Well, Astro1 was great fun, Astro2 almost as much. I say “almost” not because the course itself is less good, but because by necessity it's a lot about the technology behind exoplanet discovery − not something I'm very interested in. Still, the staff at ANU made it a lot of fun, so there.

(It'll be my 15th edX certificate, 20th overall!)

The Emergence of Life (University of Illinois, Urbana-Champaign)

A big disappointment. Broadly, the course is supposed to be a quick run through the history of life as we know it (and not so much about its emergence, really, but at least that's up front). The problem is it's dumbed-down, inaccurate, and the lectures are quite confused. I'm sticking to it because there are chunks I don't know about (“spot-the-fossil” and the skeletal morphology criteria for classification, mostly), but more often than not I find I'm shaking my head. It's less like a class and more like a 

No certificates here, they're not free and definitely not worth paying for.

A bunch of Data Science courses (Johns Hopkins University)

I know I wrote I wouldn't tackle the Specialization… but I had second thoughts, so I registered for Signature track on the first four modules of the Specialization, plus auditing the Statistical Inference one (which I had heard many people complain about, saying it's hard and obtuse). Actually… once one does the projects and goes beyond the first week or so of each course, they get pretty good. The Statistical Inference course tries to run through a lot of unintuitive material in really too little time, but − after having done UC Berkeley's Introduction to Statistics course − I find it very interesting, very stimulating. I'm glad I'm only auditing it − this way when I take it “seriously” it'll be a review and hopefully by then I'll understand it better.

In any case, I expect four verified certificates to land in my pocket in the coming few weeks.

So if I'm counting right, I'm virtually the proud owner of something like 24 certificates. 20 obtained in 2014. Not bad… 


Upcoming plans

I've rearranged my planning a bit, ditching a number of accessory courses that I couldn't seriously fit alongside the rest. Still, next week is a busy one, with no fewer than 5 courses starting at the same time.

Exploring Neural Data, Brown

Data analytics + neurology. In Python. Cool.

Fundamentals of Neuroscience part 2, Harvard

More neuroscience! Actually I'm mostly taking this one because I took the first part. Not sure I'll keep both neuro courses (then again, at the time I thought little of it, but after a while I find I keep using the concepts of Neuro 1; so it's been a good use of my time.)

Dino 101, U. Alberta

I don't expect much from this one, a lightweight dino course to pass the time.

Introduction to systems biology, Mount Sinai

I tried this a while ago and dropped it after a week, thinking it too hard. Hopefully, I've learned a bit since then, and MIT's 7.QBWx rekindled my interest in systems biology.

Introductory Human Physiology, Duke

No, I don't want to be a doctor. But yes, physiology and anatomy are interesting. This promises to be a heavy-workload class; we'll see if I stick with it.

Astro3, The Violent Universe, ANU

I can't stop halfway through the series! Sadly, it starts in October and I may be too busy to give it my full attention. Hopefully things will pan out all right.

Fundamentals of Immunology part 2, Rice

Ditto. I did the first part, which was great (though hard work), the timing of this second part isn't so good, but we'll do as we can.

Next

That's September and October pretty much spoken for. With luck, I'll be able to sneak in a Data Science course in there... I still have five full courses to do in the Specialization (Statistical Inference, Regression Models, Reproducible Research, Building Data Products, Practical Machine Learning). I've pencilled in one in October and two each in November and December. This way I still have some leeway until the next capstone project (expected in February).

I have also registered for the Open University's Start Writing Fiction course. Not sure I'll stick with it, but a writing class is interesting to say the least (even though English isn't my first language).

Saturday, August 9, 2014

The Coursera-Johns Hopkins Data Science specialization

Quick recap: the department of Biostatistics at Johns Hopkins School of Public Health offers a full-on specialization in “Data Science” through Coursera, consisting of nine courses and a “capstone project”. The specialization certificate is supposed to testify that students are proficient in getting data, formatting it, graphing it, extracting useful knowledge from it, drawing and communicating conclusions from it, and so on. With an emphasis on using R, although the skills are supposed to be broadly applicable to other systems.

In detail, the sequence is made of nine courses:

  • The Data Scientist's Toolbox
  • R Programming
  • Getting and Cleaning Data
  • Exploratory Data Analysis
  • Reproducible Research
  • Statistical Inference
  • Regression Models
  • Practical Machine Learning
  • Developing Data Products
The courses are free, but if one shells out for them ($49 or 35€ depending on your currency zone of residence) one gains access to a capstone project and a specialization certificate.

I haven't yet made up my mind about doing the whole specialization, or simply taking the free courses. I have a handful of days to decide.

What's in it for me?

Well, I sort of know a lot of this stuff from before. Using git and github for collaboration is something I do daily, programming (though not in R) is my main living; plus of course I've taken MIT's The Analytics Edge (a very intense, hands-on, business-oriented course on using R for analytics) and UC Berkeley's Introduction to Statistics, so I know my way around most of the material.

I am therefore not in a “first time learner” situation − rather, the cursus is more about consolidating the knowledge I do have, formalizing it, and getting an overall certification to somehow “prove” my mastery of it (the acceptability of this proof − by potential employers and / or academics − remains to be assessed).

The courses themselves

So far − in one week! − I've taken five courses out of nine, and completed two. This may sound impressive, but it's not − as I said, I'm hardly a first-time learner.

The Data Scientist's Toolbox is a very short introduction to the overall specialization. The main point is to install RStudio and create a Github account. Doable (including the quizzes and project) in two hours, and generally dispensable (and part of the reason why I balk at doing the specialization − 35€ is very expensive for three clicks on a website).

R Programming gets a bad rap on review sites. I sort of understand why; it's a rather heavy-handed introduction to R, and I guess it's pretty incomprehensible to those who have never written a line of code in their lives and pretty abstract for most who have never toyed with R.

For me, as someone who has used R but never been formally introduced to it (like, I never figured that everything was a vector and I had difficulty wrapping my head around the difference between single and double square brackets, to say nothing of the scoping rules and the notion of “environments”), it was a nice crisp clarification of the essential concepts of the language. I guess having it as a prerequisite for the rest of the sequence isn't a great idea: really, one can use R without understanding it, and understanding is better approached after a degree of use. Laying it out like this is very bottom-up, very French I would say: first slog through abstract concepts then learn to apply them − I have come to prefer the other way around: build an intuition then consolidate the knowledge and learn how to go further. Anyway; I did the whole course in a day and gained a good understanding of how R works in the process, so that's no bad thing.

Getting and Cleaning Data is broadly a walkthrough of R's data gathering libraries. How to connect to a database, how to download a file, etc. It suffers from the lecture-then-exercise syndrome. I'm not sure how I would tackle this, really, except by drawing pictures of what the different input formats are, what the general target is (a clean data frame in R), pointers to the documentation and hand-holding exercises on real data rather than a slow demonstration of each function on made-up data. The Coursera platform is a hurdle there: similar endeavours were much easier on edX, where you can have long exercises with multiple questions, each with immediate feedback − I'm thinking of the very time-consuming exercise sets in The Analytics Edge, which had much, much better learning value than the simplistic, submit-all-at-once quizzes that Coursera provides.

That said, the course isn't very challenging but it's useful stuff to know. I'm more or less taking the course on schedule, with a bit of an advance (working up the energy to do the week 2 quiz).

Exploratory Data Analysis is, similarly, a walkthrough of R's graphics libraries, and has the same pros and cons as the previous course. Similarly, having already been exposed to most of it, I find the course a crisp recap of everything. I doubt I would enjoy it very much if it was the first time I was exposed to it.

Statistical Inference is the most-decried course of the sequence, so in order to decide whether to take the overall specialization or not, I registered for it. I understand its detractors: it's very fast, very abstract. Basically in four weeks, Prof. Caffo runs through the same curriculum as Prof. Adhikari did (albeit annoyingly slowly at times) in UC Berkeley's fifteen-week Introduction to Statistics, with the same issues as the rest of the sequence: it's quite technical and abstract, and rather difficult to connect to (though it's hard to be practical when explaining mathematical constructs). I guess I'll be referring to Adhikari's slides more than this course's native ones, but I don't really expect the course to be very challenging.

These were the five courses I've sampled. The rest are:

Reproducible Research is about communicating research by using R markdown and knitr to create live R-embedding documents. Interesting stuff generally, and I guess useful skills to have if one intends to do statistics professionally, but not burning enough that I prioritized the course, so I'll take it later. Four weeks seems kind of long to do that kind of thing.

Regression Models is the other mathematically-grounded course in the sequence, and the other stumbling block for students. I think I'll be okay with it, but realistically I can't sample it this month. We'll see in September or October.

Practical Machine Learning will likely be − again − a formalization of stuff I know from The Analytics Edge.

Developing Data Products sounds like another course about communicating results, half about “good practices” and half about using Shiny (R's own web framework − why is it that all languages must have their web development framework? Even Fortran has a CGI interface…)

The Capstone project seems to be about wrapping this up in a real-life situation.

So… why am I considering taking the whole shaboodle?

Based on the courses I've taken so far (about half), the sequence is pedagogically deficient − or rather, it's traditional in its approach of stuffing lots of science in the face of students then expecting them to go through with it. I expect they have a high dropout rate (even higher than the baseline MOOC rate). Comprehensive as it is, the sequence is ill-suited to people approaching the subject for the first time. The course page says there are no prerequisites in terms of analytics or programming, but I don't find it so: it's more a recap / advanced course than an introduction to the field.

In that way, it's broadly what I'm after. I don't feel like I know a subject until I've studied the theory a bit†. Since I am vaguely considering reinventing myself as a biostatistician or bioinformatician‡, or at least keeping my options open, it may be worthwhile investing a bit of time (and some euros) into it.

Oh well. It's a subjective arithmetic. If the course were stellar, I wouldn't hesitate long before paying. As it is, I dither.

Postscript

I slept (well, napped) on it. Re-reading myself, it's kind of obvious I'm not really interested in pursuing the specialization certificate; there are better ways to spend 350€ than to rehash mostly-known subjects.

I may decide otherwise at another time − all I need to do is retake the courses, which means a few days doing quizzes and projects again. Doubt it'll be difficult.

If I need some certification or other there are probably better ones around, starting with Duke's Data Analysis and Statistical Inference.


† “A bit” as in, I don't need to know how to prove a theorem to use it, but I need to know I am using a theorem rather than a pre-baked recipe I'm not comfortable improvising with.

‡  On the premise that it's more useful to the human race than general web-based development, and anyway the kids who've done a week of Node.js and therefore know everything there is worth knowing about computer science are taking the fun out of general programming.

Wednesday, August 6, 2014

Mopping up on Exoplanets

It's a bit unfair to say I'm “mopping up” − there are still two full weeks of the course to go, about direct imaging (at last!) and Earth-like planets − but it's clearly on the way out, and it's becoming possible to start looking back on the course.

This has been a surprisingly (or maybe not, I'm not at all a student of astronomy) technical, more than scientific, course. I mean, Paul Francis did whip out his tablet to perform some calculations, but they were fairly simple, by and large, much more so than the (already not very advanced) physics of The Greatest Mysteries of the Universe. Here, instead of big questions about gamma ray bursts and Type Ia supernovae, what we have is a celebration of the ingenuity of the engineers making possible something as staggeringly complex as detecting planets orbiting distant stars.

The engineer in me is happy − and it's true these are fantastic achievements.

The course itself follows the same format as Greatest Mysteries: every week has a topic (“radial velocities”, “gravitational microlensing”), which Paul Francis and Brian Schmidt discuss in a Socratic manner, which is an impressive way of saying they convey all the knowledge through dialogue, bouncing questions off each other. Schmidt takes something of a backseat here, often playing the naive novice who asks questions of Francis; maybe he's less comfortable with the topic than with cosmology (or maybe it's a subliminal message: you may have a Nobel Prize, you're still − always − in a position to receive wisdom from your peers). Both lecturers' enthusiasm (especially Francis') is still communicative. Besides the video lectures, we have each week a link to the papers discussed, a text summary of the lesson, a worked example, a graded problem (generally very easy) and a new episode of the Mystery.

Last time around, the mystery had us figure out a weird bouncing parallel universe. This time, we're still in a strange cosmos, but the issues are more technical: a red star seems on course to collide with the world, and we have to find a likely destination for the world's population. But of course, there's a twist…

I have to admit I haven't been as interested in this course's mystery as the last. Maybe it's the lack of bubbles, or maybe I'm just not very entranced by the nitty-gritty detail of surveying the sky, taking radial-velocity measurements, etc. I'll be happy to have the solution for the Mystery through the final exam, but I'm not really motivated enough to go beyond and investigate on my own.

That's perfectly all right. I'm not destined to be an astrophysicist (if I were, I guess I'd be more involved in finding a new haven for the Moggians), I'm there to have fun learning about stuff; and as far as fun is concerned, this course delivers.

Quick notes about The Emergence of Life

So we're in Week 4 of this U. Illinois course over at Coursera, which aims at reconstructing the history of life throughout geological time. Midway between “taxonomy for dummies” and “introduction to evolutionary biology”.

So far, it's… uneven. It's notable that the teaching staff are all geologists rather than biologists, so they're on their home ground when discussing fossil formation, perhaps less so when they're talking about molecular biology. In any case, I like the fossil-discussing segments, they're informative and help drive the geological time-scales into my head; plus I like weird beasts.

Where I'm less enthusiastic is that the lectures are disjointed and often imprecise (like mixing up the terms eukaryote, metazoan and multicellular organism − y'know, plants are multicellular organisms, but they're not metazoans; likewise, there are these things called saccharomyces, amoeba, giardia, etc.: not all eukaryotes are multicellular). Sometimes they'll use an inappropriate picture to illustrate what's being discussed (illustrating armored jawless fish with a toothy placoderm isn't a great idea!) There's little logic in how a segment connects to the ones before and after. It's a bit annoying that the clearest segments are the ones from the very young PhD student introducing taxonomy, while the segments from the official professor are somewhat confused (and confusing).

But I can live with that. Playing spot-the-fossil in the quizzes is fun.

Another thing I find upsetting is that the forums are basically drowned in two kinds of posts:

  • corrections for approximations made in the lectures
  • creationist crap (multiple threads discussing intelligent design, “global warming: fact or fiction”, etc.)
Huh. Okay. I'll just steer away from the forums, then.

That, plus the outright idolizing of Carl Woese (can't we grow up beyond the “great man single-handedly upsetting the establishment” type of narratives?) means the course isn't all it's meant to be… oh well. It's still something to do in an otherwise quiet summer.

(That said, I like the funky music and titles.)

Monday, August 4, 2014

Johns Hopkins' data science specialization, round two

Two days ago I noted I ran through the first course of JHSPH's Data Science specialization in a handful of hours.

In fact, yesterday I did the same for the second course in the series, R Programming. But this time I didn't feel “cheated” (although that's a strong word): I found the course easy as pie because I'm an experienced programmer and I've already used R quite a lot in MIT's The Analytics Edge; however, I lacked any formal(ish) introduction to the language from a computer scientist's point of view. It's not enough to know that you should type lm(x ~ y + z, data=mydata); I find it necessary to know that it's a functional language where the basic data type is the vector and where every function carries with it its own environment, with such-and-such scoping semantics.

Such an introduction needn't be long. But having it, I'm a lot more confident that I understand how R works, and therefore that I can use it correctly.
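
To make that concrete, here is a minimal sketch of the three ideas in plain base R. It's a toy example of my own rather than anything taken from the course, and the names `v`, `lst` and `make_counter` are made up for the illustration:

```r
# Everything is a vector: a "scalar" is just a vector of length 1.
x <- 42
length(x)            # 1
is.vector(x)         # TRUE

# Arithmetic is vectorised, so no explicit loop is needed.
v <- c(1, 2, 3)
v * 2                # 2 4 6

# Single brackets keep the container; double brackets extract the element.
lst <- list(a = 1:3, b = "text")
class(lst["a"])      # "list"
class(lst[["a"]])    # "integer"

# A function carries the environment it was defined in (lexical scoping).
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates `count` in the enclosing environment
    count
  }
}
counter <- make_counter()
counter()            # 1
counter()            # 2
```

The counter keeps its state between calls because the inner function holds on to the environment created by `make_counter()`, which is the kind of thing that stays invisible when one only copies model-fitting recipes.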

All this to say − yeah, I ran through the 4-week course in a day, but it doesn't mean it deserves its poor reviews.

Saturday, August 2, 2014

The Data Scientist's Toolbox - Fastest MOOC ever?

Since I have a rather quiet month of August, I browsed through the Coursera catalog, and found that all the courses from the Johns Hopkins Data Science specialization are repeated on a monthly basis; the next starting date is August 4th, which is next Monday.

So, cool. I signed up for “R Programming”, then figured I'd check the other ones out, and well, to give me a taste of the sequence (because well, maybe I'll want to do the whole thing) I clicked on the introductory course, “The Data Scientist's Toolbox” − and found that although the course doesn't officially start until Monday, all the content is already available. And so, I clicked through…

… and finished the course four hours later (having taken breaks to bathe, feed and put my toddler to bed, then put dinner in the oven…)

Well, it's introductory, I already know the tools concerned (Github? I use it, like, daily…), so there wasn't much challenge. The “course project” was basically to set up a Github account and install RStudio.

We'll see what happens with the R Programming course; if it's that easy… I don't know. Maybe I'll pony up (35€ per course; at a couple of hours a pop, that's much more expensive than the cinema) for the verified certificate if the following courses are good.

But in the meantime − it's pretty much the fastest MOOC I've ever done. Started and ended two days before it's due? I rock.

Wednesday, July 30, 2014

Finished: Genomics Medicine Gets Personal, Georgetown University

So, I clicked “submit” on the last question of the final exam a couple of hours ago, and I'm satisfied to note I have an overall 92% grade. But was it a good course?

Ooh, let's rewind a bit. This course is supposed to be many things, an introduction to the world of genomic, or precision, medicine for the layman as well as for medical students. In eight short weeks, we get maybe a dozen different people giving short lectures about their particular field, all in relation with genomic medicine. The lectures are arranged in four themes, which are broadly the clinic, the lab, business, and ethics. Dr Haddad, the main instructor, introduces everybody and conducts Q&A sessions, in order for the course not to feel too disjointed.

So, a success? Partly. The production is good, although maybe they've overdone it a bit. Before each video segment we have a full page of text explaining the pedagogical objectives of the segment. Big instructions in bold remind us to use the navigation bar to, erm, navigate the course. And so on. It feels somewhat dumbed-down.

The course content is, obviously, varied. Generally I liked the clinic theme and was bored with the other three (the lab one was very introductory). Although they emphasize that medical students may be watching the videos, in truth − I guess medical students have better things to do. The overall level of the course is very, very basic. Apart from a tidbit here and there (I wasn't aware of fluorescent in-situ hybridization), I can't say I have learned much during the course, and only stuck with it because it's summer and I had nothing better to do.

Hmm… I may sound too harsh. Let's say that if you approach this course as you would, say, one from MIT, then you'll be disappointed. Which is not to say there isn't merit; the team have made a good attempt at surveying the landscape of genomic medicine, primarily from doctors' points of view, and deliver something akin to an Internet-era pop-science book, and a pretty decent one at that.

So, would I recommend this course? Not generally, not to people with a utilitarian view of MOOCs, who measure success in skills and knowledge acquisition. But to specific people, maybe to people who're simply curious, who've heard about sequencing the genome, maybe about the recent judicial issues surrounding 23andMe, and who are willing to spend a couple of hours a week to find out more about it, yeah. Definitely.

Saturday, July 26, 2014

Midyear MOOC review

We're about halfway through the year - or at least, we're halfway through the summer holidays in the northern hemisphere, which in academic terms means we're at the midpoint of the year although we're 7/12th of the way to Christmas. Anyway; now's a good time for a review.

2013 was my first contact with MOOCland; 2014 is my first year of going full blast with Education 2.0. So far I've completed 14 courses this year, only one of them a carryover from 2013. I'm registered for an additional 13, and while I am almost certain to drop some, I'm fairly certain I won't stop there. As a comparison, last year I only completed 4 courses (6 if counting the two I audited); I've certainly gained some momentum there.

That's not the only difference. Not only did I get fully into this MOOC thing, I also have moved a bit from learning for learning's sake, and started focusing on a sort of pathway that may lead eventually to my career changing tracks. Which is funny, in a way, because I got onto the bandwagon to learn some economics theory, something that's never going to feature significantly in my work. Rather, I'm now gathering most of my efforts towards building a scientific curriculum centered around the general theme of computational biology. While my sense of fun is still my main driver in selecting courses, I've also come to pick up subjects in a utilitarian fashion, for instance taking all three Statistics courses from UC Berkeley (and the Analytics Edge from MIT) because I felt the need to brush up on the subject in order to better tackle the biology field.

Anyway; a half-year in review.

JANUARY saw the end of Harvard's slightly overproduced MCB80x - Fundamentals of Neuroscience. In retrospect, the course was better than what it felt at the time; I only wish there had been a quicker pace to it, and less time spent on cartoons and more on "test yourself" type exercises. (The instructor hates testing and grading − for reasons I can empathize with, while not really agreeing with − so this course has some virtual labs and a final exam, and that's it. No intermediate assignments, homework, etc. to fix things into the students' minds. Combined with a short-chapter-every-fortnight pace, that makes for a course that's very low-intensity and so, hard to commit to memory.)

It was also the start of CalTech's Principles of Microeconomics with Calculus, a very intensive course, very challenging, but quite rewarding too. I took it because, hell, I got into online ed to learn some econ, I wanted to swallow the whole pill. While the course ended up convincing me I didn't want to study economics for a livelihood (fat chance of me ending up that way even if I wanted to anyway) it managed to burn some basic economic concepts (supply, demand, monopoly, oligopoly, externalities, Pigouvian taxation, optimality, etc.) into my mind, which proves invaluable to my world-view generally. I haven't taken more economics courses since; when I get the motivation for it, there's the archived development economics course from MIT's Duflo and Banerjee (both superstar economists in their own right) waiting for me.

In MARCH I started doing some serious MOOCing. It was the start of Berkeley's Stats course (which is both great and too easy − the contrast with their Californian neighbours at CalTech was terrible: CalTech's Rangel would have gone through the whole 15-week stats curriculum in, like, 5 weeks), that I took because I knew I was too ignorant in the way of statisticians for my own good; at the same time I started Rice's Fundamentals of Immunology (to keep doing some biology) and MIT's Analytics Edge (a very nice hands-on, reality-based complement to Berkeley's abstract, theoretical stats).

I also picked up MongoDB's DBA course at that time. It's a MOOC, running the edX software and produced in a very similar way to any of MIT's or Harvard's, but not offered by a university and very much focused on practical skills on a specific product, so I don't know if it really counts. I find it sociologically interesting though: the folks taking the course (or at least, those that were vocal on the forums) are very different from the MOOC-taking population on Coursera or edX. Let's be euphemistic and say it's a different mindset (in less charitable terms, a lot of people are only there to get a certificate to stick on their CVs and couldn't care less about the subject matter, or are plain stupid, or are whiny kids. Or all three at the same time.) Still, I admire the instructors' patience and I did learn a lot of useful stuff; never mind the forums.

Still in March, I started Copenhagen's Diabetes Challenge course (because my 4-year-old son has Type I diabetes and I'm very interested in the bio/healthcare sciences), which proved uneven but pretty interesting, Peking University's Bioinformatics course (which suffered from being dubbed in English by non-native speakers; it's awful to say that, but the delivery really hurt. Another big, big problem with that course is, well, honestly, how can you have an algorithmics course where you don't write a single line of code?) and ANU's first Astrophysics module (for no better reason than to get bragging rights, as in: "I studied under a Nobel prize winner"), which proved a blast (no pun on bioinformatics intended).

That was pretty much it (6 or 7 concurrent courses are pretty much my absolute limit) until the end of APRIL, when I started the eagerly-awaited Epigenetics course from Melbourne University, which picked up pretty much where MIT's 7.00x had left off, genetics-wise, and was very, very good. Surprisingly difficult, because I don't know how to read scientific papers, really; but very stimulating. Around that time, I tried Mount Sinai's Introduction to Systems Biology and Harvard's Data Analysis for Genomics, but dropped both: they were too advanced, I was too tired. Instead, I refocused on finishing what I'd started (picking up Stat 2.2x 3/5th of the way through, and still managing an overall 65%) and cruised at 5-6 simultaneous courses until MAY, when a lot of courses ended. In retrospect, it's been my most fruitful month ever, as I picked up certificates for Analytics, Diabetes, Astrophysics, and Statistics (part 2).

May has also been the month of my biggest disappointment in MOOCland: while MIT's courses have generally been head and shoulders above the rest of the crowd, the Social Physics "buy-my-book" ad was a downright scam. I still don't understand how or why MIT and edX have let this pass, but hey, let's not take it against them and, well, be more selective in the future. It's a good reminder that the best can rub shoulders with the worst, I suppose, and that we should not take quality for granted.

On a similar note, I've been mostly an MIT/edX fanboy, because that's how I got into this MOOC thing, you know? But while I still prefer the edX platform overall (because of the more linear flow and richer interactive grading options, and despite the rubbish forums) I've gotten somewhat neutral. There are some great courses on Coursera too (starting with Melbourne's Epigenetics), and while the overall system feels more rigid, it's also less prone to bugs and delivers a consistently good experience − and the courses, well, they're often from less prestigious universities and tend to have less whizz-bang than the big kids at MIT-Harvard-Berkeley, but they're pretty good nonetheless. One just has to be more selective − and not hesitate to trial courses and drop the ones that don't fit.

JUNE has seen the end of Epigenetics and the start of a small bunch of courses: Georgetown's Genomic Medicine (which I feel ambivalent about: it's more an outreach program than an actual course. There are good things there, but no deep science − it's more of an extended documentary about the impact of genomic technology on the practice of medicine today. If nothing else, it reminded me of the difference between biological science and medicine, a difference that I didn't perceive fully twenty years ago, the reason why I opted for maths/physics rather than biology in my formal education), MIT's 7.QBW (a great, though frustrating, glimpse of what computational biology can be, which motivated me to try again Mt Sinai's Intro to Systems Biology in September), ANU's second Astrophysics course, this time about Exoplanets (more because it's fun and stimulating than because the co-instructor has a Nobel, this time), and the last part of Berkeley's Stats program.

At the end of JULY, 7.QBW is finished, Genomic Medicine is in its dying throes, Exoplanets is rolling along, and I've just started U. Illinois' Emergence of Life, which feels… I don't know, haphazard? Anyway, it's as good an introduction to evolutionary biology as I'll get in the summer − MOOCs are going slow until September.

So… What next? Well, I'm registered for a whole bunch of courses. First, I'm trying out the UK's own FutureLearn platform to learn about the Scottish independence referendum (my grandfather was Scottish, so I feel kind of romantically attached to the land of Ayes and Scotch, although I've only ever been there as a tourist). The course straddles the referendum itself, from Aug 25th to the end of September, so we'll get to learn about the issues then about the aftermath (although it's pretty clear the No will win). I don't expect this course to take up much of my time.

Much more seriously, come SEPTEMBER I'll be taking:

  • Data Analysis and Inference from Duke University - a more practical (with R) overview of statistics. Not very high on the priority list, more like a way to keep the stats knowledge warm.
  • Physiology from Duke University also - something I intend to put quite a lot of hours into. Medicine without the practice-of-medicine angle, just the thing for me. (For the summer I've downloaded a 1500-page textbook on Physiology from OpenStax, in order to get a heads-up).
  • Exploring Neural Data from Brown. A follow-up on MIT's Quantitative Biology, this time focused on neurology and Python. I have high hopes for this one.
  • Introduction to Systems Biology from Mount Sinai. Hopefully this time I'll be able to follow the professor along; I'm still convinced the subject is very interesting, despite the professor being, hmm, clearly a much better scientist than teacher.
  • Explore Statistics with R, from Karolinska Institutet. Just because I want to hear Swedish accents again − and okay, because it claims to teach where to get good healthcare-related data too. I guess I'll skip the parts about learning R.
  • Introduction to Dinosaur Paleobiology (Dino101) from Alberta University. I'm a 37-year-old kid, so what? Anyway, everybody says the course is enjoyable but low-intensity, which is fine with me, what with all the other courses at the same time.

In OCTOBER most of these courses will still be running (all except Scottish Independence and Explore Stats) but I'll still be starting Delft's Engineering for Bio-based products, because I feel it's my continental duty to pick up some European courses (what do you mean, I'm not convincing?) and because it's somewhat intriguing. Also, by the end of the month starts the second part of Rice's Fundamentals of Immunology, that I'm pretty committed to seeing through (I've done the first part and I hate leaving things halfway done).

Sometime in Q3 2014 (whatever that means) starts Harvard's Musculoskeletal Anatomy − a good complement to Duke's Physiology, with virtual labs (including dissections). Also, if I see the Systems Biology through, then I'll probably embark on Experimental Methods in Systems Biology and the rest of the specialization.

So, that promises to be quite an eventful second half of the year. We'll see how it turns out!

On a side note, I'm starting to wonder about turning all this knowledge into an actual degree and a career change. There seem to be options with the CNAM (France's continuing education university-like institution, aimed at working professionals) so I'll be contacting them in September to study things through. Paradoxically enough, if I do end up taking evening classes there, it'll be because of MOOCs − but it'll also mean I can't take MOOCs anymore for time reasons. Oh well, that's all very hypothetical. We'll see!

Monday, July 21, 2014

Quantitative biology workshop, MIT

MIT's 7.QBW's winding down. I'm not sure what to think about it.

On the one hand, it's nice to get to grips with some biological problems with the tools actual computational biologists use. As usual, MIT do things seriously, the syllabus is impressive, the lectures are great and stimulating, etc.

On the other hand, it's very frustrating − we only get to touch lightly on some very basic concepts. I can't really say I've learned anything of substance; most of the “workshops” have been, by necessity, hobbled by the idea that primarily biologists would take the course, with little to no competence in computer programming. And so, to review a number of languages and tools in a very short time, everything is kept very basic, very introductory. Only once so far was there a really interesting problem (the MATLAB one on neurological analysis).

A quick trawl through the forums shows that basically there are two populations − the biologists who find the programming assignments very hard, and the programmers who breeze through the course and are frustrated not to do more biology.

I guess the course's ambition makes it a bit of a tightrope exercise; maybe the MOOC format isn't so well suited to that kind of course? I guess people would have found it more satisfying if the course had been either an applied programming crash course for biologists, or a biology application aimed at software people. Then, by excluding one half of the prospective population, they could have gone deeply enough to teach actual skills and knowledge to the other half.

On the third hand, by applying standard expectations, we are reading this wrong. It's not a “course” and it's not meant to teach skills'n'knowledge. It's a workshop, based on an outreach program. It's aimed at giving people a glimpse of what can be done in the field of computational biology − and in that respect, I'd say it's pretty well hit the nail on the head.

But then again, it's pretty frustrating to get a glimpse of something cool and not have any way to reach towards it. If MIT created a systems biology course or sequence of courses, then this workshop would be the best possible introduction to it. On its own, it's kinda… unfinished.

PS oh, yeah, of course, I'm going to ace the course. But for the reasons explained above, I have no real merit in it.

Update 2014-08-02: yay, here's my certificate:
(Yes, I shelled out for the Verified cert, because it goes some way towards contributing to edX and MIT, and mostly because there are rumours that MIT Biology are considering creating an XSeries − then having verified certs in likely courses is a good way to get a leg up.)

Sunday, July 6, 2014

Summer is coming

Traditionally, summertime is break-time; students go home or travel to see the world or do summer jobs, that kind of thing. Traditions are thick-skinned and find themselves replicated in the world of MOOCs, where they don't make much sense (apart from letting staff take summer break too, I suppose).

So, let's round up:


  • Introduction to Statistics is finished. I eventually got 89% in the first course, 65% in the second course (that I started in the last-but-one week and caught up), and 96% in the final course on Inference. That doesn't make a statistician of me, but I feel armed to use simple statistics to read the methods part of papers, for instance, which was approximately my ambition in starting the cycle. Overall, maybe the 15 weeks of the course could have been condensed into 10 or even fewer; and maybe doing one big course instead of three small ones would've been better. But I'm not really complaining.
  • I'm down to three MOOCs now: Exoplanets, Genomic Medicine, and Quantitative Biology.
  • Exoplanets (Astrophysics course number 2 from ANU) is great, I love the enthusiasm of the professors and the mysterious planet Mog.
  • Genomic Medicine Gets Personal from Georgetown University is… strange. It's got loads of hand-holding, each video lecture is accompanied by reams of annoying text telling us what the pedagogical objectives are, etc.; the difficulty level is also very low and I can't really say I've been learning much actual science so far. But on the other hand, it's quite interesting to hear doctors explain case studies, and relate how technological progress really impacts their job.
  • Quantitative Biology Workshop is great, and has rekindled my interest in systems biology. The course suffers from its format: it's nearly impossible to do more than simple introductory work in 6 weeks while covering so many different techniques and tools. But still: after this week's visual neuroscience exercise, I was quite happy to have achieved it, and starting to like MATLAB. I really hope that the hints dropped that MIT are considering a Biology XSeries turn out to be correct: so far their courses have been stellar.


Next week it's just these three MOOCs, then (just in time for my one-week holiday in Wales) I'm starting a Coursera MOOC on The Emergence of Life which sounds great. That's the whole of the summer pretty much accounted for; the serious stuff begins again in September, with Scottish Independence, Data Analysis and Inference, Physiology, Neural Data Analysis, Dino 101, Explore Statistics with R all starting together − and I'm reconsidering taking Systems Biology at Mount Sinai too (with 7.QBW under my belt, it should be easier). I won't be able to do all of them justice, and some will have to go: I think KI's R course is currently the most likely to be dropped, but it's not likely to be enough. I think either one of Data Analysis and Neural Data, or one of Physiology and Systems Biology, or both, will have to go. But we'll see; there's no harm (or indeed, cost) in trying all of them for a week or so, then picking the ones I like most.

Wednesday, July 2, 2014

Human sacrifice as a scientific technique

Actual test-yourself question from Astrophysics 2 - Exoplanets:


Which only proves the vital necessity for observatories of procuring a regular supply of fresh grad students.

That said, it's a nice illustration of why I like this course: there's serious stuff in there, but they don't forget their sense of humour. Another question asked whether the vibrations caused by nearby kangaroos jumping up and down might cause blurred spectral lines…

Friday, June 27, 2014

Planning update

So, I've updated the Completed, Current, and Upcoming courses pages.

Not much to say, except it's noteworthy that my upcoming planning is heavily Courserized. My beginnings in the MOOC world were through edX, and I still prefer its linear approach (resources are arranged in chronological order, while on Coursera they're arranged by resource type) and richer exercise types (formula entry, molecule editors, etc. go well beyond Coursera's simple quizzes and peer-reviewed essays). But while the first courses I've taken from Coursera have been a bit disappointing, I've taken a couple of great ones since, and after a while I've warmed up to the platform (and its much, much better forum software). Although in reality, what's important is the vastness and breadth of the course catalog: Coursera's is what, five or six times larger than edX's? So, being interested in a somewhat limited range of topics (bio, data analysis, comp. sci, with occasional forays into economics) I'm finding that I've, well, “exhausted the edX catalog” isn't the right term, but conveys the right impression − what I mean is that there are many more courses on Coursera that I'm interested in, and that's purely an artefact of numbers.

Finished: Epigenetic Control of Gene Expression

I'll write something later, but just a quick note: I passed Epigenetic Control of Gene Expression from Melbourne University through Coursera. I guess it's the highest-level biology course I've taken so far (the professor says it's about either final-year undergraduate or first-year graduate). My final score should be 97% or thereabouts, so I guess that means I mastered the content pretty well.

Saturday, June 21, 2014

So, I…


  • … upgraded to Verified student in MIT's Quantitative Biology Workshop (hints have been dropped that MIT is considering a Biology XSeries, and that taking verified certs now means getting a head start),
  • … registered for TU Delft's Technology for Bio-Based Products,
  • … and for the University of Illinois' Emergence of Life,
  • … and for Duke's Introductory Human Physiology,
  • … and for the University of Alberta's Dinosaur Paleobiology.
And so, my schedule is full until November, approximately.

Friday, June 20, 2014

Why I dislike "guided discussion"

Some course designers feel a need to ask students to discuss course-related topics. For instance, at the end of a segment, they'll write “What do you think about genetics? Please discuss this in the forums.”

This kind of open-ended discussion works fine in smaller, physical classrooms, where only one person may speak at a time and everyone must listen to everything that is being said. In such situations, a real discussion may happen and insights may percolate.

In online classes of hundreds of thousands, you get…

… noise. Hundreds of single-message threads, never leaving the realm of platitude.

Course designers: please, please, please do not give in to the temptation of using “guided discussion”. Given the present state of technology, it only renders the whole course forum useless.

Tuesday, June 17, 2014

I can MongoDB

And so I passed MongoDB's Advanced Operations and Deployment course. My score is 100%, but in fairness it doesn't mean much: it's one of these courses where either you get it, or you don't; there's little place for a middle ground.

Still, it's nice to ace a course.


Sunday, June 15, 2014

On the radar

Here are a few courses I've noticed but haven't committed to yet (and couldn't possibly commit to all):

edX




Behavioural Medicine: a key to better health from the Karolinska Institute

English Grammar and Style from the University of Queensland


Transforming Business, Society and Self from MIT (because this sounds weird yet comes from MIT)

Coursera

Programmed cell death from LMU München

Emergence of Life from U. Illinois

Introductory human physiology from Duke University

Introduction to Systems Biology from Mount Sinai (I tried that one earlier in the year, and dropped it because I was too tired, too busy, and the course was too demanding. I am still interested in the whole Systems Biology specialization, but am more than a bit put off by the, erm, distinctive teaching style of this first course.)

Dinosaur Paleobiology from U. Alberta (yeah, in my heart I'm still a little boy with a fascination for ancient beasts reptiles creatures non-avian dinosaurs.)

Drug discovery, development and commercialization from UC San Diego

Social Network Analysis from U. Michigan (which uses Logo! Turtles all the way!!)

Climate Change from the University of Melbourne

FutureLearn

Hadrian's Wall: Life on the Roman Frontier from Newcastle University (where else?)




I will not, obviously, take all of them (the number of hours in a given week being, sadly, of limited elasticity). The ones I'm not likely to take, right now, are Biobased products at Delft and Emergence of Life at UI. I'll probably take one or the other of Physiology and Dinosaurs, too.

Saturday, June 14, 2014

End-of-course blues

So… I'm in the final throes of two courses now: Epigenetic Control of Gene Expression from Melbourne University (Coursera), and MongoDB Advanced Deployment and Operation (MongoDB). In addition, ASTRO1 from ANU is also officially closing in a few days − I already have my final score for ASTRO1, in contrast to the other two, and anyway ASTRO2 is starting in a couple of weeks.

I always feel a bit bluesy when a course ends. I guess that's because, well, it's the end: all of the time and effort you've invested in the course have borne their primary fruit (some additional knowledge or skills, maybe a bit of paper to prove it), but − unlike in a “real” course − you don't go on hanging out with classmates, you won't be bumping into the professors in hallways, etc.; finishing each course is in that respect more like graduating: the end of something.

Then of course, when I've made time for a 7, 10, or 15-week course in my schedule, when the course ends, there's a hole to fill in − things can feel a bit empty at first.

So, onwards! To recap, in the first half of this year I've completed the following courses:

  • Economics and Calculus at Caltech
  • Statistics part 1 and 2 at Berkeley
  • Immunology part 1 at Rice
  • Analytics at MIT
  • Bioinformatics at Peking University
  • Astrophysics part 1 at ANU
  • Diabetology at Copenhagen University
  • Epigenetics at Melbourne University
  • MongoDB administration (both courses) at MongoDB
(Okay, so Epigenetics isn't quite over − I still have to grade my peers' assignments − and it is possible that I utterly failed the final essay and therefore the course; but that's very unlikely.)

Besides, I am now enrolled in the following:
  • Statistics part 3 at Berkeley
  • Genomic Medicine at Georgetown
  • Quantitative Biology at MIT
I may drop the Genomic Medicine course − it's interesting, but it feels way too basic, especially after the rather challenging Diabetes and Epigenetics courses. Of course, I now have a bit of time on my hands, so I may stick with it just to keep busy.

Later on, I have pencilled in a few additional courses: ASTRO2 (Exoplanets) from ANU, two analytics/statistics courses (one focusing on healthcare data from the Karolinska, and a more general one from Duke), a couple of medical courses (Immunology Part 2 from Rice, and Anatomy from Harvard), and a discussion about Scottish Independence from the University of Edinburgh (obviously centered on the referendum date so as to discuss the issues as well as the aftermath).

So − that's eight more courses I'm either currently doing or scheduled to take before the year's out, bringing the grand total of 2014 to a whopping 19. I am quite certain I'll find an additional course to take it up to 20, which when you add the 5 certificates I got in 2013 makes for a not completely awful curriculum.

Wednesday, June 11, 2014

Worth Sticking To? Genomic Medicine Gets Personal, Georgetown University

We're now in Week Two of this course, and getting to grips with the actual content. As a reminder, this course from Georgetown University is about the transformative aspects of “genomics” as a whole with regards to the theory and practice of medicine.

In other words, while I've been studying basic biology, bioinformatics, epigenetics, etc., from a purely intellectual point of view, this course takes a different approach, grounded in the experiences of doctors counseling actual patients. Which is all well and good, only… well, the actual science is too basic. I am not certain I can actually be bothered to listen to people explain that there are 22 pairs of autosomes, or what a mutation is. So while the point of view is certainly interesting, at this stage I am not quite certain it is enough to keep me interested. We'll see; but in the meantime, this course is in the uncomfortable position of the one I am most likely to drop.

Starting: 7.QBWx Quantitative Biology Workshop

Let's see… 7.00x Introduction to Biology got me hooked on to MOOCs. I've been assiduously taking courses in biology / medicine ever since, as well as computing courses and bioinformatics ones. So, when 7.QBWx was announced − a hands-on, workshop-oriented, introduction to the realm of “quantitative biology” (i.e. bridging the gap between data scientists and biologists), from the great people at MIT (were it not for the unfortunate Social Physics Buy-My-Book-Please-But-Let's-Pretend-It's-A-Course, I'd believe that MIT is the place where the exceptionally deserving go to when they die) − well, what could I do − except end this rambling sentence?

While the official start date was yesterday, the course hasn't really started − we've only been given a handful of introductory tasks to complete, basically checking that we could install all the necessary software on our computers. The real stuff begins next week. In the meantime, my freshly-updated Linux Mint is now equipped with R, Canopy Python (on top of system Python), Octave, and PyMOL.

I am slightly apprehensive: how can a 6-week workshop covering so many different tools and languages actually achieve any significant learning outcomes? The only way I can see is to make the course pretty hard − I'm not worried: by now I have a decent enough biology background, and of course I've been a professional programmer for 15 years, so I doubt I'll struggle in this course − which is my worry, really: if I don't struggle, how am I going to actually learn stuff?

Well, we'll see. I am fairly confident the great people at MIT (where the exceptionally deserving, etc.) have assembled a challenging, great course, and am eager to see what it comes to.

On a side note, a staff member (pseudonymed TurtlesAllTheWayDown, which − as a long-time Pratchett fan − I find absolutely wonderful) has admitted that an MIT Biology XSeries is at least being discussed. I am incredibly excited about this − and will therefore probably upgrade to Verified Certificate after next week, if it appears the course is really worthwhile, so as to make it count towards the future, possible XSeries.

Monday, June 9, 2014

Reading scientific papers

The ubiquitin multi-motif protein UHRF1 is a central player in targeting repressive chromatin marks. It contains a SRA domain, which binds to hemimethylated DNA, a Tudor domain binding to methylated H3 (H3K9me3) and a PHD finger interacting with an unmodified arginine residue within H3 (H3R2) [72–74]. Furthermore, UHRF1 interacts with DNMTs, G9a and HDAC1 and thereby unites various enzymes that can provide a repressive chromatin environment [75–77]. Interestingly, UHRF1 also recruits the H2AK5 actetyltransferase TiP60 thus integrating a multitude of different epigenetic signals [78]. A further example for the link between DNA methylation and histone modifications represent methyl C binding proteins such as MeCP2, which interact with co-repressor complexes including HDACs and HMTs [79,80]. Interestingly, a recent report shows that components of the piRNA pathway are required to target de novo DNA methylation to an imprinted region of the mouse genome implicating that selective methylation of imprinted regions can be regulated by non-coding piRNAs [81].

Imagine 21 pages of this. Okay, 9, if you discount the bibliography. That's the review article we're supposed to read (and understand) for the final part of the Epigenetic Control of Gene Expression MOOC at Melbourne University.

Well… I'm finding it a lot of work to read the text, parse the sentences, expand the abbreviations, and finally integrate it into a whole. I suppose it's mostly due to a lack of training − if I were to read that kind of paper on a daily basis, I guess I would acquire some paper-reading skills. That, or my brain would overheat and melt (and believe me, it's embarrassing to have to mop up your own liquefied brain matter that's dripped through your ear ducts to the carpet.)

I guess I'll have deserved that certificate.

Saturday, May 31, 2014

Pro training vs academical MOOCs

I've touched on it slightly before (mostly in off-the-cuff remarks about how Udacity seems to reorient itself towards professional training); I find it rather fascinating how different “academical” MOOCs feel from “professional” ones.

Let's introduce our terms, first. I'm using “academical” to qualify courses given by a teaching institution, with an outward goal of teaching fundamental concepts and practices, sometimes through the use of a given product. “Professional” courses are provided by companies and aim at giving immediately useable skills focused on a specific product. For instance: “Introduction to Biology, the Secret of Life” from MIT is an academical course; “Introduction to Hadoop and MapReduce” by Cloudera through Udacity is professional training.

Sometimes the distinction is a bit blurry, as some “academical” courses steer very close to “pro” − I'm thinking of UC Berkeley's “Software as a Service” course, which could very well be renamed “Introduction to Ruby on Rails”. That's not a bad thing in itself: I am definitely not arguing that “professional” training is somehow “lesser than” academical courses. Keeping oneself updated on specific products and techniques is very important in intellectual professions.

I am currently following a “professional” course from MongoDB, Inc.: “Advanced deployment and operations”, focused − obviously − on the intricacies of deploying and administering MongoDB instances. This course uses the OpenEdX platform, in a more or less classical way: video lectures, quick questions, then homework. Some of the homework uses locally-installed software that evaluates hands-on manipulations, and that's one of the great things about this course. Overall the course is great, the lectures are clear and detailed, and indeed quite advanced.

And then, there are the discussion forums. To put it bluntly… a lot of “pro” training students are, I don't know, hopeless? How can they hope to complete an advanced course on database operations when they obviously can't be bothered to do more than cut-and-paste commands without trying to understand what they're doing − and then go whine on the forums? This week I've been trying to help out − after all, not everybody who's taking the course is supposed to be proficient at using Linux (which is the platform used for the homework). But I find that a lot of people don't even understand the concepts of host names, TCP ports, or the difference between one and three. (Yes, the numbers.) And then, when things don't work, they post angry comments on the forums following the lines of “the instructions are crappy, I followed them and it gives me an error!”

I really admire the patience of the TAs.

What I find slightly disquieting is that, through help obtained on the forums (it seems the “do not post full answers” / “do not ask for homework answers” rules aren't really enforced), these people will complete the course, and they will get the certificate, and they will be hopeless nonetheless. I guess I should be happy, this means I'll have no problem finding employment fixing the obvious mistakes others have made… but seriously, it means that people will be handed the keys to big databases when they really shouldn't be allowed anywhere near a root account.

Anyway. Rant off.

What I find interesting is that, by and large, we don't get the same clueless types in academical MOOCs. Oh, we do get a handful of 15-year-olds with more enthusiasm than understanding of the underlying ideas, and we do get people complaining about homework that's too difficult or “unfair”, but not nearly as many, and they are not nearly so obnoxious. And that, indeed, also applies to borderline courses − I just went back to check the forums for the Berkeley Rails course: the overall level is much higher. The course homework was to be done on a Linux virtual machine: there were overall very few issues with using that.

Now why is that? Why does a course titled “MongoDB advanced deployment and operations”, explicitly aimed at people with significant systems experience, attract so many people who lack the very basics, while a course called “Software as a Service”, aimed at undergraduates, mostly gets clued-up students? I'm guessing that, in essence, a “pro training” course, being more concrete, attracts students a priori less at ease with more abstract topics. Also, pro training is more immediately useful: it is something that can be boasted about on a resume. I guess the employment value of “I completed both modules of MongoDB DBA courses” is greater than “I completed UC Berkeley's overview of building software as a service” (although, having done both courses, I'd rank them equally − actually, the Berkeley homework being harder, I'd prefer candidates with that on their CVs. But most employers will not have done all the courses.)

Of course, the a priori less technical MOOCs also get their share of less qualified people. Based on the discussion forums, I'd say a number of students who took the Copenhagen Diabetes course weren't very well armed to deal with that kind of advanced matter − but still, while I didn't use the forums as much, I didn't find them plagued with so many whines and complaints. I guess it's because the people there had a real willingness to learn, as opposed to racking up a certificate to boost their career prospects.

Or something. I don't know, really.

I'll just avoid the MongoDB course's forums, I guess.

Thursday, May 29, 2014

More certificates!

Yesterday I collected my certificates / statements of accomplishment for 15.071x The Analytics Edge, and Diabetes: A Global Challenge.





So that's a round 12 certificates obtained, with two more on the way (Astro and Probabilities). Yay, I guess (says the guy in a mock-blasé tone of voice.)

Notes from the trenches: Epigenetics at Melbourne

[ed: it's been a while since I've posted, mostly because not much happened.]

While it's officially Week 5 in a 7-week course, I'm actually well into Week 6 (weekly content is released a week in advance), with only one quiz to take − I expect to do it later today − then it's down to the dreaded peer-reviewed essay.

All this to say that I feel capable of discussing the course.

(to digress, there's something I thought about yesterday: MOOCs are supposed to be “communities” for students, but they are transient − lasting only a handful of weeks. For the same reason it's hard to be able to discuss a course in general terms based on only a couple of weeks' content, it's difficult to get to know people − fellow students − in such a short time. I wonder if multiple-course sequences, like the ASTROx year-long series at ANU, will actually foster such a “community”.)

Anyway − epigenetics is the study of the process through which genes are stably turned on or off. The keyword here is “stably” − while a given cell will express genes differently at various times (for instance insulin secretion is turned up when glucose enters the pancreatic beta-cells), this is not an epigenetic change; rather, some genes are permanently turned off (or on) for the whole life of the cell, barring exceptional circumstances, and this state is preserved when the cell multiplies.

Something which the course makes very clear is that this is a field under active research. In other words, there is very little that we know for certain − some processes are well-understood, but most are not, and a large part of what is thought is “very controversial”, that is to say, researchers disagree strongly on how the mechanisms work, and even on whether they actually exist in the first place. So that's pretty exciting, if somewhat confusing: one doesn't quite expect to walk into a classroom and be told “okay, so we think there is some epigenetics here, but we're not sure, and we don't really know how it works anyway, so if you're thinking of doing some research of your own in the future, that's not a bad place to start.”

In keeping with this bleeding-edge focus, the course places a strong emphasis on reading scientific papers, over at PMC or PLOS − we're actually quizzed on the papers. In a way, the video lectures are only an introduction, the real meat of the course being the papers. That's the hardest part for me: reading and understanding jargon-laden, dry papers is a specific skill that I, erm, need to work on (I find my eyes glaze over pretty quickly). It's also not an activity that can easily fit into my normal MOOCing times (on the bus/train? No way, requires much more concentration − in the evenings? Nope, requires a freshness of mind that I just don't have after a full day's work − ideally I'd get up an hour earlier and read during breakfast, but er… I value sleep, too).

In terms of content: the first few weeks are about the well-understood mechanisms (DNA methylation, chromatin structure and histone modifications, X chromosome inactivation, epigenetic reprogramming), then we get on to more controversial topics (environmental disruption of epigenetic state, for instance the effects of tobacco smoke, or diet, at crucial periods of time). The last week of the course is about cancer, which is pretty interesting (epigenetic modifications are one of the hallmarks of cancer − it's actually one of the very few common points of all cancer types: not knowing anything about the subject, I'll refrain from qualifying cancer as an “epigenetic disease”, but it's certainly tempting.)

The lectures are basically slides with an embedded shot of the lecturer (Marnie Blewitt from Melbourne University) in the corner. While I usually dislike this format, here it works well, partly because Dr Blewitt is a great speaker with a clear voice, but mostly because the slides are only outlines / supporting material for the course itself. In fact I find I hardly read the slides − I just skim them and concentrate instead on the audio.

Every week there's a quiz, which is fairly difficult − you have three tries, but the questions change between each try. As I said, the quizzes are often about specific points raised not in the lecture, but in the required readings, forcing us to read the papers, not a bad thing. At the end of the course, there is a peer-reviewed essay; I'm not certain how it will play out, it's been ages since I wrote essays (and then again, never in a scientific subject). I'll keep you posted about how it turns out.

So, in general, I like this course quite a lot, and it's certainly given me a lot of things to think about.

Thursday, May 22, 2014

I guess I should have studied Astrophysics…

… as my final grade for the ANU ASTRO1x course is, well, reasonably good.


(That said, the grading scheme was rather lenient. While I do know a lot more about astronomy than I used to, I don't really deserve a 100% score. But still, it's always nice to get good grades.)

Sunday, May 18, 2014

Notes towards an edX / Coursera comparison

As I wrote elsewhere, three major platforms seem to occupy much of the MOOC landscape: Udacity, Coursera, and edX. I've been meaning to do a write-up of my various impressions of the platforms based on the courses I've taken or am taking (counting the ones I've dropped, that's something like 15 on edX-based platforms, 6 on Coursera, and 1 on Udacity). Doing this write-up will take a while though, because I want it to be as good as I can make it; which will probably mean taking more courses, too.

In the meantime, I want to gather my thoughts, and sort of draft the main ideas I want to include in the final write-up.

Introducing the players

I'll mostly talk about edX and Coursera; I haven't really got enough to say about Udacity, and the model is different enough that it's not too easily compared to the other two. A few words of introduction though:

Coursera and Udacity are both commercial companies tracing their roots to Stanford University in California. Coursera has taken most of the limelight, gathering $85 million in funding and a very wide portfolio, over 600 courses, usually (but not exclusively) online versions of courses offered by various universities around the globe. Since they are commercial startups, a lot of the discussion around them revolves around business plans, licencing agreements, etc. Their models are different: Coursera is closer to classical university courses, with a predefined schedule, lectures, homework, etc. Udacity courses are offered on a subscription basis: you can take them whenever you like and take however much time you need, with a monthly fee.

In contrast, edX is a nonprofit organization. Its origins are at MIT, which was quickly joined by Harvard. It is governed by the edX Consortium, comprising the two founding institutions plus a large number of organizations (mostly, but not only, academicals).
Courses at edX are mostly organized in the same way as Coursera's, with predefined schedules and so on. A few courses are available on a self-paced basis, though.
Note that “edX” can have multiple meanings: either it is the edX organization itself, or the edX software that powers the platform. The latter is free and open source, and in wide use by multiple organizations, such as France Université Numérique (the French MOOC platform) or MongoDB University. Interestingly, it seems that Google is a contributor to the OpenEdX platform, and that Stanford (from where both Coursera and Udacity have originated) has also put its weight behind it.

Institutions

Coursera has by far the widest course catalog, more than 600, from many places around the world. They seem to have a great many partners in the US state university systems, in Australia, in China, etc. but also from Europe (hello, Centrale Paris!), which is a feat. Partners are mostly universities, but there are a handful of non-universities as well, like the World Bank and the National Geographic Society.
Topics covered by the courses are extremely varied, and range from advanced mathematics to “do you have what it takes to become a vet?” Chances are good that, if you are interested in a topic, you will find a relevant course on Coursera.

Udacity seems to have refocused on hands-on professional training built in partnership with companies (Google, Facebook, etc. all have classes on Udacity), although there are a few university-sponsored courses too.

edX courses come from its partners, first and foremost MIT, Harvard and UC Berkeley. Presently, most courses seem to come from American institutions, although that's changing, with courses from India, Latin America, Japan, Australia, coming on. Europeans are lagging a bit: the edX powerhouses seem to be TU Delft (Netherlands) and Université Catholique de Louvain (Belgium). Overall, I think it's something like 175 courses that are offered on edX; generally, you will find introductory courses in about any subject, but more advanced courses are sometimes lacking (unless you're in IT).

Software

Coursera's software is very polished. It is quite obvious where all the funding has gone: the platform is very easy to use, it's simple to browse through the catalog to find interesting courses, one can “watch” courses for upcoming sessions, the platform itself does some simple analytics to suggest courses, etc. There are even mobile applications for iOS and Android which allow students to download lectures and watch them at their leisure, something which is extremely handy for people like me who watch lectures while commuting to work.

Course delivery is also quite polished, it's fast, there are few or no bugs, but I kind of dislike the layout. It's organized by activity type, so you have separate menu entries for lectures, quizzes, peer-reviewed assignments, surveys, etc.; each of these pages will have content added to it as course segments are released. On the one hand, students are free to organize themselves, although it seems to me that they'll just go through the elements sequentially (first watch all the lectures in one go, then browse the required readings, then take the quizzes, etc.) On the other hand, it's messier, I feel, and it makes it harder to keep track of what's to be done at which time, and in what order.

The course dashboard helps: some courses have counters on it recapping activities (“you have watched 16 out of 23 lectures”). Interestingly, the latest/most active forum threads are summarized on the dashboard too, so it feels more like the “course hub” that it's supposed to be.

Course activities seem to be limited to lectures, readings, quizzes and peer-reviewed assignments. It may be because I didn't take too many courses − I suppose programming courses have support for sandboxes, for instance. But generally, I find the “work” part of Coursera courses rather disappointing: it's often limited to checking boxes in quizzes.

By contrast, edX feels a lot more open-sourcey[1], very flexible but rough at the edges. It may be that they just haven't had as much money to invest in the software as Coursera have; being a non-profit funded by universities, they're unlikely to have $85 million to spend. So the “home site” experience is far from as good. Browsing through the catalog is a mostly-manual chore (there's no search engine, just categories). The student dashboard is a shambles. You have to keep track of what courses start when manually (I use a spreadsheet…), same for deadlines.

The course experience however is a lot better. The courseware is organized linearly, mixing blocks of various nature. So you can, and do, have lecture sections interspersed with knowledge-checks, followed by worked examples, practice problems (which can be of any type), then homework assignments. Lectures are still lectures and so, a mostly passive experience, but the homework can be basically anything: quizzes, formula input (with a nice MathJax integration), advanced interactive tools (7.00x had us cross fruit flies in order to produce populations with a specific phenotype), code execution (CS169x ran unit tests on our code to check its compliance), etc. This makes for a much, much richer learning experience.

The edX forum software sucks big time, but it has a redeeming quality: it's possible for course designers to embed discussion forums in courseware pages, thereby fostering discussion of the topic under scrutiny.

The overall feeling that I have is that course designers have a lot more latitude with edX, whereas Coursera has more of a “one size doesn't quite fit all” approach. That, and the fact that edX courses tend to come from higher-profile institutions (MIT, Harvard, Berkeley…) that may have more effort to invest in a particular course, means that as a rule of thumb, I go to edX first and only fall back to Coursera when I can't find what I need / want on edX.


[1] disclaimer: I am a big open-source user and advocate. I'm typing this from a Linux Mint desktop.

Saturday, May 17, 2014

Stat2.2X is passed!

So, I can do Probabilities.


To be honest, I didn't think I'd get this one − I mean, I joined on the third week of a five-week course (so I'd missed 40% of the course); worse, there was a midterm on the second week which I'd already missed. I registered anyway, because I wanted to learn about the subject (I am still convinced that statistics and probabilities are the most important maths one should learn at school after arithmetic) and because I intend to take 2.3X, “Inference”.

But thanks to the grading scheme (50% of the grade is contributed by the final alone, plus only the four best homework assignments, out of five, count towards the grade) I made it. Since edX have adopted the policy of handing out certificates to everyone, that's my, let's see, 11th edX certificate secured.
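For the curious, here is a rough sketch of how a scheme like that works out for a late joiner. Only two things come from the course itself: the final counts for 50%, and only the best four homework assignments out of five are kept. The split of the remaining 50% between midterm and homework (20% / 30% here), the function name, and the sample scores are all assumptions I've made up for illustration; they are not the actual Stat2.2X numbers.

```python
def course_grade(final, midterm, homeworks,
                 w_final=0.5,      # from the course: the final alone is 50%
                 w_midterm=0.2,    # assumption: hypothetical midterm weight
                 w_homework=0.3,   # assumption: hypothetical homework weight
                 keep=4):          # from the course: best four of five count
    """Return the weighted course grade; all scores are fractions in [0, 1]."""
    # Keep only the `keep` best homework scores, dropping the worst ones.
    best = sorted(homeworks, reverse=True)[:keep]
    homework_avg = sum(best) / keep
    return w_final * final + w_midterm * midterm + w_homework * homework_avg


# Hypothetical late joiner: missed the midterm and the first two homeworks,
# then did well on everything that was still open.
grade = course_grade(final=0.95, midterm=0.0,
                     homeworks=[0.0, 0.0, 1.0, 1.0, 1.0])
print(f"{grade:.1%}")
```

With these made-up numbers, one of the two missed homeworks is dropped entirely and the heavy final does most of the work, which is the general idea behind how I could still scrape through.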

(I made a grand total of 2 errors in the course, both of which I could have avoided if I'd paid a little attention.)

Wednesday, May 14, 2014

EdX's enrollment options

At edX, when you enroll on a course, you pick your track between two options… although there are often, in reality, three. Let's review them:

  • “Audit” is the most basic. Gives you access to the courseware, the exams, etc. but you don't get a certificate of completion should you “pass” the course − what it means is, you don't feel the pressure to do exams (which can actually be detrimental to the learning process). It's free.
  • “Honour code” (or “Honor” as they write it, but hey, I'm European and half-Brit) is the basic certificate-granting level. You do everything, then if you do well you get a nice little PDF saying that someone purporting to be you completed the course. It's also free.
  • “Verified Certificate” means that you send official proof of your identity (in fact you hold up your passport to your webcam, you take a picture of yourself with the same webcam, someone at edX matches the pictures. You may have to take additional pictures at random times during the course to prove it's still you, but that's not happened to me yet.) So the PDF now says that someone took the course and they did check it was you. There's a fee involved, which is decided on a per-course basis. The cheaper courses start at $25 a pop, and I've seen courses asking for ten times that. Besides, that's a minimum fee − remember, edX is a nonprofit. When you pick the Verified Certificate track, you can pay whatever you like as long as it's above the minimal fee.

Not all courses offer Verified certificates; not all offer Honour code certs either, but nearly all do, so we won't mention U. Washington's “deal with stress” course.

Now when you sign up for a course that does offer the Verified option, you get a choice between “Audit” (free) and “Verified” (paid). But what a lot of people don't realize is that the Honour code option is still available! Just pick “Verified”, but instead of choosing an amount to pay, click the checkbox saying “What if I can't pay? Choose an Honour code certificate instead.”

So that's how it works. Or worked.

A couple of days ago, edX rolled out a release with a number of improvements, including a Dashboard update that now states clearly what track you're on for each course. So it became very clear to a lot of people that they were “auditing” courses − and that they wouldn't get the coveted certificate after all. A couple of days' worth of grumbling, and voilà: I saw in two places (including a Course Info note by the actual instructor) mention of “a change in policy”: auditing students will get certificates, too.

So, dunno if that's due to the uproar, or if the whole three-level registration is too complex anyway. It may be the end of the distinction between Audit and Honour Code.

It's also a case of convergence towards the model of the competition: Coursera has only two levels, “join for free” and “Signature Track”. You get certificates (sorry, Statements of Accomplishment) in either case, but only by signing up for the paid-for Signature Track is your certificate verifiable online (edX provides authenticity validation to all − so if you see an edX cert, you know that someone really took the course; you can't be certain who. If you see a Coursera statement, you only have an easily-tampered-with PDF that proves nothing.)

So, what to make of it? Well, I didn't really see the point of Audit anyway (maybe, I thought, if you don't go through with the homework and all but only watch the lectures, then you don't get the big bold “you only scored 9% on this course, while 85% is required for a pass” message on the archived course on the dashboard). So, kind of glad if that's going away.

In the short term, it means I still have a good chance of clinching that Stat 2.2 certificate (though I'm auditing the course, having joined late). So I'll be trying the final this weekend after all.