Motivation in learning

Suppose you’re designing a learning tool and you want to amp up the motivation. You decide to show a graph of the user’s learning progress. Of course on your awesome learning environment, people will be learning all the time, so it’s going to look like this, right? Users will see that they are getting more and more awesome, they’ll feel awesome, and they’ll come back every day to keep learning.

Screenshot 2014-10-15 09.49.49

The problem is, when learning looks like this, the learner is already well aware that they are kicking ass. Your graph is the banner at an election party. Maybe it ties together the scene, but everyone already knows what’s going on.

The reason that motivation is a persistent unsolved problem in education is that learning doesn’t look like that. Learning is filled with plateaus and pits because confusion is the very nature of learning. Learning–in the very best case–looks more like this:

Screenshot 2014-10-15 09.39.20

Keep in mind that those plateaus can last on the order of months, long enough that we forget what a jump feels like. And the jump itself, by the way, happened so quickly and changed our thinking so rapidly that we barely noticed it!

Motivation hackers have countered with the theory of small wins: if we decrease the delay before some kind of reward, we will feel more motivated. But what does that really mean in the big picture–at least when it comes to learning? It means we are zooming into this graph and increasing the number of little upward bumps on the plateau. That is what spaced repetition is good at: keep increasing the frequency of missed items such that the correctness ratio remains around 90%. But our unconscious, in the end, can’t be tricked like that. Once we’re used to spaced repetition, we know that the missed cards are piling up, rather than the new ones we want to get to. We might feel the joy of a small win, but it will be paired with the pain of even more small losses. Moreover, we know that we just aren’t learning that much.

What about games? Given that games are so fun and addictive, many believe they hold the secret to education’s motivation problems. According to Raph Koster’s Theory of Fun, what makes games fun is…wait for it…learning! While games can sometimes teach educators about pacing, game designers have the luxury of not having to include anything with too long of a plateau. They get to choose the domain, but when we discuss learning as a more practical matter, that isn’t possible.

So you want to create instruction in a domain that contains concepts with long plateaus. Your best option has nothing to do with motivation but rather is to improve instruction such that the plateaus are shorter. Beyond that, I’m not too sure. I think it’s part of why “detachment from the illusions of self” is part of shuhari, a Japanese martial arts conception of mastery: one must get over the idea that they need to be better all the time. In addition, learners need a deeply held belief both that what they are striving for is important (“when are we ever going to use this?”) and that the periods of stagnation are essential to growth. Maybe the graph to show, if you can do it convincingly, is the plateau another learner was on before achieving their next jump. And the cool stuff they did after a certain number of those jumps.

What I’m learning – 8/5/14

Learning How to Learn (MOOC, Coursera) Week 1 contains a good collection of topics. I’m familiar with most of them: spaced repetition, the benefits of sleep and exercise, the pomodoro technique. An interesting framing that they use is focused versus diffuse modes of the brain. I love Coursera’s mobile app for watching videos: they can be downloaded and watched at 2x speed.

Real World Haskell (Online book) I recently did CIS194: Introduction to Haskell, which was excellent for learning Haskell concepts but left me still confused about how to structure programs. This book is already teaching me a lot of practical tips that CIS194 didn’t cover (to be fair, they give RWH readings for each lecture). The embedded comments are a great way to see a variety of solutions for the exercises in the book. It’d be nice to have top quality solutions available too, but sometimes it helps to see the thoughts of another newbie.

Probabilistic Models of Cognition (Online book) I’ve been hugely interested in modeling cognition for many years. I neglected this book because it seemed like it’d be too much of a rabbit hole to tackle. However, it so far turns out to be a great review of probability and functional programming (it uses Church, which derives from Scheme) in addition to the interesting domain. I really enjoy being able to modify and run programs in-line. There’s an element of feedback that is nearly effortless because I usually have an expectation of what a program does right before pressing “Run”. Then I immediately see whether that expectation was correct or I need to think more about it. There are also more traditional exercises that push harder but with the convenience of being in the browser.

Why Do Americans Stink at Math? (Article, NY Times) There is a ringing endorsement among those who are good at math: “don’t just memorize a procedure, understand the concept.” Unfortunately, it rarely goes beyond that platitude, and it starts to break down on closer examination: if you have an understanding, isn’t the concept memorized as well? Most likely, unless you have to reconstruct it very slowly, you’ve memorized the procedure too. So which really came first: your self-proclaimed “understanding” or an explanation that you constructed for the procedure that you memorized? The big reveal is to try to get most of them to actually explain a concept they understand to you. “Argh, well, you just do this.”

And yet, when you read an article like this, there is something obviously and dreadfully wrong with something like “Draw a division house, put ‘242’ on the inside and ‘16’ on the outside, etc.” An interesting counterexample is where math was learned by the uneducated in a way that is procedural but also embodied. That is, math was learned or used in commerce or factory work–but clearly still requires a long path to learn symbolically and abstractly (see also The Real Story Behind Story Problems). Another fascinating possibility is for teachers to use a Japanese technique called lesson study. A lot to digest in this article. (I have some more writing from my grad school days on concepts.)

Is Practice Dead?

According to a new study, “Deliberate practice is unquestionably important, but not nearly as important as proponents of the view have claimed.” Broken down by domain in a meta-analysis of previous research, deliberate practice explains only 26% (games), 21% (music), 18% (sports), 4% (education), or a minuscule <1% (professions) of differences in performance. The aim of this research isn’t to provide advice, but if you start to believe that practice isn’t that important or effective, you might not pursue it wholeheartedly. I’d like to argue that that’s a big mistake.

Let’s start with the “10,000 hour rule” that is always cited in articles about practice and performance. The standard view of this rule seems to conflate two useful ideas. The first idea is that expert-level performance in cognitive domains takes a great deal of cognitive work–we’ll see why. Call this the practice threshold hypothesis. The second idea is that the specific techniques used to practice make a big difference. Call this the practice quality hypothesis. The meta-analysis is conducted on studies that use the original definition of deliberate practice from Ericsson, Krampe, and Tesch-Römer, 1993, “effortful activities designed to optimize improvement.” Their definition captures neither of these key ideas: neither the cognitive work threshold nor the quality of practice.

The origin of 10,000 hours dates back at least to Simon & Barenfeld, 1969, where they discuss not hours but the size of a “vocabulary of familiar subpatterns” needed by chess masters and Japanese readers: 10,000 to 100,000. Just like reading in a foreign language won’t make sense if you don’t know key words (this is the best example I can find), it isn’t simply that “more practice is better” but that a large minimum threshold of practice is necessary for mastery. Obviously this amount is not exactly 10,000 hours. Chess can cover effectively endless board positions, so the figure is not an upper limit. It’s just that few people reach another major threshold beyond 10 years of practicing 20 hours per week (roughly 10,000 hours), and those who do may be beyond the comprehension of mere masters. Or as Professor Lambeau says in Good Will Hunting, “It’s just a handful of people in the world who can tell the difference between you and me.”

To discredit the practice threshold hypothesis the meta-analysis would need to examine total accumulated practice that may be related to the domain. In fact there seems to be an inverse correlation between the variance explained per domain and the difficulty of measuring accumulated practice. Chess masters tend to have studied chess their entire lives, and musicians have played music (of some form) their entire lives. Sport skill can come from a bit wider range of physical training. Education and professions draw on a yet wider range of skills. A mathematician may make a “natural” programmer because of extensive experience with analytical thinking, but his math expertise doesn’t get counted as “practicing programming”.

Now let’s talk about practice quality. There isn’t a dominant theory of exactly what makes practice good (and there never will be as it is domain-specific), so that makes it difficult to examine in even a single study, much less across many studies and domains. As far as I can tell, quality of practice is not considered whatsoever. So there are potentially people showing up half-heartedly to practice, practicing something they’ve already mastered, or practicing something they aren’t ready for all getting counted the same as people who practice “optimally”, whatever that is.

Again we see that in the domains with a low variance explained by practice, practice quality is much harder to measure. In games and music a good way to practice is simply to play the game or play the music (though there are often better ways). Compare that to professional programming. Few people really practice once they learn the language. The quality of continued learning on the job depends on a huge number of factors. Most likely these could not be accounted for in anything but an ethnographic study (unfortunately I couldn’t track down the one study from the meta-analysis targeting professional programming).

In short this study does not tell us about the potential of practice because its measure doesn’t capture when practice is most useful. Unfortunately due to the domain dependencies of what constitutes practice threshold and quality, we’re unlikely to ever see a meta-analysis that captures the full potential of practice across domains. What it may tell us is that the common idea of practice isn’t nearly good enough, especially in something as important as professional work. If it only makes 1% difference, you aren’t doing it right.

There are many sources for ideas for better practice. Popular science works such as Moonwalking With Einstein, Practice Perfect, and The Little Book of Talent are all good places to start. The Cambridge Handbook of Expertise and Expert Performance is a collection of articles across a variety of domains showing the progress that has been made since the 1993 definition of deliberate practice.

Finally a small pitch of my own: I’m reviving my wiki to compile general thoughts on effective learning and practice as well as a glimpse of my personal efforts to practice programming and other skills. I encourage you not only to check mine out but also to start something similar, and maybe we can conduct a study of super-effective learners!

The best of e-learning, an example from the farm

Originally answered on Quora.

What are some good examples of simple, succinct e-learning lessons?

Here you go, an incredibly simple and effective lesson on the sexing of day-old chicks (Biederman & Shiffrar, 1987).

Screen Shot 2014-06-04 at 6.22.07 PM

In the experiment, these instructions improve novice subjects’ correlation with experts from .2 to .8. This is in a field where expertise typically came from years of experience.

This may not be e-learning–no adaptive learning algorithms here–but it’s all you need. The key is being able to connect a developed human strength (here, shape recognition) to a new task. In one word: pedagogy. And all you need to present this pedagogy is text and static images because the task is visual recognition of a static image (assuming you can poke around a chick’s underside without squirming). (See also Are videos the best format for online course delivery?)

Ok, I hear you–maybe you just aren’t that interested in chick sexing. How can you know what else out there is effective learning? The only valid way to evaluate learning is what Biederman & Shiffrar do in this study: compare the performance to experts. Unfortunately there isn’t enough attention on that part of it to give solid recommendations among web-based options. (But see also How can I find results about learning and education from evidence-based research?)

See also

Using DEVONthink for the first time several days ago, I got a tingly sense of being in cheat mode. I imported over 700 PDFs, 800 Evernote notes, and 1500 bookmarks. As I had before with many other tools, I faced an abundant but impenetrable collection of knowledge. When I tried its “See Also” feature, I realized DEVONthink had already established an intricate network of roadways, connecting me to past encounters with ideas and information.

I checked out one of my favorite Quora answers, “What is it like to have an understanding of very advanced mathematics?” The results in the See Also column contained some of my favorite articles on mathematical thinking that I’d collected over the years: On Proof and Progress in Mathematics, Kill Math, A Mathematician’s Lament, and Learning to Think Mathematically. Though I could have pieced together most of these, the instant access that DEVONthink provides is very powerful.

DEVONthink's See Also column relates a Quora answer to articles collected over the years (as well as my collection of Wikipedia pages).

When I’m learning something new, I typically need to cross reference a few different sources to get it. Learning works by observing different cases of something and then extracting the generalized concept. While the latter is handled automatically by the human brain, DEVONthink is useful for assembling multiple things in a digital environment. Likewise creativity has been described as reflecting on multiple ideas and connecting them in a new way. Again, DEVONthink, brain, profit.

In short, DEVONthink’s See Also creates an environment that empowers us to use our human strengths of recognizing similarities and differences, analogies and generalizations among multiple items.

Compare this to what I attempted before: I’d probe my memory, bookmarks, and Google searches to pull up related information, interrupting the actual processing of the information in front of me. As great as bookmark tags have always seemed, they would rarely match the intention I eventually used them for. With DEVONthink I skip the manual tagging step and get better results. It isn’t another tool to collect stuff that never gets looked at again. It’s a tool for turning an idea into a brainstorm, an article into a textbook, a painting into a museum.

Further reading:

The future of adaptive learning as an iPhone

Dan Meyer in Adaptive Learning Is An Infinite iPod That Only Plays Neil Diamond draws a line between futurists and educators. Futurists envision adaptive learning technologies that replace teachers who fail to give complete individual student attention and enforce a uniform classroom experience that abandons students who are behind and bores students who are ahead. To Meyer, this technology will necessarily lose a lot too: the richness that happens in a live, simultaneous classroom experience.

I don’t yet concede that all will be lost. My aim with this post is to understand the learning benefits of a good classroom that Meyer sees in order to provide suggestions to future software designers (whether or not they adopt the “futurist” label) to preserve and even enhance these benefits. As we will see, there’s hope for adaptive learning beyond Neil Diamond and even the infinite iPod. My model of classroom learning may be incomplete, but then I hope you’ll be able to point to what is missing and somebody (that is, me) will have learned something.

The first thing we think of in rich learning is content. Content, at a pure informational level, can largely be carried over to a digital adaptive learning system:1 record a video lecture of the teacher saying the same words, for example. The popularity of the flipped classroom attests to that.

The first design imperative is to seek out effective educational content for learning systems, then to understand how its audience responds and react accordingly (as a good teacher would).

When we say that richness comes not from the informational content but rather from the presence of a live teacher or peers, that isn’t so much richness of the content as it is of environment. This is what Meyer is referring to when he talks about classroom-based math education

…as a social process where students conjecture and argue with each other about their conjectures, where one student’s messy handwritten work offers another student a revelation about her own work, a process which by definition can’t be individualized or self-paced…

Meyer wants to preserve the liquid networks (Where Good Ideas Come From) that are peers engaged in common learning tasks. Better ways to get from a student’s current mental state A to a better-learned state B may come as flotsam from a peer who is approximately around A rather than from the teacher who is well-accustomed to B. Or from computers that lack any empathy that isn’t preprogrammed. In Dear Teachers, Khan Academy Is Not for You I talk about how Sal Khan’s perspective may, in some cases, be closer to the students’ mental states than that of the teachers who criticize the videos.

By preprogrammed empathy, I mean that a computer can respond to “errors” that it knows about, and may have an excellent approach to help the student correct those errors. As computer-based learning scales, it can start to learn more than a teacher about the best directions from A to B, and it can give those directions with complete patience and without falling back to the B perspective too quickly.

This leads to what I think is the ultimate battleground for classroom versus computer learning: feedback. On one hand, a computer’s feedback can be instantaneous and adapt the entire learning experience accordingly. Meanwhile the teacher will grade your paper in a week, and though she’ll realize you didn’t understand any of that stuff, there won’t be time to change the lesson plan. But can computers match the targeted and contextual feedback that humans can give?2

Feedback can take on many forms:

  1. Correctness feedback. Software that can evaluate the correctness of something can easily provide right/wrong feedback. It seems that mere correctness feedback can do a lot for learning, but that is an argument for another post. However, computers have a huge artificial intelligence barrier to cross in terms of being able to evaluate what people learn except in limited formats.

  2. Content resequencing. The next step beyond stating whether a student’s work is correct is to adapt the content in response. This can be as simple as repeating the exercise set if a threshold is not reached, as DuoLingo and Khan Academy do. But it can extend to recognizing the details of what is being missed and presenting more instructional content.

  3. Environmental affordances. Beyond the people in it, a classroom environment isn’t particularly well designed for learning. As I talk about in a comparison of learning environments with the game Portal, we can do more in a virtual environment to directly benefit learning. The environment itself can shape your understanding of errors in your thinking and paths to correct them. For example, a tall ledge dropping off in front of you affords figuring out another way to use your portal gun. This idea goes well beyond physical affordances, as I’ll talk about in an upcoming post.

  4. Dialogue. I love the quote from John Holt’s How Children Learn: “To rescue a man lost in the woods, you must get to where he is.” Another Meyer post convinces me of the power of a teacher’s response within the rich context that is the student’s own thinking. For example, a girl is solving a problem that states that 1 in 3 families own dogs and asks how many students in her class may own dogs. The student draws lines for each student in her class and underlines every third. A teacher can recognize that the student is primed to represent the problem as division and can work with the student’s current representation to do that (maybe, I’m not a teacher). That is hard for a computer.
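
The simplest form of content resequencing (item 2 above), repeating an exercise set until a threshold is reached, can be sketched in a few lines of Python. This is only an illustration and not how DuoLingo or Khan Academy actually implement it; the function name, the set labels, and the 80% cutoff are all assumptions made for the example.

```python
from typing import List

PASS_THRESHOLD = 0.8  # assumed mastery cutoff, chosen only for illustration


def next_exercise_set(answers: List[bool], current_set: str, next_set: str) -> str:
    """Repeat the current set unless the share of correct answers meets the threshold."""
    if not answers:
        # Nothing attempted yet, so stay on the current set.
        return current_set
    score = sum(answers) / len(answers)
    return next_set if score >= PASS_THRESHOLD else current_set
```

Recognizing what exactly is being missed, and choosing new instructional content in response, is the much harder part that this sketch leaves out.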

Overall the state of computer feedback is inconclusive and presents a vast opportunity to make computers smarter both in recognizing student mental states and helping them transition to better ones. We have seen research results of adaptive learning systems providing significantly better learning, but given the nature of the control groups in these studies, the results don’t necessarily imply computers are anywhere close to the best of classroom learning.

If we remain optimistic though, adaptive learning can be not just an iPod that plays any kind of music, but an iPhone where we can program it to do almost anything. This is obviously true: the iPhone and adaptive learning systems are both just computers. Better yet, I hope that an adaptive learning platform can mirror the platform of the iPhone (which includes physical convenience, UI standards, inputs like voice and camera, etc.) that supports a beautiful diversity of apps. Apps here being learning experiences that are rich in environment, content, and feedback.

The better analogy is do you want an MP3 or do you want live music? As far as music goes, the world has chosen. Both!

1 Technology skeptics have some basis for distrusting what translates to a screen. Humans have to learn to learn from screens, rather than other humans. For example, infants don’t pick up a second language as readily from a multimedia program as they would from a nanny. But we do learn to use–and seem to fully embrace–digital learning. Many studies have confirmed the engagement of children with virtual entities. Try one yourself: watch a kid play a videogame.

2 There is a middle ground between pure human feedback and pure computer feedback. Computers can provide hints to the teacher about the context in which to provide individual feedback. However, current solutions are not very good, so this is yet another design challenge for educational technologists.

Observe without judgment

The highest form of human intelligence is to observe yourself without judgment.

This quote of Jiddu Krishnamurti, which I got from the book Nonviolent Communication, seems to directly contradict my post Defining “smart”, where I argue smartness is a process of judging. Is this a paradox?

Perhaps a better definition of intelligence is a two-step process. The first is to observe without judgment, and the next is to apply judgment among possible responses to the observation, invoking a quote from Hadamard’s The Psychology of Invention in the Mathematical Field:

To invent is to choose.

Some examples:

  • From Nonviolent Communication, the context is that an intelligent communicator is able to non-judgmentally observe the feelings of oneself and others and then choose an empathetic response.
  • A typical design process is to brainstorm while deferring judgment, followed by a critical synthesizing.
  • A good way to learn to draw is being able to observe without invoking iconography (a form of judgment). As you develop as a competent drawer, you become an artist by choosing what to observe and draw (perhaps “observing” from your mind’s eye).
  • A mathematician may observe a mathematical object before attempting to judge the correctness of a property.
  • The scientific method is first to observe without bias, then to judge the validity of hypotheses.

From my previous “smart” post, it’s clear I find intelligence in the act of analyzing and choosing. I believe observation is not a trivial step and can be at least as challenging.

I have experience with observing to draw. Techniques (which you can learn about in Drawing on the Right Side of the Brain) like drawing upside-down, blind contour drawing, and observing negative space require a great deal of focus and mental energy. Likewise meditation is a focus on observing your breath or body and is very challenging–one is constantly fighting off distracting and judgmental thoughts. In fact with meditation the act of observation itself can lead to healing of physical discomfort, as described from a skeptic’s perspective in Teach Us to Sit Still.

Finally, what about another possible step to intelligence: generating ideas? Isn’t the design process example about generation and creativity rather than observation? It’s subtle but I’d argue that you observe what comes to mind rather than doing generation yourself. Going back to Hadamard, he notes that mathematicians generally make breakthroughs after taking their mind away from the problem. The answer comes in a flash, and the mathematician merely observes it.

Designing learning systems with spaced repetition

Spaced repetition is a valuable technique for learning. The typical design of a spaced repetition system (SRS) presents users with a queue of all items that are due according to its scheduling algorithm [1]. The motivation behind this post is that the queue can quickly become overwhelming, and endless item review is frankly boring. Can we do better?
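
To make the typical design concrete, here is a minimal sketch of a due-item queue in Python. It is not the SuperMemo algorithm: the `Card` class, the doubling of the interval after a correct answer, the reset to one day after a miss, and the placeholder start date are all simplifying assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Card:
    prompt: str
    interval_days: int = 1          # current gap between reviews
    due: date = date(2014, 1, 1)    # hypothetical start date; new cards are due at once


def review(card: Card, correct: bool, today: date) -> None:
    """Reschedule a card after a review (assumed rule, not the real SuperMemo formula)."""
    if correct:
        card.interval_days *= 2     # assumed growth factor
    else:
        card.interval_days = 1      # a miss sends the card back to daily review
    card.due = today + timedelta(days=card.interval_days)


def due_queue(cards: List[Card], today: date) -> List[Card]:
    """Every card whose due date has arrived, most overdue first."""
    return sorted((c for c in cards if c.due <= today), key=lambda c: c.due)
```

Note that `due_queue` never drops anything on its own. Every card ever added eventually comes back, which is where the overwhelming queue comes from.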


The SRS design is based on several assumptions:

  1. The user wants to retain everything, all the time. SRS queues get big because they contain everything the user has thought to add to the system, whether or not they still want to know it. Sure, the user can delete things, but leaving this kind of maintenance work up to the user isn’t ideal.
  2. The user doesn’t review any item outside of your system. In a previous post, I talked about how spaced repetition occurs naturally when we attempt to learn something in a natural way. In other words, if you are actively engaged in learning Portuguese in Portugal, your exposure to many words will be spaced and repeated. By assuming that users only learn in the system, you either drive them away from reviewing with more natural processes or your algorithm is based on poor assumptions. (When I was studying Chinese only in Skritter, it was spacing too far, so I’m guessing their parameters were adjusted for people who studied outside the system.)
  3. There’s only one way to review an item. In a typical SRS, all items are independent. But think about basic addition skills. Do you need to review them constantly? No, they come up all the time when you learn multiplication, division, and then any other topic involving math. In Learnstream Atomic, we attempted to break down physics questions into components and mark everything as reviewed. I think that was only the tip of the iceberg.
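
One way to relax the third assumption is to let a review of a composite item count as a review of its components. The Python sketch below illustrates the idea with the arithmetic example. It is not the Learnstream Atomic implementation; the `COMPONENTS` graph and the function names are hypothetical.

```python
from datetime import date
from typing import Dict, List, Set

# Hypothetical skill graph: each item lists the simpler items it exercises.
COMPONENTS: Dict[str, List[str]] = {
    "long division": ["multiplication", "subtraction"],
    "multiplication": ["addition"],
    "subtraction": [],
    "addition": [],
}


def items_exercised(item: str) -> Set[str]:
    """Return the item plus every component it depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(COMPONENTS.get(current, []))
    return seen


def record_review(last_reviewed: Dict[str, date], item: str, today: date) -> None:
    """Mark the item and all of its components as reviewed today."""
    for name in items_exercised(item):
        last_reviewed[name] = today
```

With this bookkeeping, practicing long division also refreshes the review dates of multiplication, subtraction, and addition, so those items need not appear in the queue on their own.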

This sounds obvious, but one way to reconsider spaced repetition systems is to realize that they provide two values: spacing and repeating. The overwhelming queue is a design that favors repeating items more than spacing them, at least when you consider item review outside of a closed system. Imagine another system that takes the opposite approach, favoring spacing: perhaps a website that has links to different items but warns you not to look at something that you’ve looked at recently.
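
The spacing-first idea can be sketched just as briefly. In this Python example the three-day minimum gap, the function names, and the wording of the warning are assumptions made for illustration; a real system would presumably tune the gap per item.

```python
from datetime import date, timedelta
from typing import Dict, Optional

MIN_GAP = timedelta(days=3)  # assumed minimum spacing between views of an item


def spacing_warning(last_seen: Dict[str, date], item: str, today: date) -> Optional[str]:
    """Return a warning if the item was viewed too recently, otherwise None."""
    previous = last_seen.get(item)
    if previous is not None and today - previous < MIN_GAP:
        days_ago = (today - previous).days
        wait = (previous + MIN_GAP - today).days
        return f"You saw '{item}' {days_ago} day(s) ago; try again in {wait} day(s)."
    return None


def open_item(last_seen: Dict[str, date], item: str, today: date) -> Optional[str]:
    """Check the spacing, then record the visit whether or not a warning was issued."""
    warning = spacing_warning(last_seen, item, today)
    last_seen[item] = today
    return warning
```

There is no queue here at all. The learner chooses what to look at, and the system only nudges them away from cramming.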

If you’re considering implementing a spaced repetition system for a learning tool, consider carefully the assumptions made by existing systems and the two values provided by spaced repetition. What would you do differently?

[1] Every SRS I’m familiar with uses the SuperMemo algorithm, based on the idea of an exponential memory decay.

Defining “smart”

I don’t like to use the label “smart” because there are many positive ways of being human, all of which involve using your brain. Here I’ll play with a definition of it anyway:

Smartness is the ability to sort statements by their knowledge value.

A mathematician writes a proof, which is a sequence of statements, each giving an essential piece of knowledge to decide a previously-undecided truth. It isn’t about brevity: the particular writeup may contain many other statements that explain things to the audience. A competent mathematician would be able to point out which statements are essential to the proof, the statements with the most knowledge value.

As a counterexample, imagine someone uttering, “Education needs to be disrupted.” (I’m not pointing fingers, many of us have.) The issue isn’t about correctness. We have a mutual understanding that the term “needs” is probably too strong. And the statement may be encapsulating many other thoughts about why education needs to be disrupted. But a smart person should recognize that identifying a significant component of education that can be changed within a broader context, or a method of disrupting education that works in the long run, would both have far more knowledge value.

Actually I’ll revise the definition to be more abstract: smartness is the ability to sort objects by some property value. Smart photographers can sort pictures by their emotional value. Smart comedians can sort jokes by their comic value. In an Esquire interview, Woody Allen says:

I don’t think of the joke and then say it. I say it and then realize what I’ve said. And I laugh at it, because I’m hearing it for the first time myself.

Taken to the extreme, this definition suggests that creative endeavors are more about perceiving value than producing it. My belief, which I won’t justify here, is that this is true, though production also requires well-practiced techniques. For example a basketball player needs to have well-practiced shooting techniques but also (more “smartly”) needs to be able to perceive the value of taking a shot in the current situation.

Getting beyond massively lousy online courses

Sebastian Thrun on Udacity:

We have a lousy product.

In the article, Thrun says that MOOCs, massive open online courses that gained popularity a couple years ago when introduced by professors from Stanford University, didn’t live up to their hype in democratizing education for the whole world.

Personally I’d been anticipating the start of a particular MOOC for several months–there isn’t very much educationally-oriented material on the topic in existence. Recently, on the week it finally came out, I finished Portal 2 instead of the first assignment, which involved installing, troubleshooting, and navigating a complex program and hunting down the dataset within the MOOC software–all before the deadline.

Ain’t nobody got time for that.

What can MOOCs learn from Portal 2 about making a compelling product? Let’s take a look.

Why am I playing this game at all? Plot. I’m stuck in a dystopian science facility, and the evil computer system GLaDOS is out for revenge. The startling setting and crazy characters immediately draw me in.

Each level in Portal 2 has a clear goal: open the door. Generally I need to learn one new thing to complete the level while integrating what I’ve learned before, providing incremental difficulty. Furthermore, the environment that you interact with has many affordances, guiding you to play with tools like blocks, buttons, and magical scientific bouncy goo.


Even if I’ve discovered the tools to use, it takes some trial to succeed in the level. The game provides feedback when something isn’t working right: I fall into a pit and drown in toxic water instead of reaching the other ledge when I haven’t figured out how to jump far enough.

Progress is concrete: I finish a level in about 10 minutes. Further, I receive a reward at the end in the form of taunting from GLaDOS that’s genuinely funny as I ride the elevator to the next level.

Compelling plots

The “why?” of a MOOC is usually confined to the professor droning on for a few minutes during the first lecture, listing ways the subject has been applied. There’s lots to say about storytelling, but there’s a reason that “vague list” isn’t a story archetype. Plots are, partly, about fantasy–we can put the learner in the applications and make it big and dramatic. Language learning? Take me to a foreign land. Applied math? Let me be that guy from Numb3rs. At least in college, I was a student on a four-year quest for a degree with my classmates. In a MOOC, I’m just a registered user who gets a lot of annoying emails.

Online learning has yet to go very far with this idea. One example is Codecademy, where you at least have a larger objective of completing a project.

Codecademy's final JavaScript lesson is framed as replacing a broken cash register

Clear goals

MOOCs often ask you to complete a complex task in a complex environment. You need to switch back and forth between the software and slides for step-by-step instructions, and you don’t even understand what you’ve achieved at the end.

DragonBox teaches algebra using the principle of clear goals. Each level has the same goal of isolating the spiral, but they incrementally teach all aspects of solving algebraic equations.

DragonBox has a clear goal: isolate the spiral (grounding the idea of 'solve for x')

Incremental difficulty

Professors seem to love to jump into applied knowledge. Before making sure you get the definition of something, they’re asking you to transform and apply it.

DuoLingo highlights the one new word introduced in this problem

In contrast, DuoLingo succeeds in incremental difficulty: it typically presents one new word at a time.


Affordances

Check out Quill: it presents a textbox claiming “There are nine errors in this passage. To edit a word, click on it and re-type it.” I have no desire to learn anything more about grammar, yet I corrected several errors during my first visit to the page. The textbox, the existence of errors, and even the typography and the way individual words are selected when clicking, all afford me to play with it.

Quill's interface affords testing your knowledge of correct writing

While it’s true that the multiple-choice prompts common in MOOCs are an affordance for providing an answer, they are generally removed from the environment and tools you’d actually be working with.


Feedback

One of my major takeaways from interviewing many users of online learning systems is that the loop of instruction, practice, and feedback is way too long. Imagine that I watch several hours of video lecture over the course of a couple days, then I come back another day to do the assignment. Of course there are key ideas in the lecture I didn’t understand or remember, so I have to go hunt them down within those hours of video. Of the dozens of concepts covered in the videos, I get about 10 questions’ worth of practice on the quiz. Finally, I might not even receive immediate feedback on that quiz–I have to wait until after the quiz deadline to see what I missed and understand why. If I even come back to look at it.

Based on Bret Victor’s principle that creators should immediately see the effects of their changes, Khan Academy’s computer programming environment allows you to adjust variables in the code and see the results on screen.

Khan Academy CS lets you adjust numerical input values and instantly see the result

In other words, you get feedback as you adjust the code. However, this feature responds to only one very minor aspect of programming. Imagine an environment that gives feedback about a misunderstanding of conditionals or recursion, and then we’re getting somewhere. Indeed, Victor responded with an article about how they got it all wrong. You should read it.

Meaningful rewards

In The Power of Habit, Charles Duhigg explains that concluding an interaction with a reward is a powerful way to instill habits. The trend of gamification has driven this effect through badges and points. But as Portal 2 shows, rewards are an opportunity to entertain and drive the plot forward, not just pad pockets with a fake currency.

CodeCombat (disclosure: friends with one of the founders) is a new effort to teach programming that uses this idea well. Once you’ve successfully programmed your soldier, you get to watch him execute his program and kill the ogre. You also get to see the “spells” that you learned in that level. It’s like collecting badges but also uses the opportunity to allow you to reflect on what you’ve just learned.

CodeCombat displays your code execution as your character defeating the ogre

Final thoughts

Some of these principles apply to developing better tools for us to do our work. If a tool is already well designed, learning it is easier. However, it is still important to understand the learner’s state, that is, the differences in what different users already know and understand. Considering the learner’s state implies we should set goals of incremental difficulty, and indicate and reward when those goals are achieved, just as good games sequence levels with clear goals in order of increasing difficulty for the player.

There’s plenty more to consider for an ideal learning environment. I’ve written before about spaced repetition, mnemonics, and multimedia. But I believe that solid execution on these principles gets us 80% of the way there. As Sebastian Thrun’s resignation demonstrates, we have a very difficult job ahead in that.