The Silicon Valley vision of education

In a Tim Ferriss Show podcast episode, Peter Diamandis, entrepreneur extraordinaire, answers a listener question: How can we disrupt our education system? I think it’s articulate and representative of the typical “Silicon Valley Vision” for education, so let’s dig into it.

First of all, education’s got a couple different parts. There’s the part of socialization, of getting to know kids, getting to know people, how to be a good citizen, how to interact with people socially. Then there’s the part about learning.

I will stick to the “learning” part, to the extent that division is legitimate.

And the challenge with our education system, and you know this, we all know this, is, it is 150 or 200 years old. And it just sucks. I don’t know how else to put it.

I’m not here to talk history either, but I recommend The Invented History of ‘The Factory Model of Education’ to get a richer perspective on the “education is old and broken” talking point.

In any classroom, half the class is bored, the other half of the class is lost, and even the best teachers can only teach to the median. As classroom sizes grow, our ability to provide personalized educations just isn’t happening. So for me, the ability to scale is the use of technology.

I agree with this critique of classroom learning in general. Tutoring, on the other hand, has been something like a gold standard in the research community ever since Benjamin Bloom’s 1984 study, which found that tutored students performed at the 98th percentile(!) of a control group (Bloom’s 2 Sigma Problem). I don’t believe the 98th-percentile result has quite held up in replication, but I do have a strong belief in the power of personalization.

For better or worse I’m going to base my position on an analogy to medicine. Like the illnesses we see a doctor to treat, the misconceptions, lack of knowledge, or motivational breakdowns that hinder our academic performance are issues in the realm of teachers and schools. At least both occur mostly within our fleshy membrane.

Just like we wouldn’t want to be treated for an illness in a room of dozens of our peers, we would likely benefit from a masterful teacher that could work individually to diagnose our missteps and provide the right “treatment” (maybe an item of knowledge, but perhaps a motivating example, practice maneuver, or perceptual cue) to advance our learning.

You may or may not agree that this is a more desirable state but I think we can all agree that we (the American public school system, or any system of K-12 education) don’t have the resources for anything like this — enough individual attention for all students to learn all the standard curriculum.

The Silicon Valley Vision is that technology-based education can provide education that is not only better than one-on-one human teachers, but can also scale to accommodate every student, up to and including, yes, the poor African villager.

Big goals.

I always ask the question, how do you dematerialize, demonetize, and democratize different systems. In the case of education what I believe is going to happen is that we’re going to develop artificial intelligence systems, AIs, that are using the very best teaching techniques.

Let’s establish some common ground.

First, it’s not clear to me what it means that the AI is “using teaching techniques.” Is the AI selecting and sequencing some pre-existing content, or is it actually constructing pedagogic material and enacting the delivery on its own (whether through generated text or Siri voice or even a robot)? The former is more realistic in the near term (for example, it’s the role that Knewton plays for the content of publishers it works with) and seems hinted at by later answers, so let’s stick with that.

Next, I don’t know how these “best” teaching techniques are determined. If these techniques are known, what has stopped us from applying them already?

I’ll give the Silicon Valley Vision the benefit of the doubt here: the “best teaching technique” is highly context dependent, and except perhaps for our imagined individualized 2-sigma teacher, the only practical way to map from context to technique at scale is with automated technology. That leaves us with one question: can technology do that?

An AI can understand a child’s language abilities, their experience, their cognitive capabilities, where they’ve grown up, even know what their experiences are through the days, and give that individual an education that is so personalized and so perfect for their needs in that moment that you couldn’t buy it.

Diamandis starts by enumerating these contexts for personalization. In our medical analogy this would be like asking for a piece of software we switch on that tells us everything that could be wrong with us. Instead, we have countless scans, tests, and measurements that give hints at what could be going on. Is there reason to believe that the mind is more scrutable? I haven’t seen one.

Our state of the art in learning “diagnostics” is to hand-code the units of knowledge for a particular domain, ask tons of assessment questions, and infer from each answer a small amount of information about how likely the student is to know each of the units of knowledge. For a typical case, a multiple choice question, the information content is very low for maybe a minute of the student’s time: there’s already a 25% chance the student just guessed the right answer. That isn’t nearly the information bandwidth that a good teacher achieves, even working with a large class. (Don’t get me wrong, there is cool work on building domain and student models in environments like games or inquiry learning, but the point is that this progress is incredibly slow: for example, a block stacking game that has been individually designed, programmed, and modeled over several years.)
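To make the “low information content” point concrete, here is a minimal sketch of how much a single multiple choice answer can tell us about whether a student knows one unit of knowledge. It is an illustration only, loosely in the style of Bayesian student models; the post does not name a specific model. The function names, the 50% prior, and the `slip` rate (the chance that a student who knows the material still answers wrong) are all assumptions made up for this example. Only the 25% guess rate comes from the text above.

```python
import math


def posterior_known(prior, correct, guess=0.25, slip=0.1):
    """P(student knows the unit | one observed answer), via Bayes' rule.

    guess: chance of a right answer without knowing (25% for four options).
    slip:  assumed chance of a wrong answer despite knowing (hypothetical value).
    """
    if correct:
        known = prior * (1.0 - slip)
        unknown = (1.0 - prior) * guess
    else:
        known = prior * slip
        unknown = (1.0 - prior) * (1.0 - guess)
    return known / (known + unknown)


def entropy_bits(p):
    """Uncertainty, in bits, of a yes/no belief held with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_information_bits(prior, guess=0.25, slip=0.1):
    """Expected drop in uncertainty about 'knows the unit' from one question."""
    p_correct = prior * (1.0 - slip) + (1.0 - prior) * guess
    after_correct = entropy_bits(posterior_known(prior, True, guess, slip))
    after_wrong = entropy_bits(posterior_known(prior, False, guess, slip))
    expected_after = p_correct * after_correct + (1.0 - p_correct) * after_wrong
    return entropy_bits(prior) - expected_after


if __name__ == "__main__":
    # Starting from a coin-flip belief, one question yields well under one bit.
    print(expected_information_bits(prior=0.5))
    # The same question with guessing made impossible is more informative.
    print(expected_information_bits(prior=0.5, guess=0.0))
```

Under these assumed numbers, a correct answer moves the belief up only part of the way, because guessing stays a live explanation. Many questions per unit of knowledge are needed before the estimate settles.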

And the beautiful thing about computers and AI is that it can scale at minimum incremental cost. So you can imagine a world in the future in which the son or daughter of a billionaire, or the son or daughter of a poor African villager, have equal access to the best education. We’re seeing that today in knowledge, right, because Larry Page, founder of Google, has access to the same knowledge and information that the poorest person on Google has. It’s a flattening of this capability.

Let’s ignore the issues of access to technology for now, that is, assume our villager does have internet access (uncensored and not prohibitively slow). Do they choose to access the knowledge? When they access the knowledge, do they have the background to understand it, or the means to put the knowledge into action? Sometimes, yes, and the whole project may be worth it for those cases, but when we’re talking about education being solved and done for everyone, there is no precedent here.

So AI for me is the answer to global dematerialized, demonetized, and democratized education. We have to separate learning things from actually socialization and being inspired and so forth. Humans are going to be part of that, always will be, but AI is going to be the way that I learn something. Or an AI can really deliver the information in a way that’s compelling and meaningful. In fact we’re going to have a situation where an AI may be watching my pupillary dilation or how I tilt my head or asking me questions to really understand, did I understand that concept, or was I just faking it by nodding my head. I mean how many times are you speaking to someone and they’re trying to teach you something and you say, “Yeah yeah yeah”, and really in the back of your mind you’re going, “I have no idea what this person just said.” I think education driven by neuroscience and by artificial intelligence will know that you didn’t get it, will back up to the point where you lost the idea, and then bring you step by step so you really do learn these things.

By now our picture in the medical world is rather comical. Imagine a personalized medicine system that, upon checking your vitals and determining the effects of the medication aren’t taking hold, retracts its robotic arm, refills the syringe, and injects you again, over and over, hoping one of these times will work.

If this AI vision doesn’t just mean repeating the instruction at the point of (detected) failure, then is there a map from the context that technology could infer to something “more meaningful” for the student? That’s a challenge for a fully empathetic human who knows the life story of one of their students. Well beyond Turing test level.

I think we’re really going to transform education very quickly. And it’s a huge and critically important part of our society, so as the father of two four-year-olds, I am personally passionate and excited about solving that challenge.

The language of “solving that challenge” sums up what’s most flawed in the Silicon Valley vision of education. There is no “education solved” checkbox. To the extent such a solution is envisioned, it is well beyond the grasp of the foreseeable future in the science of human learning or existing AI-driven technology in the field.

I do think there are tremendous opportunities for technology in education. If our goal is to provide a better personalized education, that means we need to be better at diagnosing and treating deficiencies in knowledge and skills. Just as there has been no disruption of medicine by the use of technology, there won’t be for education. But we can get better practice by practice, and tool by tool.

How We Learn: Learning Without Thinking

I’m enjoying How We Learn for tying together quite a bit of what I learned during my year in grad school. The effects of spacing (chapter 4), testing (chapter 5), and interleaving (chapter 8, covered earlier) are powerful for learning, but we know a reasonable way to implement all of them: throw everything you want to learn into a spaced repetition system. What’s been most exciting is chapter 9, Learning Without Thinking, which covers perceptual learning.
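For readers who haven’t used one, a spaced repetition system can be sketched in a few lines. The version below is a toy under stated assumptions, not the algorithm of any particular app: the `Card` class, the rule of doubling the interval after a successful recall, and the reset to one day after a miss are all hypothetical choices for illustration. It touches all three effects: reviews are spaced further apart over time, every review is a self-test, and the due cards from different topics are shuffled together.

```python
import random
from dataclasses import dataclass


@dataclass
class Card:
    topic: str
    prompt: str
    answer: str
    interval_days: int = 1  # current gap between reviews (assumed starting value)
    due_day: int = 0        # day number on which the card is next due


def review(card, today, recalled):
    """Testing + spacing: widen the gap on success, start over on failure."""
    if recalled:
        card.interval_days *= 2
    else:
        card.interval_days = 1
    card.due_day = today + card.interval_days


def todays_session(cards, today, seed=None):
    """Interleaving: collect every due card, then mix topics together."""
    due = [card for card in cards if card.due_day <= today]
    random.Random(seed).shuffle(due)
    return due


if __name__ == "__main__":
    deck = [
        Card("chess", "Name of the opening after 1. e4 c5?", "Sicilian Defence"),
        Card("trig", "sin^2(x) + cos^2(x) = ?", "1"),
    ]
    for card in todays_session(deck, today=0, seed=1):
        review(card, today=0, recalled=True)
    print([(card.topic, card.due_day) for card in deck])
```

Real systems tune the interval per card with more care, but the basic loop of test, reschedule, and mix is the same idea.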

School education is skewed to verbal and symbolic learning: tests require you to explain your answer or work out steps of math. Perceptual learning changes the focus to visual information. I’ve covered perceptual learning previously in the rather obscure realms of Stepmania and chick sexing, but it applies to almost anything. To see how powerful perception is as a component of domain expertise, consider chess. Quoting Carey:

On a good day, a chess grand master can defeat the world’s most advanced supercomputer, and this is no small thing. Every second, the computer can consider more than 200 million possible moves, and draw on a vast array of strategies developed by leading scientists and players. By contrast, a human player–even a grand master–considers about four move sequences per turn in any depth, playing out the likely series of parries and countermoves to follow. That’s four per turn, not per second. Depending on the amount of time allotted for each turn, the computer might search one billion more possibilities than its human opponents. And still, the grand master often wins. How?

He quotes a sketch of an answer from Chase and Simon’s 1973 study of perception in chess, “The superior performance of stronger players derives from the ability of those players to encode the position into larger perceptual chunks, each consisting of a familiar configuration of pieces.”

What does that mean? We don’t have a verbal or symbolic understanding of this ability, so it eludes the primary mode of computers, education, and–unfortunately for me–blog posts. We see the visual information of the board, and it activates different sizes of “chunks” in our mind. These chunks perhaps roughly correspond to levels of abstraction. A small chunk is that there is a black pawn on g4. A little larger is seeing the king in check. A big, powerful, supercomputer-beating chunk is some kind of dominant offensive pattern formed by the combination of white’s positions across the board.

…And how do we learn these chunks–in a way that hasn’t translated to the performance and algorithmic sophistication of computer systems? I think we’re still in the early stages of understanding that, but the next stop on my reading list is papers from the Human Perception Lab.

Miracles through empathy and persistence

For me, the first principle of teaching is captured by John Holt’s metaphor from How Children Fail: “To rescue a man lost in the woods, you must get to where he is.”

I’ve been hearing many stories about very nontraditional “students” who seem lost beyond hope. The Radiolab episode “Juicervose” (covering a story I first heard about from the NYT) tells how an autistic boy used Disney movies to start communicating with his family. After endless watching of movie after movie, repeated time after time, the boy finds the first phrase to reach out. Once his father figures out what’s going on, he takes the role of a Disney character to start really speaking with his son for the first time in years.

Some other examples (for some reason all podcasts): from the same episode, parents spend 900 hours imitating the self-stimulating behaviors of their autistic child before achieving eye contact. In Radiolab’s “Hello”, a woman lives with a dolphin in order to teach it to talk. In This American Life’s “Magic Words”, a couple use improv to speak to their mother who suffers from dementia. In Invisibilia’s “The Secret History of Thoughts”, a boy in a vegetative state is cared for every day by his father until things start to turn around (this one is a must listen).

In all these cases, the lost man is very deep in the woods indeed. For a while, it looks to the searchers like all of the walking in the woods is getting nowhere. They call out his name for the hundredth or the thousandth time, and this time, finally, there’s a response.

I think the principle applies not just to teaching but to self-learning as well. As learners, we must be mindful of where resources assume we are currently in the process. When we practice skills, we must have enormous patience and allow ourselves to slowly work our way forward from wherever we happen to start (instead of comparing ourselves to others).

How We Learn: Being mixed up

Chapter 8 of How We Learn describes interleaving as a better means of practice. In math education the Saxon textbook is an example. It uses a mix of practice problems, combining everything learned so far, as opposed to typical textbooks where all problems are about one lesson. Not only does this better improve the skills being learned, but students now need to recognize which strategy to use for each problem, and (perhaps as a result) they tend to better apply the skills in other contexts.
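The difference between the two kinds of problem set is easy to state in code. This is only a sketch with made-up names (`blocked_set`, `interleaved_set`, and the `problems_by_lesson` dictionary are hypothetical); it is not how Saxon or any other textbook is actually assembled. A blocked set draws every problem from the current lesson, while an interleaved set samples from everything covered so far.

```python
import random


def blocked_set(problems_by_lesson, lesson, n):
    """Typical textbook practice: every problem comes from one lesson."""
    return [(lesson, problem) for problem in problems_by_lesson[lesson][:n]]


def interleaved_set(problems_by_lesson, lessons_so_far, n, seed=None):
    """Mixed practice: sample problems from all lessons covered so far."""
    pool = [
        (lesson, problem)
        for lesson in lessons_so_far
        for problem in problems_by_lesson[lesson]
    ]
    return random.Random(seed).sample(pool, min(n, len(pool)))


if __name__ == "__main__":
    problems = {
        "fractions": ["1/2 + 1/3", "3/4 - 1/8", "2/5 * 5/6"],
        "exponents": ["2^5", "3^2 * 3^3", "(2^3)^2"],
        "linear equations": ["2x + 3 = 11", "5x - 4 = 2x + 8", "x/3 = 7"],
    }
    print(blocked_set(problems, "exponents", 3))
    print(interleaved_set(problems, ["fractions", "exponents"], 4, seed=0))
```

With the interleaved set, the student has to work out which lesson each problem belongs to before solving it, which is the extra skill the chapter credits for better transfer.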

It reminds me of my experience with high school math team. We’d do tests from past years of competitions: 25 questions on a variety of topics. Since these were graded more by participation than percent correct, each person could grow at their own pace. I was able to get through the bulk of tests quickly and spend some time deeply thinking about questions beyond my capacity. One that I always remember is discovering my own approach to trigonometry identities that involved manipulating triangles.

We’d do these tests in the morning and then have people put them up on the board in the afternoon. (Yes, we had two periods of math team plus the regular math class.) I felt this was another benefit – someone around your level could explain a problem as they may have figured it out for the first time. And I would try and fail to explain my homebrew approach to trig.

The chapter also contrasts the conservative and progressive approaches to math education. The progressive supposedly favors conceptual skills like number sense while the conservative builds up from concrete, procedural skills. (I also recommend commentary from Math With Bad Drawings.)

The method of learning for math team was strongly in the conservative tradition. We even had a “formula book” that contained formulas that could solve probably 80-90% of the test without much further thought. Sometimes deeper thinking did happen – not just when I was bored and unprepared for trigonometry but when inspired by good question writing that demanded using the material in new ways. (One competition that I highly commend is Mandelbrot.) I don’t think these kinds of questions could have been approached without that base of knowledge. Of course the balance is hard to strike: by senior year, some of us were pushing back against being taught with such focus on these more formulaic problems.