Generative AI tools built on large language models are increasingly "intelligent" yet lack a baby's common sense – the ability to non-verbally generalize to novel situations without additional training. What can developmental science contribute to AI? Tech journalist and former CASBS fellow John Markoff chats with 2023-24 CASBS fellow David Moore, a developmental scientist with expertise in infant cognition, on evaluating the efforts of DARPA's Machine Common Sense program as well as prospects and concerns associated with creating AIs with common sense.
DAVID MOORE: Personal website | Claremont Infant Study Center | Wikipedia page
DARPA Machine Common Sense program
Related resource:
David Moore, et al. "Leveraging Developmental Psychology to Evaluate Artificial Intelligence," 2022 IEEE International Conference on Development and Learning (ICDL), Nov. 2022. DOI: 10.1109/ICDL53763.2022.9962183
Recommended by David Moore:
Esther Thelen and Linda B. Smith. A Dynamic Systems Approach to the Development of Cognition and Action. MIT Press, 1994.
Read John Markoff's latest book, Whole Earth: The Many Lives of Stewart Brand (Penguin Random House, 2022)
Narrator: From the Center for Advanced Study in the Behavioral Sciences at Stanford University, this is Human Centered.
AI models can generate text, imagery, and videos that are nearly indistinguishable from those produced by humans. And even while we've seen recent iterations of AI models become even more powerful by the week, they still lack what we think of as common sense. They're unable to generalize to novel situations without additional training, an ability which human infants possess even before they can use language.
It's no surprise then that a small number of computer scientists are asking, what if AI could learn like an infant? But how exactly do infants learn? And how would we translate those processes to building better AI models?
Today on Human-Centered, a conversation with 2023-24 CASBS fellow David Moore, a psychology professor emeritus at Pitzer College and a developmental scientist with expertise in infant cognition. Among other things, Moore has served as Director of the National Science Foundation Developmental Sciences Program, as a director and founder of the Claremont Infant Study Center, and is a fellow of the American Psychological Association. From 2018-24, Moore served on an evaluation team for the Machine Common Sense program sponsored by the U.S.
Defense Advanced Research Projects Agency, also known as DARPA. His experience with the part of the program focused on nonverbal intelligence informs much of today's episode. Joining David Moore in conversation is former New York Times tech journalist and Pulitzer Prize winner John Markoff, a 2017-18 CASBS fellow and an occasional guest host here on Human Centered.
As you're about to hear, the two explore the fuzziness of defining intelligence and common sense, the differences in philosophies and approaches among teams participating in the Machine Common Sense program, how the program would be different if launched in 2025, larger concerns about creating AIs with common sense and the rarity of using developmental approaches in the field, and the prospect of replicating the kind of scaffolding that a human adult contributes to a child's development. So what exactly can developmental science bring to AI? Are we moving toward an AI with a baby's common sense?
Let's find out.
John Markoff: Why don't we start by my asking you to describe DARPA's Machine Common Sense program and your role in it, just generally. What was DARPA trying to do? And I'd like an update too.
David Moore: DARPA was trying to get developmental scientists into the room with computer scientists because so often the work we do is siloed. And they realized that one of the reasons machines back in 2018, when the program began, didn't have much common sense was that machines have no way to develop common sense the way children do. And so the thought was: get experts in child development to talk with the people coding up AIs and see if they could help improve their common sense.
John Markoff: Yeah. And has the project now concluded?
David Moore: It has. It ended in February of this year, 2024.
John Markoff: Oh! Recently? Yeah. Okay. And so maybe sort of step back to 20,000 feet and tell me what was learned and where you think we are now.
David Moore: I think different people would tell you different things were learned. And there were people who I think would say the program was a failure on the grounds that the machines did not wind up with a clear-cut common sense of the variety that you see in young children. I think other people thought that the program was a success.
You had asked what my role was, and I was on what was known as the evaluation team. There were basically four teams in the branch of the program that I was involved in, and three of them, you could think of it for all intents and purposes, as if they were coding artificial babies. And my job with the evaluation team, the fourth team, was to code an artificial world in which the artificial babies could get tested.
And as a psychologist, I'm not a coder, so we had, like in every team, there were developmental scientists and computer scientists working together, and that was true of the evaluation team as well. So we would give suggestions to coders who would create the artificial worlds in which these babies were tested. And our job as the psychologists on the evaluation team was to come up with experiments, kind of like the experiments we use to test real babies.
I would say having described now the evaluation team, I think we all thought that the program was successful because we developed evaluations that were very different than any other AI evaluations that had been developed up to that time. And the DARPA, honestly to my surprise, the DARPA people seemed to think that the work that we had done was at least as important as the work that was being done on the AI side because they realized that there was value to very rigorous, experimentally controlled evaluations. And by and large, the AI community has not been doing it that way.
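To give a concrete feel for the kind of evaluation the team describes, here is a minimal, hypothetical sketch in Python of a violation-of-expectation style trial, the paradigm infancy researchers use to probe commonsense physics. The environment, the probabilities, and every name in it are illustrative assumptions; this is not the MCS program's actual evaluation code.

```python
# Hypothetical sketch of a violation-of-expectation trial.
# Not the MCS program's actual evaluation code; names and numbers are illustrative.
import math
from dataclasses import dataclass

@dataclass
class Trial:
    description: str
    outcome_probability: float  # how likely the agent's world model thinks this outcome is

def surprise(trial: Trial) -> float:
    """Higher surprise for outcomes the agent's world model considers unlikely."""
    return -math.log(max(trial.outcome_probability, 1e-9))

def run_paired_trials(world_model: dict) -> None:
    # Paired events, as in infant looking-time studies: one expected, one "impossible".
    expected = Trial("ball rolls behind the screen and reappears", world_model["reappears"])
    impossible = Trial("ball rolls behind the screen and vanishes", world_model["vanishes"])
    # An agent with commonsense physics should be far more surprised by the impossible event,
    # just as infants look longer at outcomes that violate object permanence.
    print(f"surprise (expected):   {surprise(expected):.2f}")
    print(f"surprise (impossible): {surprise(impossible):.2f}")

if __name__ == "__main__":
    # Toy world model that assigns high probability to object permanence holding.
    run_paired_trials({"reappears": 0.95, "vanishes": 0.05})
```

The idea is simply that an agent with a commonsense model of the physical world should register far more surprise at a physically impossible outcome than at an expected one.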
John Markoff: Over the course of, you know, let's go back to the 2015 to 2018 period when the program started: what's our understanding of what common sense means, just at a basic level? Did it evolve as a result of this project, or has it evolved during the course of the project? Our understanding of common sense?
David Moore: Yeah. So that's a very difficult problem. And it was a problem we were faced with right out of the gate. And there was a lot of disagreement and we went back and forth. And I would say that across the four to five years of the program, there might have been evolution in the case of individuals who were working on the program.
But I don't know that there was ever an agreed upon idea that changed in a dramatic way. Instead what happened was we all had different ideas but agreed on certain basics. And one of the basic things we agreed on was that common sense requires the ability to generalize to novel situations without any additional training. And that remained a core idea for us.
And so common sense is a big enough concept that people had a lot of different ideas about what constitutes it. And individuals' ideas about those fringe parts of the concept might have changed over the period. But the one thing we all continued to agree on was that it required generalization.
John Markoff: OK. And then let's jump to this other concept of intelligence, which is right at the heart of your world. Is there still that kind of spectrum of opinion on what intelligence is? Or is there any more general agreement?
David Moore: I don't think so. I continue to think that there's a lot of disagreement. Psychologists are of different minds about this.
I'm a developmentalist. And so I come at this question from a slightly different perspective than, for instance, a cognitive psychologist. And I think there are cognitive psychologists who will tell you we do know what intelligence is.
But I think other kinds of psychologists might come in and say, well, wait a minute, there's this whole other approach. Cognitive psychologists might dismiss those. But altogether, I think there continues to be disagreement about how to define intelligence.
John Markoff: And where are you with this? I think within the AI community, there's this general notion that there are different kinds of intelligence, to make room for something called machine intelligence, that that is its own thing. Is that a workable kind of schema to you?
David Moore: Yeah, I think so. For at least 40 years, we've had this idea of multiple intelligences, which we can trace in a variety of directions. But I tend to think of Howard Gardner, who was at the Harvard School of Education back in the 80s.
And so we've had this idea that human beings have multiple kinds of intelligences. This is a totally different kind of intelligence, but I think it's fair to characterize what has happened since the arrival of ChatGPT as a kind of machine intelligence. The AI community is really split on this, whether or not this is simply a tool, in which case we don't care how much it looks like human intelligence.
As a matter of fact, the less it looks like human intelligence, the better, right? Because it's smarter than we are and doesn't have the problems. But if you want something with, quote unquote, common sense, then presumably you need something that's a little more human-like.
John Markoff: Were you guys… when the Machine Common Sense project started in 2018, were you cognizant of all the work that Doug Lenat had done going back? I mean, this traces us back to Stanford. He was a student of Ed Feigenbaum's.
I think you've mentioned Moravec's and McCarthy's easy and hard conception of challenges in the field of problem solving. Was Lenat’s work a framing or a foundational basis, or did you start from scratch?
David Moore: Neither of those. And the reason is because Lenat was on the program.
John Markoff: Oh, still going.
David Moore: Yeah, he was. He's no longer with us.
John Markoff: That's right. That's right.
David Moore: Yeah. But he was. And when I mentioned that there were four teams, three that were making the babies, one that was an evaluation team, that was just my branch of the program. There was a whole other branch of the program that was focused on natural language processing. And Doug's part was related to that. And so there was a part of the MCS program that was very much built on the kind of work that he'd been doing for decades.
But the part that I was involved in was more focused on nonverbal intelligence. And that is because everyone who was involved in that branch of the program, if they were psychologists, they were infancy researchers. So the reason I was involved was because I've always worked with babies between like six and nine months of age.
These are organisms, if you're comfortable calling them that, babies, that don't speak. And so their intelligence has to be non-linguistic, but they're still intelligent because they're functioning in the world in ways that allow them to survive, provided they have parental help. And so our goal was to try to get this kind of non-verbal physical intelligence into machines. And Doug Lenat’s work really wasn't all that relevant to that.
John Markoff: So you've done important work on this sort of spectrum of nature-versus-nurture interpretation. And in listening to your talk to the CASBS fellows, I took away something where I got the sense that you felt there might be some things that are innate. And you were describing what human infants do at one point. And I think you said that maybe they had some conception of physics. And I began to think about, do we come innately into the world with an understanding of physics? Is that possible?
David Moore: That depends on how you define innate, it turns out. And that turns out to be a very complex topic. So when some people say innate, they mean you're born with it. When some people say it's innate, they mean it's genetically determined. When some people say innate, they mean it's inevitable. So for instance, I was not born with a beard, but some people would say it's innate because I was bound to have one.
And I would say that there are some data that suggests that babies, certainly by the age of six months, have some understanding of how the physics of the world works. And so some people would call that innate. I actually don't.
I don't like to use that word partly because it means different things to different people and partly because I am convinced that if you call a characteristic innate, it effectively short circuits developmental analysis of the origins of that trait. So if you think it's innate, we don't need to study where it comes from. It's just sort of given.
But I think as developmental scientists, our job should always be to figure out how it is that something develops. Because when you are an egg that has just been fertilized, you don't have any real characteristics at all. And so all of our characteristics need to develop somehow. And I think our job as developmental scientists is to understand that process.
John Markoff: Okay. So that raises some interesting parallels. What I didn't understand from your talk completely, I saw sort of behavioral examples of how the different babies, synthetic babies, if you will, behaved in your environment. And they were remarkably different in interesting ways. The straight line, the sort of random walk, the scissors. And that was all I knew about the programs.
But could you describe: were these things just neural nets? Or was there, you know, some symbolic reasoning going on too? What was programmatically happening in those synthetic creatures?
David Moore: I would love to know, but I can't actually tell you.
John Markoff: You were on the other side of the wall.
David Moore: Exactly. Yeah, as the evaluation team, we were not really privy to the tools that the other teams were using to create their artificial babies.
John Markoff: Okay. But it was interesting… from what I saw, they had very different strategies that were embedded in them, however they got there.
David Moore: They did. And I can't tell you about the coding, but I can tell you a little bit about the theoretical orientations of the teams. So the team from Harvard, MIT, and IBM, they had what I would call a little bit more of a nativist orientation, which is to say they were informed by these ideas that babies are born with certain characteristics.
And so I think they were trying to build certain things into their artificial babies. In contrast, the team from Berkeley, which was also working with a group at the University of Michigan, they were much more interactionist. They had a theoretical orientation that bought into the notion that we develop the common sense we do through our interactions with the world.
And the team from Oregon State, which was also working with some people at NYU and Utah, they had a similarly interactionist perspective, but they were much more focused on motor development. They believed that the things that you do in the world dramatically influence your cognitive development. And so as a result, they were doing a whole lot of work with robots that was not being done by the other teams.
John Markoff: So this is sort of a slightly personal anecdote. My mom was a special education teacher, and she was trained in the Piaget framework. And I've always been struck by one thing that she said to me.
She spent three decades working at these Palo Alto schools. And one of the things she said to me was that by the time the kids got to her in the first grade, it was too late, that they had not picked up the language skills because of class and other kinds of reasons. And I've always thought about that.
And in 2010 or so, I ran into a researcher at UC San Diego by the name of Javier Movellan. I was wondering if you were aware of his work at all. He was trying to use robots to give infants language. And the idea was that you could use a robot where the family didn't provide it. I come from an environment where we had very rich interaction early on; if you didn't have that, you wouldn't get those skills. He was trying to use the robots to substitute for it. He was doing this long before language models. And I began to think, wow, wouldn't it be interesting to run that experiment now, when you have, you know, much more powerful models. Is this stuff being done at all?
David Moore: So I don't know. But I can tell you that people have tried to use various kinds of non-human technologies through the decades to help with language development, and to the best of my knowledge, they have failed. Babies don't really learn language from television, from movies.
They really seem to need the actual kinds of human interactions that normally give rise to language in order to develop it. So if I had to go out on a limb and make a prediction, my guess is that the robots are not going to be all that good at bringing about normal language development in a baby. But the field is changing very fast and the robots are getting better and better. So who knows what might happen, right?
John Markoff: Yeah. Ok, so it's an open field still.
David Moore: Yeah. Let me address your Piagetian question, though, because I found myself responding to your anecdote in two ways simultaneously. On the one hand, I am committed to the idea that it's really never too late. We remain very plastic in our brains well into adulthood. Even as an older person, I'm still learning new things, even though it's harder for me than it used to be. But I am convinced that it's never too late.
So in that sense, I don't think your mom was right. On the other hand, I think she was absolutely right that first grade is too late in the sense of you've missed a lot of what happens in those first six years. And I've come to think of development as a little bit like building a building.
And if you don't have a good foundation, the building that you're going to put on top of it is going to be less stable than a building on top of a stable foundation. And so if you don't get good language skills in those first few years, you're going to be at a disadvantage. But it doesn't mean that you're out of luck, right?
You just might need a specific sort of experience for a more extended period of time later into childhood and adolescence in order to get good at language, for instance.
John Markoff: The project started in 2018. Here we are now, six years later. We've had six years of experience with language models. There was something I just heard about building common sense that suggested you might be able to extract common sense understanding from language models. Have things changed at all? Would you do things differently if you started now as opposed to six years ago?
David Moore: Absolutely. So there were these two branches, the non-linguistic part that was trying to effectively create artificial babies, and then there was the natural language part of the program. The natural language part of the program, it was almost as if a bomb went off in the middle of the program, because as soon as the OpenAI LLM hit the internet, the work that they were doing changed dramatically, because they were trying to solve the very problems that it suddenly appeared had been solved.
So yeah, if you were starting the program today, you might not have that part of the program at all. There were some of us who were concerned that the funding for that part of the program was going to be cut because it seemed like it was obsolete, like the natural language problems had all been solved. It turns out that they hadn't, and so the funding for that part of the program continued.
But over on the infancy side, things were very different. We were not all that affected by the arrival of the LLMs, because we were working with a non-linguistic kind of intelligence. I would argue that the common sense that machines were lacking back in 2018 was of both the linguistic and the non-linguistic variety.
Now I think, I have to tell you personally, I am kind of stunned at the ability of the LLMs to generate commonsensical responses to verbal prompts. They're really good. And even though I certainly don't think they're sentient or actually understand anything, I think they give commonsensical responses in many cases, maybe even most cases. But over on the non-verbal side, I think there's still a long way to go. I think they don't really function well in the world yet.
John Markoff: So, questions in two directions. Where would you put yourself on the spectrum in terms of the state of language models? I mean, there's Emily Bender at the University of Washington, who refers to stochastic parrots and suggests this is all pattern recognition. And then go to Microsoft, for example, who were, you know, saying they already saw sparks of AGI. Do you tend toward either end of that spectrum?
David Moore: I tend more toward the stochastic parrot side. But I am struck by how effectively a stochastic parrot can pass the Turing test. And look for all intents and purposes as if it actually understands things. You know, at some point an automaton can start to look real and fool people who don't know better.
John Markoff: Well, yes. As a developmental psychologist, this notion, this problem of anthropomorphism: are we particularly attuned to it? I mean, my sense is that we'll have anthropomorphic interactions with any kind of object in the world. There's something about humans that wants to interact with things in a human-like way. I mean, I don't know this; it's just my hunch.
David Moore: Yeah, I see that all the time, in both children and adults, people interacting with inanimate objects as if they have thoughts and feelings. People talk to their stuffed animals, people name their automobiles, they believe they have personalities. Yeah.
John Markoff: It dramatically lowers the bar for the Turing test, is my sense. That's right. The bar is artificially low.
And well, let's move into sort of a sociological question. So here we are in this world where more and more of our interactions are being mediated by machines. And now that we've solved the problem of language, it's the new interface. And how does that change society? I mean, have you begun to think about the way as these things filter into society that the world may change?
David Moore: I have a little bit, not the way I know some people have because I'm not a sociologist. And I do believe most of what I've read, which suggests that it's going to have dramatic influences across a wide variety of spheres. The primary way that I've thought about it is in terms of danger because I've been concerned since the beginning of this project.
I actually talked to my dad who was in his 80s at the time and said, should I get involved in this project? I'm a little bit afraid of like, what if we successfully create a machine that has common sense and is potentially a danger to us? And he encouraged me to do it. I'm glad he did because by the time the program had ended, I felt confident we had not created anything with common sense. So I started feeling like, okay, I didn't bring any evil into the world. But I can tell you that for a very long time, I thought, as long as the machines are not intelligent, we probably don't need to worry too much because they're just not all that bright.
I have since changed my tune on that. And I now think that if people think the machines are intelligent, we're in trouble, even if they're not. And I think people already think the machines are intelligent.
So I think there is a risk that people will start to trust these systems. In fact, we know that they already have in certain domains and that there have been people who have been hurt by the use of these kinds of algorithms. I think it's important that we be able to trust our machines. And I think we can't yet, but I think people think we can.
John Markoff: One of the things that, the way humans learn, if I put my hand on a hot stove, that's the last time I'm going to do that. I guess that's called single-shot learning. Where are we with that kind of learning? And is there any, I mean, I guess, Josh Tenenbaum began doing research on it. I haven't tracked that. And is there meaningful progress in machines doing single-shot learning?
David Moore: To the best of my knowledge, that is still an outstanding problem that no one has made much progress on. I think Josh is certainly to be credited for recognizing that that's something that people do that machines don't do yet and working toward a solution to that problem. But to the best of my knowledge, machines are still not good at one-shot learning.
John Markoff: And in the human learning community, is there any good understanding of what's going on that lets us learn so quickly?
David Moore: Well, I think that most people recognize that we have an evolutionary past that has given us both genes and environments that together give us receptors that allow us to detect pain, for instance. And that as a result, we're able to learn exceptionally quickly not to touch the hot stove. Yeah. In the same way that we are born with the ability to breathe, we're born with the ability to detect dangerous stimuli and to shy away from them.
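As a rough illustration of the gap being described, here is a small, hypothetical Python sketch contrasting one-shot avoidance learning with slow, incremental value estimation. The reward numbers and the update rule are assumptions chosen for illustration, not a claim about how any particular system or brain works.

```python
# Hypothetical sketch: one painful experience versus many small incremental updates.
PAIN = -10.0  # reward for touching the hot stove

# One-shot learner: a single pain-like signal rules the action out for good.
avoided: set[str] = set()
def one_shot_update(action: str, reward: float) -> None:
    if reward <= -5.0:
        avoided.add(action)

# Incremental learner: nudges its value estimate a little after every experience.
value = 0.0
def incremental_update(reward: float, alpha: float = 0.1) -> float:
    global value
    value += alpha * (reward - value)
    return value

one_shot_update("touch_stove", PAIN)
print("one-shot learner avoids after 1 trial:", avoided)

for _ in range(20):
    estimate = incremental_update(PAIN)
print(f"incremental estimate after 20 trials: {estimate:.2f} (true value {PAIN})")
```

The one-shot learner needs a single bad experience; the incremental learner is still converging after twenty.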
John Markoff: You know, DeepMind has done such great work with protein folding, and stuff is now beginning to sort of pour out of that. Could you take DeepMind-like tools and apply them to genetic information or epigenetic information and do that kind of extraction? Is anybody doing that kind of work with the human genome?
David Moore: I am not aware of people doing that kind of work, but I strongly suspect that people are doing that kind of work, because at this point, everybody recognizes how powerful artificial systems like this are, and I'm sure that there are people who are putting those kinds of tools to use in the domain of genetics and epigenetics. But there's a lot to still understand about both what's happening in molecular biology at the genetic and at the epigenetic level. So I think it's going to be a while.
As complex as protein folding is, and as much trouble as we had solving it, even though the machines are now successfully doing that, the kinds of problems at the root of genetics and epigenetics, I think, are much deeper. I think it's going to take a while before AI is really solving those problems the way it's solving protein folding problems.
John Markoff: So I don't know if you knew about this project at Google, but there was a project at Google X that really mirrored what you were doing on the Machine Common Sense project. It was called Everyday Robots.
David Moore: Did not know about that.
John Markoff: So the guy who ran it was a good friend of mine. Google shut it down after a while. But what struck me, in reading or in listening to your presentation about your framework, the evaluation framework you created for these things, is that they had physical robots at Everyday Robots.
And then DeepMind supplied them with a synthetic environment. And they would run the robots millions of times on simple tasks every night. And they would get all this machine learning data to fold back into the machine behavior in the real world.
David Moore: So wait, these were physical robots and DeepMind was providing non-physical environments?
John Markoff: They were putting synthetic versions of those robots into these environments and then running them millions of times every night to try to evolve the behavior. And they were making great progress.
David Moore: Interesting.
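The loop Markoff describes has a familiar shape in robot learning: massive, cheap practice in simulation, with the updated policy periodically checked on the physical robot. Here is a minimal, hypothetical sketch of that loop; the class names, the toy task, and the learning rule are assumptions for illustration, not Everyday Robots' or DeepMind's actual code.

```python
# Hypothetical sketch of a train-in-simulation / deploy-on-robot loop.
class Policy:
    def __init__(self):
        self.param = 0.0
    def act(self, observation: float) -> float:
        return self.param * observation
    def update(self, experience, learning_rate: float = 0.5) -> None:
        # Stand-in for a real learning step: averaged gradient over logged experience.
        grad = sum(error * obs for obs, error in experience) / len(experience)
        self.param -= learning_rate * grad

class SimEnv:
    """Cheap synthetic environment: run the policy many times overnight."""
    def rollout(self, policy: Policy, n_episodes: int):
        experience = []
        for i in range(n_episodes):
            obs = (i % 10) / 10.0
            target = 0.5 * obs                      # the behavior we want the robot to learn
            error = policy.act(obs) - target
            experience.append((obs, error))
        return experience

def real_robot_rollout(policy: Policy, n_episodes: int) -> None:
    # Placeholder for slow, expensive physical trials used to sanity-check the policy.
    pass

policy, sim = Policy(), SimEnv()
for night in range(20):
    experience = sim.rollout(policy, n_episodes=100_000)  # massive simulated practice
    policy.update(experience)
    real_robot_rollout(policy, n_episodes=10)             # a few real-world checks by day
print(f"learned parameter: {policy.param:.3f} (target 0.5)")
```

The design point is simply that the expensive physical robot sees only a handful of trials, while the simulated environment absorbs the millions of repetitions needed for learning.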
John Markoff: Not enough progress to, you know, survive in the Google ecosystem, but... And I asked him this morning if he knew about you guys, and one of the things he asked is whether you had ever considered putting an adult into the environment, because they obviously were doing some supervised learning. I didn't see one in your environment.
David Moore: There was not one. We, the psychologists on the evaluation team, really wanted to be able to do such a thing, because the kind of scaffolding that a human adult can offer a child is enormously important in that child's cognitive development. And so providing these artificial babies with that sort of context would have made a big difference in our opinion.
But the AIs needed an enormous amount of training data, and there would not have been any good way to provide them with that in a non-automated sort of way. So we were unable to do that. We did have agents in the world, but they were sort of like NPCs, non-player characters, if you know gaming. And so they were not providing the kind of context that a real human would.
John Markoff: I understand that you're about to embark on a Fulbright. What's going to occupy your time on that project?
David Moore: So it turns out that there was a siege in Northern Netherlands in the 1940s when the Nazis stopped letting any food in. And so there was a particular day when a terrible famine began, and pretty soon people were consuming less than 500 calories per day. Everybody was starving.
And the famine ended at a very specific moment also when the Allies arrived. So you wound up with this kind of natural experiment where there were women who were pregnant during the siege, and some of them were in their third trimester when they were being starved, some were in their second trimester, some in their first trimester. And so you could look and see what is the effect of undernutrition on a developing fetus.
And of course, there were all kinds of effects, but what became apparent as the years went by was that the children of those children also showed effects. So there are these transgenerational consequences of the famine, and everyone thinks that the likely mechanism for those transgenerational effects was an epigenetic one. And because I've been interested in epigenetics for well over a decade, I'm going to go to the Netherlands, look around in the archives in the town where I'll be living, and see what kind of data I can find.
John Markoff: And so you're also looking at descendants. So will you do survey data or survey research as well, or will it be just archival?
David Moore: It will just be archival, but I'll be interviewing some other epigeneticists in the country who are doing the kind of transgenerational work.
John Markoff: So one of the things we'd like to ask our guests is in the last year or so, is there a book or a movie or something else that you came across that really had an impact on you, that changed your perspective on the world, or you think is important and you'd like to share with us?
David Moore: So this is a poor answer to your question. But I can tell you about the best thing that I read 30 years ago. And I continue to think that it's a book that people have not read, but that everybody should. It's amazing. It's called A Dynamic Systems Approach to the Development of Cognition and Action.
And when I tell people that my favorite book ever is A Dynamic Systems Approach to the Development of Cognition and Action, they look at me like I'm from outer space. But this book changed my life. It would change anyone's life. It's fantastic. And it is about how important development is and how we typically think about it all wrong. It has two authors, Esther Thelen and Linda B. Smith.
But they studied motor development, and I was interested in cognitive development. And I thought motor development was not particularly interesting because babies pretty much all start to crawl, and then they stand up, and then they start to walk. And it's like it seemed kind of programmed.
And Thelen and Smith's book made it very clear that it looks that way, but it's not that way. And that, actually, the reason we all follow a similar path is because we are living in similar contexts. And she just had a systems approach that I had not really encountered before.
And I've carried it with me. And I continue to think that, both to understand human development and to understand AIs, thinking about things in terms of systems is the better approach, as opposed to how people typically think about them.
John Markoff: So a final question. I noticed at the beginning of your CASBS talk, you raised the possibility of writing a book about this. Are you thinking about writing, whether it's a book about the MCS project or a more general book in the field, are you working on a longer project?
David Moore: I was just talking today with my wife about that, whether I should do that. I've been asked to do a second edition of the book on epigenetics that I wrote. And so now I have those two ideas competing for time.
I'm certainly interested in talking about what developmental science can bring to AI, because I think there are some important ideas there. But it's hard to know if the AI community is going to be ready to hear it, because the fact is, there is nothing equivalent to biological development that is occurring in machines. And so in some ways, it may be hard to take these ideas and apply them in a useful way. So I'm going to talk to my literary agent and say, which of these books do you think is the better way to spend my time?
John Markoff: Yeah. Okay. David, thank you for spending time with us today. It was fun chatting with you.
David Moore: Great.
Narrator: That was David Moore in conversation with John Markoff. As always, you can follow us online or in your podcast app of choice. And if you're interested in learning more about the Center's people, projects, and rich history, you can visit our website at casbs.stanford.edu. Until next time, from everyone at CASBS and the Human-Centered team, thanks for listening.