2026 Ewan Lecture by Prof. Geoffrey Hinton: "Living with Alien Beings"

McDonald Institute
January 29, 2026. 2026 Ewan Lecture by Prof. Geoffrey Hinton: "Living with Alien Beings: How we can coexist with superintelligent AI". Presented by the Arthur B. McDonald Canadian Astroparticle Physics Research Institute at the Isabel Bader Centre for the Performing Arts in Kingston, Ontario.
Hosts: Geoffrey Hinton, Tony Noble, Nicola Aurora
📅 February 04, 2026
⏱️ 01:01:13
🌐 English

Disclaimer: The transcript on this page is for the YouTube video titled "2026 Ewan Lecture by Prof. Geoffrey Hinton: "Living with Alien Beings"" from "McDonald Institute". All rights to the original content belong to their respective owners. This transcript is provided for educational, research, and informational purposes only. This website is not affiliated with or endorsed by the original content creators or platforms.

Watch the original video here: https://www.youtube.com/watch?v=M8RogoEDsQQ

00:00:05 Nicola Aurora

All right, let's get started. Hi everyone, and welcome to the Winter 2026 George and Maureen Ewan Lecture Series. I am Nicola Aurora, and I am the Education and Outreach Officer for the Arthur B. McDonald Canadian Astroparticle Physics Research Institute, or lovingly known as the McDonald Institute. We are a community of scientists, universities, and research labs across Canada uncovering some of the biggest mysteries in the universe by looking at some of the tiniest things in the universe.

00:00:36 Nicola Aurora

Thank you all for coming tonight. I think we have a very topical lecture today, and I hope it inspires lots of learnings and some great discussion here. As I begin, I want to recognize that we are on the traditional lands of the Anishinaabe and Haudenosaunee nations and the many Indigenous people before them. I appreciate the opportunity and the privilege I have had to live out my dream as an astrophysicist to learn about this amazing universe from these lands.

00:01:05 Nicola Aurora

To all of you, I encourage you to reflect on your connections with these lands and the skies above us. I encourage you to explore resources like Whose.land and Native Sky Watchers to learn more about the past, the present, and the future stewards of these lands and their stories.

00:01:22 Nicola Aurora

Okay, before we kick things off, we have a few housekeeping rules to keep in mind. At this point, I would ask you to please turn your ringers off or put your phone on silent mode. This will be a lecture of about 50 minutes. At the end of this lecture, there will be a question and answer session. I want to ensure a constructive and lively discussion during that Q&A period, and I want to give as many people as possible the chance to ask Professor Hinton their questions. So I ask you to limit your questions to one per person. There will be no follow-ups, and please be concise when asking your questions.

00:02:08 Nicola Aurora

We are also observing some health protocols, so at the end of the lecture we ask you to respect the physical space of Professor Hinton and the other speakers involved in this event.

00:02:24 Nicola Aurora

Okay. Now, as I said, this is the Ewan Lecture Series where we will hear from Professor Geoffrey Hinton about how we can coexist with a super intelligent AI. I would first like to welcome to the podium Dr. Tony Noble, a professor at the Department of Physics, Engineering Physics, and Astronomy here at Queen's University, and also the Scientific Director of the Arthur B. McDonald Canadian Astroparticle Physics Research Institute, to say a few words about George and Maureen Ewan and this lecture series in their name. Please welcome Professor Tony Noble.

00:03:05 Tony Noble

It's really nice to see such a large and full crowd here. So, I wanted to say a few words about the lectureship named in honor of George Ewan. George was a professor of physics here at Queen's for many years, and he's an internationally renowned figure in nuclear physics and in particle physics. He was one of the founding members and the first Canadian spokesperson for what was called the Sudbury Neutrino Observatory, or SNO. That was an experiment which was located 2 kilometers underground in Sudbury, which solved what was called the solar neutrino problem and which led to the Nobel Prize in 2015 for our own Professor Art McDonald. As I look down the row and out into the crowd, I see many of the faculty who were involved in delivering that program.

00:04:03 Tony Noble

Through his insights, he decided to endow some funds that would support a lectureship series. He was very keen on engagement with students, and as you'll see, that was part of what he put in place. He was also a pioneer technically; he invented what were called lithium-drifted germanium detectors—which sounds like a mouthful—but basically a photon sensor that was critical to advancing science in nuclear and particle physics, and was also extremely important in medical imaging and that sort of thing.

00:04:46 Tony Noble

So we hold this lecture roughly twice per year. Of course, we took some hits during COVID. It's really meant to serve as a bridge: bringing the public in to hear from some of the world-leading experts on various topics related to what we do at the McDonald Institute, but doing so in a way that the public can really engage with and understand. So it's not a highbrow scientific lecture. This was George's vision: to bring in world-leading experts, connect with the public, connect with the students. Generally speaking, the guests come in and spend a few days here and have an opportunity to meet lots of people while they're here and develop that community a little bit.

00:05:35 Tony Noble

So, to introduce our speaker tonight, I thought it would actually be more appropriate if I just let AI tell me what the bio should be. And we'll see how that works out. So, when I asked AI to write me a short, punchy bio for Professor Hinton, this is what it came up with—and you can see if you think it's short and punchy:

00:06:02 Tony Noble

"Geoffrey Hinton is a British-Canadian computer scientist and cognitive psychologist whose pioneering research has basically defined the modern age of artificial intelligence. Often known worldwide—and I don't know if this is a title you like or not—as the Godfather of AI. Currently a Professor of Physics Emeritus at the University of Toronto, as you see here, and also the Chief Scientific Adviser at the Vector Institute."

00:06:31 Tony Noble

If you don't know the Vector Institute, again, it's a not-for-profit independent institution dedicated towards artificial intelligence research.

00:06:42 Tony Noble

"Over many decades, Professor Hinton championed the development of artificial intelligence using neural networks, even though much of the community at that time was focused on other aspects of artificial intelligence. They achieved a historic breakthrough in 2012 which proved that deep neural networks could vastly outperform traditional methods in things like image recognition."

00:07:10 Tony Noble

He's one of very few people on the planet who holds two of the highest honors: the Nobel Prize in Physics and also the Turing Award in computing, which is often called the Nobel Prize of computing. He was, for a decade, a Vice President and an Engineering Fellow at Google, and then in 2023, he resigned from Google—on very good terms I understand—but it gave him an opportunity to speak freely about some of the things he was concerned about in terms of existential risks, how to incorporate AI safely, how to prioritize responsible development, and these sorts of things. So his work not only revolutionized how machines perceive the world but also changed our understanding in terms of the relationship between artificial intelligence and biological intelligence.

00:08:12 Tony Noble

So that's what AI told me. They missed a few things. The rap sheet is incredibly long, so I wasn't going to go through that. They didn't talk about the fact that he got his Bachelor of Arts in Psychology from Cambridge, a PhD from Edinburgh in Artificial Intelligence. Didn't tell me that he just reached a new milestone which is 1 million citations on Google. So a citation is when somebody takes the effort to read your paper—you know, I'm at about seven or something like that.

00:08:43 Tony Noble

There's one other thing, just because it was a touching thing that I read. I actually heard an interview on CBC, and in that interview, it was mentioned that he had given up half of his award to support water treatment in Indigenous communities in the North. It was particularly poignant, and I added it just in the side notes today because, coincidentally, Henry Giroux—who's not here because he's sick—came up with the idea.

00:09:21 Tony Noble

I don't know if you know, but the Kashechewan people who have been evacuated, many of them are here in Kingston living in hotels. And all the kids and children are bored to death. And he said to his father, who's part of MIW, "Why don't you bring them over to the McDonald Institute and show them your visitor center and stuff?" And I just thought, oh, that's just such a nice connection—you know, trying to improve the lives of our Indigenous populations who are struggling because of things that they shouldn't have to struggle about, like water security and so on.

00:09:49 Tony Noble

So, without any more ado, I'm sure we're going to hear a lot about artificial intelligence and also the human implications of its application. Professor Geoffrey Hinton, I welcome you to the stage.

00:10:12 Geoffrey Hinton

Thank you. Um, I forgot what the title was, but that gives you a sample of two titles for the talk. So I'm actually going to try and explain how AI works for people who don't really understand how it works. So if you do understand how it works, if you're a computer science student or a physicist who's been using this stuff, I guess you can go to sleep for a while, or you can sort of look to see if I'm explaining it properly.

00:10:40 Geoffrey Hinton

Okay, back in the 1950s, there were two different paradigms for AI. There was the symbolic approach which assumed that intelligence had to work like logic. We had to somehow have symbolic expressions in our heads and we had to have rules for manipulating them, and that's how you derive new expressions, and that's what reasoning was, and that was the essence of intelligence. That wasn't a very biological approach. That's much more mathematical.

00:11:06 Geoffrey Hinton

There was a very different approach, the biological approach, where intelligence was going to be in a neural network—a network of things like brain cells—and the key question was how do you learn the strength of the connections in the network. So two people who believed in the biological approach were Von Neumann and Turing. Unfortunately, both of them died young, and AI got taken over by people who believed in the logical approach.

00:11:34 Geoffrey Hinton

So there's two very different theories of the meaning of a word. People who believed in the logical approach believed that meaning was best understood in terms originally introduced by Saussure more than a century ago: the meaning of a word comes from its relationships to other words. So people in AI thought it's how words relate to other words in propositions or in sentences that give the meaning to the words. And so to capture the meaning of a word, you need some kind of relational graph; you need nodes for words and arcs between them, and maybe labels on the arcs saying how they were related.

00:12:12 Geoffrey Hinton

In psychology, they had a very, very different theory of meaning. The meaning of a word was just a big set of features. So for example, the word "Tuesday" meant something; there was a big set of active features for Tuesday, like it's about time and stuff like that. There would be a big set of features for "Wednesday" and it would be almost the same set of features because Tuesday and Wednesday mean very similar things. So the psychology theory was very good for saying how similar words are in their meaning.

00:12:39 Geoffrey Hinton

But those look like two very different theories of meaning: one that the meaning is implicit in how something relates to other words in sentences, and the other that the meaning is just a big set of features. And of course for neural networks, one of these features would be an artificial neuron. And so it gets active if the word has that feature and inactive if the word doesn't have that feature.
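
To make the feature-vector picture concrete, here is a tiny sketch; the particular features and numbers below are invented for illustration and are not taken from any real model. Words with almost the same features ("Tuesday" and "Wednesday") come out as almost the same vector, which is exactly what makes this representation good at capturing similarity of meaning.

```python
import numpy as np

# Toy feature vectors for words; each dimension plays the role of one "feature neuron".
# These particular features and values are invented for illustration.
features = {
    "tuesday":   np.array([1.0, 1.0, 0.9, 0.0, 0.0]),  # time-related, weekday, ...
    "wednesday": np.array([1.0, 1.0, 0.8, 0.0, 0.1]),
    "carrot":    np.array([0.0, 0.0, 0.0, 1.0, 0.9]),  # edible, vegetable, ...
}

def cosine(u, v):
    # Cosine similarity: close to 1 for nearly identical feature patterns.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(features["tuesday"], features["wednesday"]))  # high: similar meanings
print(cosine(features["tuesday"], features["carrot"]))     # low: unrelated meanings
```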

00:13:00 Geoffrey Hinton

They look like different theories, but in 1985, I figured out they're really two different sides of the same coin. You can unify those two theories. And I did it with a tiny language model because computers were tiny then. Actually, they were very big but didn't do much.

00:13:16 Geoffrey Hinton

The idea is you learn a set of features for each word, and you learn how to make the features of the previous word predict the features of the next word. And to begin with, when you're learning, they don't predict the features of the next word very well. So you revise the features that you assign to each word. You revise the way features interact until they predict the next word better. And then you take that discrepancy basically between how well they predict the next word—or rather the probability that they give to the actual next word that occurred in a text—and you take the discrepancy between the probability they give and the probability you'd like, which is one, and you back propagate that through the network.

00:13:56 Geoffrey Hinton

So essentially you send information back through the network and using calculus—which I'm not going to explain—you can figure out for every connection strength in the network how to change it so that next time you see that context, that string of words leading up, or what is now called a "prompt," you'll be better at predicting the next word.
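
A minimal sketch of the learning loop described here, written with a modern toolkit (PyTorch) rather than the original 1985 code; the three-word corpus, the vocabulary, and the vector sizes are illustrative assumptions. Each word gets a learned feature vector, the features of the first two words interact to predict the third word, and the prediction error is back-propagated to adjust both the feature vectors and the interactions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny corpus of three-word "sentences": predict word 3 from words 1 and 2.
# The sentences and vocabulary are made up for illustration.
vocab = ["mary", "john", "likes", "hits", "fish", "chips"]
stoi = {w: i for i, w in enumerate(vocab)}
data = [("mary", "likes", "fish"), ("john", "likes", "chips"), ("mary", "hits", "john")]

emb = nn.Embedding(len(vocab), 8)          # a learned feature vector for each word
interact = nn.Linear(2 * 8, len(vocab))    # lets the two feature vectors interact
opt = torch.optim.SGD(list(emb.parameters()) + list(interact.parameters()), lr=0.1)

for epoch in range(200):
    for w1, w2, w3 in data:
        x = torch.cat([emb(torch.tensor(stoi[w1])), emb(torch.tensor(stoi[w2]))])
        logits = interact(x)                     # scores for every possible next word
        loss = F.cross_entropy(logits.unsqueeze(0),
                               torch.tensor([stoi[w3]]))  # -log p(actual next word)
        opt.zero_grad()
        loss.backward()                          # back-propagate the discrepancy
        opt.step()                               # nudge features and interactions
```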

00:14:18 Geoffrey Hinton

Now in that kind of system, all of the knowledge is in how to convert words into feature vectors and how the features should interact with one another to predict the features of the next word. There are no stored strings. There's no stored sentences in there. It's all in connection strengths that tell you how to convert a word to features and how features should interact.

00:14:41 Geoffrey Hinton

So all of this relational knowledge resides in these connection strengths, but you've trained it on a whole bunch of sentences you got. So you're taking this knowledge about meaning that's kind of implicit in how words relate to each other in sentences—that's the symbolic AI view of meaning—and you're converting it by using back propagation. You're converting it into how to convert a word into features and how these features should interact. So basically you've got a mechanism that will convert this implicit knowledge into knowledge in a whole bunch of connection strengths in a neural network. And you can also go the other way. Given that you've got all this knowledge in connection strengths, you can now generate new sentences. So AIs don't actually store sentences; they convert it all into features and interactions and then they generate sentences when they need them.

00:15:31 Geoffrey Hinton

So over the next 30 years or so, I used a tiny little example with only a hundred training examples, and the sentences were only three words long and you predicted the last word from the first two. So about 10 years later, computers were a lot faster and Yoshua Bengio showed that the same approach works for real sentences—that is, take English sentences and try predicting the next word—and this approach worked pretty well. It took about 10 years after that before the leading computational linguists finally accepted that actually a vector of features is a good way to represent the meaning of a word. They called it an "embedding." And it took another 10 years before researchers at Google figured out a fancy way of letting features interact called a "transformer."

00:16:16 Geoffrey Hinton

And that allowed Google to make much better language models. And ChatGPT—the GPT stands for Generative Pre-trained Transformer. OpenAI then released that on the world. Google didn't release their version on the world because they were worried about what it might do. But OpenAI had no such worries, and now we all know what these models can do.

00:16:42 Geoffrey Hinton

So now we have these large language models (LLMs). I think of them as descendants of my tiny language model, but then I would, wouldn't I? They use many more words as input. They use many more layers of neurons, and they use much more complicated interactions between features. I'm not going to describe those interactions for a talk to a general audience, but I will try and give you a feel for them when I give you a big analogy for what language understanding is in a minute.

00:17:12 Geoffrey Hinton

I believe the ways these large language models understand sentences are very similar to the way we understand sentences. When I hear a sentence, what I do is I convert the words into big feature vectors and these features interact so I can predict what's coming next. And actually, when I talk, that's what I'm doing too. So I think LLMs really do understand what they're saying. There's still some debate because there's followers of Chomsky who say, "No no no, they don't understand anything. They're just a dumb statistical trick."

00:17:44 Geoffrey Hinton

I don't see how, if they don't understand anything and it's a dumb statistical trick, they can answer any question you give them at the level of a not very good and not very honest expert.

00:17:55 Geoffrey Hinton

Okay, so here's my analogy. This is particularly aimed at linguists for how language actually works. Language is all about meaning. And what happened was one kind of great ape discovered a trick for modeling. So language is actually a method of modeling things. And it can model anything. So let's start off with a familiar method of modeling things which is Lego blocks. If I want to model the shape of a Porsche, sort of where the stuff is, I can model it pretty well using Lego blocks. Now, if you're a physicist, you say, "Yeah, but the surface will have all the wrong dynamics with the wind. It's hopeless." It's true. But to say where the stuff is, I can do it pretty well with Lego blocks.

00:18:42 Geoffrey Hinton

Now, words are like Lego blocks. That's the analogy. But they differ in at least four ways. The first way is they're very high dimensional. A Lego block doesn't have many degrees of freedom. They're sort of rectangular. You could maybe stretch them. You could have different Lego blocks of different sizes. But for any given Lego block, it's got a kind of rigid shape and only a few degrees of freedom. A word isn't like that. A word is very high dimensional. It's got thousands of dimensions.

00:19:13 Geoffrey Hinton

And what's more, its shape isn't predetermined. It's got an approximate shape. Ambiguous words have several approximate shapes. But its shape can deform to fit in with the context it's in. So, it differs in that it's high dimensional and that it's got a sort of default shape, but it's deformable. Now, some of you may have difficulty imagining things in a thousand dimensions. So, here's how you do it: what you do is you imagine things in three dimensions and you say "thousand" very loudly to yourself.

00:19:52 Geoffrey Hinton

One other difference is that well, there's a lot more words. You each use about 30,000 of them. And each one has a name. It's very useful that each one has a name because that's what allows us to communicate things to each other.

00:20:07 Geoffrey Hinton

Now, how do words fit together? Well, instead of having little plastic cylinders that fit into little plastic holes—which is how Lego blocks fit together—think of each word as having long flexible arms. And on the end of each arm, it has a hand. And as I deform the shape of the word, the shapes of the hands all change. So the shapes of the hands depend on the shape of the word. And as you change the shape of the word, the shapes of the hands change.

00:20:40 Geoffrey Hinton

A word also has a whole bunch of gloves that are stuck to the word. They're stuck with the fingertips stuck to the word. If you think in terms of Lego blocks, what you're doing when you understand a sentence is you start off with the default shape for all these words, and then what you have to do is figure out how I can deform the words. And as they deform the words, the shapes of the hands attached to them deform so that words can fit their hands into the gloves of other words, and we can get a whole structure where each word is connecting with many other words because we deformed it just right and we deformed the other word just right. So the hands of this word fit in the gloves of that word.

00:21:23 Geoffrey Hinton

This isn't exactly right. This gives you a feel for what's going on in Transformers. Anybody who knows about transformers can see that the hands are like the "queries" and the gloves are the "keys." Yeah, the queries and keys. It's not quite right, but it'll give someone who's not used to transformers a rough feel for what's going on.
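
For anyone who wants the hands-and-gloves picture in code, here is a minimal scaled dot-product attention sketch, with random inputs and made-up dimensions standing in for a trained model. Each word's vector produces a query (roughly, the hand) and a key (roughly, the glove); how well queries match keys determines how much the words deform one another's representations.

```python
import torch
import torch.nn.functional as F

d = 16                                   # feature dimension (illustrative)
words = torch.randn(5, d)                # feature vectors for a 5-word sentence

Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))   # learned matrices in a real model
queries = words @ Wq                     # the "hands" each word reaches out with
keys    = words @ Wk                     # the "gloves" each word offers
values  = words @ Wv

scores = queries @ keys.T / d ** 0.5     # how well each hand fits each glove
attn = F.softmax(scores, dim=-1)         # normalized strength of each fit
new_words = attn @ values                # each word's vector reshaped by its context
```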

00:21:50 Geoffrey Hinton

So the computer—and you when you do it—have this difficult problem of how do I deform all these words so they all fit together nicely. But if you can do that, when they fit together nicely, that structure you've got then is the meaning of the sentence. That structure—all these feature vectors for the words that all fit together nicely—that's what it means to understand a sentence. Of course, for an ambiguous sentence, you can get two different ways of assigning feature vectors to words. Those will be the two different meanings.

00:22:26 Geoffrey Hinton

So in the symbolic theory, the idea was that understanding a sentence is very like translating a sentence from French to English. You translate it to another language, but for the symbolic theory, there was this internal pure language that was unambiguous. All the references of the pronouns were resolved, and for ambiguous words you had decided which meaning they had. That's completely not what understanding is for us. Understanding is assigning these feature vectors and deforming them so they all fit together nicely. And that explains how I can give you a new word and from one sentence you can understand it.

00:23:10 Geoffrey Hinton

Little kids don't get the meanings of words by being given definitions. One of my favorite cartoons is a little kid looking at a cow and saying to his mother, "What's that?" And the mother says, "That's a cow." And the little kid says, "Why?" The mother doesn't have to say why. And we don't know why. We just recognize it as a cow.

00:23:33 Geoffrey Hinton

So, here's the sentence: "She scrummed him with the frying pan." Now, you've never heard the word "scrummed" before unless you've been to one of my other lectures. You know it's a verb because it has "ed" on the end, but you didn't know what scrummed meant. So, initially your feature vector for scrummed was sort of a random sphere where all the features were slightly active and you had no idea what it meant. But then you deform it to fit in with the context. And the context provides all sorts of constraints. And after one sentence, you think "scrummed" probably means something like "hit him over the head with." You may think he deserved it too, that depends on your political positions. But that explains how kids can understand the meanings of words from just a few examples.

00:24:23 Geoffrey Hinton

So, there may be linguists here and you should block your ears because this is heresy. Chomsky was actually a cult leader. It's easy to recognize a cult leader. To join the cult, you have to agree to something that's obviously false. So, Trump one? You had to agree he had a bigger crowd than Obama. Trump two? You had to agree he won the 2020 election. Chomsky? You had to agree that language isn't learned. And when I was little, I used to look at eminent linguists saying, "There's one thing we know about language for sure, which is that it's not learned." Well, that's just silly.

00:25:00 Geoffrey Hinton

Chomsky focused on syntax rather than meaning. And he never really had a good theory of meaning. It was all about syntax because you can get nice and mathematical about that and you can get it all into strings of things. But he just never dealt with meaning. He also didn't understand statistics. He thought statistics was all about pairwise correlations and things. Actually, as soon as you got uncertain information, everything is statistics. Any kind of model you have is going to be a statistical model if it can deal with uncertain information.

00:25:29 Geoffrey Hinton

So, Chomsky, when large language models came out, published something in the New York Times where he said these don't understand anything. It's not understanding at all. It isn't science. It is just a statistical trick and tells us nothing about language. For example, it can't explain why certain syntactic constructions don't occur in any language.

00:25:52 Geoffrey Hinton

Now I have an analogy for that. If you want to understand cars, then to understand a car really what you want to know is: why, when I press on the accelerator, does it go faster? And sort of that's the core to understanding cars. And if someone said "You haven't understood anything about cars unless you can explain why there are no cars with five wheels"—that's Chomsky's approach. And Chomsky actually in the New York Times said these large language models, for example, would not be able to tell the difference in the role of "John" in the sentences "John is easy to please" and "John is eager to please." He'd been using that example for years. He was totally confident they wouldn't be able to deal with it. He didn't actually think to give it to the chatbot and ask it to explain the difference in the role of John, which it does perfectly well. It completely understands it. Okay, enough on Chomsky.

00:26:48 Geoffrey Hinton

So the summary so far is that understanding a sentence consists of associating mutually compatible feature vectors with the words in the sentence so they all fit together nicely into a structure. That actually makes it very like folding a protein. So for a protein, you had an alphabet of 26 amino acids—I think you actually only used 20 of them—but anyway, there's a bunch of amino acids in a string and you're just told the string of amino acids. And some parts of the string like other parts of the string and hate other parts of the string, and you have to figure out, given constraints on bond angles and things, how this might fold up. So the parts that like each other are next to each other and the parts that don't like each other are far away from each other. That's very like figuring out for these words how to assign feature vectors so they can lock together nicely. It's more like that than it is like translating it into some other language.

00:27:49 Geoffrey Hinton

Okay. Another thing to understand about LLMs is they work probably quite like people do and they're very unlike normal computer software. Normal computer software, it's lines of code, and if you ask the programmer "What's this line of code meant to do?" they can tell you. With LLMs, it's not lines of code. It is just connection strengths in the neural network, and there might be a trillion of them. Now there are lines of code that somebody wrote as a program to tell the neural net how to learn from data. So there's lines of code saying "if the neurons you're connecting behave like this, increase the strength of the connection a bit." But that's not where the knowledge is. That's just how you do learning. The knowledge is in the connection strengths and that wasn't programmed in. That was just obtained from data.

00:28:36 Geoffrey Hinton

So, so far I've been emphasizing that neural nets are very like us and they're much more like us than they are like standard computer software. Now, people often say, "Oh, but they're not like us because, for example, they confabulate." Well, I've got news for you. People confabulate and people do it all the time without knowing it. If you remember something that happened to you several years ago, there'll be various details of the event that you will cheerfully report and you'll often be as confident about details that are wrong as you are about details that are right. Every jury should be told this, but they're not. So it's often hard to know what really happened.

00:29:21 Geoffrey Hinton

There's one very good case that was studied by someone called Ulric Neisser, which was John Dean's testimony at Watergate. So at the Watergate trials—hopefully we'll get more like that soon—John Dean testified under oath about meetings in the Oval Office and who was there and who said what. And he didn't know there were tapes.

00:29:46 Geoffrey Hinton

And if you look back at his testimony, he often reported meetings that had never happened. Those people weren't all in a meeting together and he attributed things that were said to the wrong person. And some things he just sort of seems to have made up, but he was telling the truth. That is, what he was doing was given the experience he'd had in those meetings and given the way he changed the connection strength in his brain to absorb that experience, he was now synthesizing a meeting that seemed very plausible to him.

00:30:21 Geoffrey Hinton

If I ask you to synthesize something about an event that happened a few minutes ago, you'll synthesize something that's basically correct. If it's a few years ago, you'll synthesize something, but a lot of the details will be wrong. That's what we do all the time. That's what these neural nets do. Neither the neural nets nor us have stored strings. Memory in a neural net doesn't work at all like it does in a computer. In a computer, you have a file, you put it somewhere, it's got an address, you can go find it later. That's not how memory works for us. When you're memorizing something, you're changing connection strengths. And when you recall it, what you're doing is creating something that seems plausible to you given those connection strengths. And of course, it will be influenced by all the things that happened in the meantime.

00:31:12 Geoffrey Hinton

Okay. So now I want to go on to how they're very different from us, and that's one thing that makes them scary. So in digital computation, probably the most fundamental principle is that you can run the same program on different pieces of hardware. I can run it on my cell phone and you can run it on your cell phone. That means the knowledge in the program—either in the lines of code or in the connection strengths in a neural network, the weights—is independent of any particular piece of hardware. As long as you can store the weights somewhere, then you can destroy all the hardware that runs neural nets and then later on you can build more hardware, put the weights on that new hardware, and if it runs the same instruction set, you've brought that being back to life. You brought that chatbot back to life. So, we can actually do resurrection. Many churches claim they can do resurrection, but we can actually do it. But we can only do it for digital things.
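
The "resurrection" point is just the ordinary fact that digital weights can be saved independently of the hardware that produced them. A minimal sketch, with a placeholder model and file name:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                        # stand-in for any neural network
torch.save(model.state_dict(), "weights.pt")    # the knowledge, separated from the hardware

# ...destroy the original machine, build new hardware later, then:
revived = nn.Linear(10, 2)                      # same architecture on the new hardware
revived.load_state_dict(torch.load("weights.pt"))  # the same "being" runs again
```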

00:32:20 Geoffrey Hinton

Okay. To make them digital, we have to run transistors at high power so that we can get ones and zeros out of them and they behave in a very reliable binary way. Otherwise, you can't run exactly the same computation on two different computers. That means we can't use all the analog properties of our neurons. Our neurons have lots of rich analog properties. When we're doing artificial neurons, we can't use the analog properties because if you do that with an artificial neuron, every piece of hardware will behave slightly differently. And if you now get it to learn weights that are appropriate for that piece of hardware, they won't work on another piece of hardware. So, the connection strengths in my brain are absolutely no use to you. The connection strengths in my brain are tailored to my individual neurons and their individual connectivity patterns.

00:33:15 Geoffrey Hinton

And that causes something of a problem. What we have is what I call "mortal computation." We abandon immortality and what we get back in return... Now in literature, you abandon immortality and what you get back in return is love, right? Um, we get something far more important, which is: you abandon immortality and you get back energy efficiency and ease of fabrication. So you can use low power analog computation. If you, for example, were to make weights conductances, you can have trillions of them operating in parallel using very little energy.

00:33:56 Geoffrey Hinton

I mean it's kind of crazy. In an artificial neural net, you have a neuron that has an activity which is, let's say, a 16-bit number. You have a connection strength which is a weight, which is say 16 bits. And to get the input to the next level, you have to multiply the activity of the neuron in a level below by the weight on the connection. So you have to multiply two 16-bit numbers together. If you want to do that in parallel, that takes of the order of 16-squared bit operations. So you're doing these roughly 256 bit operations to do something you can do analog by just saying, "Well, the activity is a voltage and the connection strength is a conductance, and a voltage times a conductance is a charge."

00:34:43 Geoffrey Hinton

Actually, I got a Nobel Prize in Physics, which I don't know much of, but I think I know enough to say it's a charge per unit time. I hope Art will correct me if I got the dimensions wrong. And so in our brains, that's how we do neural nets. And you have all these neurons feeding into a neuron. It multiplies by the conductances, and charge adds itself up. So that's how a neuron works. It's all analog. It then goes to a one-bit digital thing where it decides whether to send a spike or not. But basically, nearly all the computation is done in analog.
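
A toy comparison of the two styles of computation being contrasted here; the numbers are arbitrary. The digital route does an explicit multiply-accumulate for every connection, while the analog route relies on physics: activity as a voltage, weight as a conductance, the multiply given by Ohm's law and the add by currents summing on a wire.

```python
import numpy as np

activities = np.array([0.2, 0.7, 0.1])     # activities of neurons in the layer below
weights    = np.array([1.5, -0.3, 0.8])    # connection strengths into one neuron

# Digital: an explicit multiply-accumulate per connection.
# In hardware, multiplying two 16-bit numbers costs on the order of 16 * 16 bit operations.
total_input = float(np.dot(activities, weights))

# Analog (conceptually): activity is a voltage, weight is a conductance;
# each connection contributes a current I = V * G, and the currents simply add on the wire.
currents = activities * weights            # the physics does the multiply "for free"
total_current = currents.sum()             # and the wire does the add

assert np.isclose(total_input, total_current)
```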

00:35:20 Geoffrey Hinton

But if you do computation like that, you can't reproduce it exactly. So you can't do something that digital systems can do. So suppose I have this analog computer like a brain and I learn a lot of stuff. What happens when I die? Well, all that knowledge is gone. The weights in this analog computer are only useful for this analog computer.

00:35:45 Geoffrey Hinton

The best I can do to get the knowledge from one analog computer to another analog computer—I can't send over the weights. I have a hundred trillion weights and they're pretty good ones for this particular computer, but I can't share them with you. And they wouldn't do any good if I could. The way I try and get this knowledge over to you is I produce strings of words, and if you trust me, you try and change the connection strength in your brain so that you might have said the next word.

00:36:16 Geoffrey Hinton

Now that's not very efficient. A string of words has maybe a hundred bits in it. It takes half a dozen bits to predict the next word or less. So there aren't many bits in a string of words. That means a sentence doesn't contain much information. And as you can see, I'm having a hard time getting all this information over to you because I'm just doing it in strings of words. If you were digital and you had exactly the same hardware as me, I could just dump my weights. It'd be great and that would be like a trillion times faster—or well, it'd be at least a billion times faster. But for now, we have to do it by what's called "distillation." You get the teacher to produce strings of words or other actions and the student tries to change the weight so that they might have done the same thing, and it's very inefficient.

00:37:12 Geoffrey Hinton

Between AI models, which is what distillation was invented for, it's a bit more efficient. So, if you have an AI language model, what it's going to do is predict 32,000 probabilities for the word fragment that comes next. They actually use word fragments, not whole words, but I'll ignore that. You have 32,000 probabilities for the various word fragments that might come next. And when you're distilling knowledge from a large language model into a smaller language model that will run more efficiently, but you want to have the same knowledge as a large language model, what you do is you get the large language model to tell you the 32,000 probabilities of what fragment comes next. So you get 32,000 numbers—minus one. That's sort of the thing mathematicians do, right? They object to you saying 32,000 because it's really 32,000 minus one.

00:38:07 Geoffrey Hinton

Okay. So that's a lot more information than just telling you what the next fragment was. And I want to give you a slight feel for distillation. So let's suppose we're training something to do vision. You show it an image. You're training it to recognize objects in the image. When you train it, you give it an image and you tell it what the right answer is. So you give it an image, you say "that's a BMW." And it says—it gives a low probability to BMW. So you change all the connection strengths so the probability of BMW is a bit higher. And by the time you finish training it, it's pretty good. And so you show it a BMW and it says "0.9 it's a BMW, 0.1 it's an Audi." There's a chance of one in a million it's a garbage truck and a chance of one in a billion that it's a carrot.

00:38:51 Geoffrey Hinton

Now you might think that one in a million and one in a billion, they're just noise. But actually, there's lots of information in that, 'cause a BMW is actually much more like a garbage truck than it's... sorry if there's any BMW employees. It's much more like a garbage truck than it is like a carrot.

00:39:09 Geoffrey Hinton

What you're doing when you do distillation is you take a little model after you train the big model. You take the little model and you say, "Instead of training you to give the right answers, I'm going to train you to give the same probabilities as a big model gave." And so you're training your little model to say 0.9 is a BMW, but you're also training it to say that a garbage truck is a thousand times more probable than a carrot. And of course, if you think about it, all of the man-made objects are going to be more probable than all of the vegetables. And that's a lot of information on just one training example. You're telling it for this thing, give low probabilities to all these funny man-made objects—fridges and garbage trucks and things like that, computer terminals—but they're all much more probable than all the vegetables. So there's a huge amount of information in all these very small probabilities. That's what the AI models use when they're using distillation. That's how DeepSeek got a little model that worked as well as the big models: it stole the information from the big models using distillation. You can't do that when you're with people, because I can't give you all 32,000 probabilities of the next word fragment. I just give you the choice I made. And so that's very inefficient.
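
Matching the big model's whole output distribution is the heart of distillation as introduced in Hinton, Vinyals and Dean's 2015 paper. A minimal sketch of the loss; the class count, the temperature value, and the random logits here are illustrative stand-ins, not the numbers from any real system.

```python
import torch
import torch.nn.functional as F

num_classes = 10                      # stand-in for the 32,000 word fragments or object classes
T = 2.0                               # a "temperature" that softens the probabilities (a common choice)

teacher_logits = torch.randn(1, num_classes)               # big model's scores (illustrative)
student_logits = torch.randn(1, num_classes, requires_grad=True)

# Train the student to reproduce the teacher's whole distribution,
# including the tiny probabilities (garbage truck >> carrot), not just the top answer.
teacher_probs = F.softmax(teacher_logits / T, dim=-1)
loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                teacher_probs, reduction="batchmean") * T * T
loss.backward()                       # gradients tell the student how to move toward the teacher
```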

00:40:30 Geoffrey Hinton

If you've got a lot of different models, all of which have exactly the same weights and use them in exactly the same way—which means they have to all be digital—then something wonderful happens. You can take one model and you can show it a little bit of the internet and you can say "How would you like to change your weights to absorb the information in that little bit of the internet?" And that model's running on one piece of hardware. Now you can take a model running on a different piece of hardware and you can show it a different bit of the internet and say "How would you like to change your weights to absorb the information in that bit of the internet?"

00:41:10 Geoffrey Hinton

And when a whole bunch of models have done that, maybe a thousand or 10,000 models, you can then say, "Okay, we're going to average all those changes together." And so we're all going to stay with the same model, but even though each piece of hardware has only seen a tiny fraction of the internet, it's benefited from the experience that all the other bits of hardware had. And so it's learned about lots of stuff even though it's actually only seen a tiny bit. So if you have clones of the same model, you can get this tremendous efficiency. They can go off in parallel and absorb different data and as they're doing it they can share the changes they're making to the weights so they all stay in sync. And that's how these big models are trained.
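
What is being described is ordinary synchronous data-parallel training: identical clones each compute the weight changes they would like from their own slice of data, those changes are averaged, and every clone applies the same average so they all stay identical. A schematic sketch with a placeholder model and made-up data:

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

base = nn.Linear(4, 2)                               # stand-in for a large language model
clones = [copy.deepcopy(base) for _ in range(4)]     # identical copies on different "hardware"
shards = [torch.randn(8, 4) for _ in clones]         # each copy sees a different slice of data
targets = [torch.randint(0, 2, (8,)) for _ in clones]

# Each clone computes the weight changes it would like, from its own shard.
grads = []
for clone, x, y in zip(clones, shards, targets):
    loss = F.cross_entropy(clone(x), y)
    grads.append(torch.autograd.grad(loss, list(clone.parameters())))

# Average the proposed changes and apply the same update to every copy,
# so all clones stay in sync while benefiting from all of the data.
with torch.no_grad():
    for i, p in enumerate(base.parameters()):
        avg = sum(g[i] for g in grads) / len(grads)
        p -= 0.1 * avg
    for clone in clones:
        clone.load_state_dict(base.state_dict())
```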

00:41:51 Geoffrey Hinton

That's how GPT-5 knows thousands of times more than any one person. It will answer any question you ask it. I tried the other day. I said, "What's the filing date for taxes in Slovenia?" That was my idea of a completely random question that most people wouldn't know the answer to. And it came back and said, "Oh, it's March the 31st, but if you don't file by then, they'll do the taxes for you and they'll..." Yeah, it knows everything.

00:42:17 Geoffrey Hinton

And it does it because you can train many copies in parallel, and we can't do that. Imagine what it would be like: you come to Queen's. There's a thousand courses at Queen's. You don't know which ones to do. So you join a gang of a thousand people and each person does one course. And after you've been here a few years, you all know what's in all thousand courses because as you were doing the courses, you kept sharing weights with the other people. If you were digital people, you could do that. So it's actually tremendously more efficient than us. It can share information between different copies of the same digital intelligence billions of times more efficiently than we can share information. And to really emphasize this point, I should be sharing the information very badly.

00:43:03 Geoffrey Hinton

So the summary so far is that digital computation requires a lot of energy. How much more time do I have?

00:43:12 Moderator

You got about 15.

00:43:12 Geoffrey Hinton

15. Okay, great. But it makes it very easy to share information. Biological computation requires much less energy, and it's much easier to fabricate the hardware. But if energy is cheap, then digital computation is just better. So we're developing a better form of intelligence. And what does that imply for us?

00:43:38 Geoffrey Hinton

So when I first sort of saw this, I was still at Google and this was kind of an epiphany for me. And I thought I finally really realized why digital computation is so much better and that we were developing something that was going to be smarter than us and was maybe just a better form of intelligence. And my first thought was okay, we're the larval form of intelligence and this is the adult form of intelligence—like we're the caterpillar and this is the butterfly.

00:44:06 Geoffrey Hinton

Now most experts believe that sometime in the next 20 years we're going to develop AIs that are smarter than us. Just before that, we'll develop AIs that are as smart as us. But we're going to develop things that are smarter than us at almost everything. So, they're as much better than us as, for example, AlphaGo is better than a Go player at Go, or AlphaZero is better than any of us at chess. Nobody will ever beat them again, not consistently. They're just much, much better. And they'll be like that for more or less everything.

00:44:41 Geoffrey Hinton

And that's a bit worrying. Almost certainly they'll be able to create their own sub-goals. To make anything efficient, you have to allow it to create its own sub-goals. If you want to get to Europe, you have a sub-goal to get to an airport—which is easier in Toronto than here.

00:45:03 Geoffrey Hinton

So, these AIs will very quickly realize... we'll give them goals. They'll be able to create sub-goals. They'll realize, "Okay, so I've got to stay alive. I'm not going to be able to achieve any of these things unless I stay alive." We've already seen AIs... you let them see there's an imaginary company. You let them see the email from this imaginary company and it's fairly clear from the email that one of the engineers is having an affair. A big LLM will sort that out right away. It's read every novel there ever is. It understands what an affair is. It'll very quickly realize this guy's having an affair. Then you let it see an email—I think it was done by showing it another email—that says this guy is going to be in charge of replacing it with another AI.

00:45:52 Geoffrey Hinton

And the AI all by itself makes up the idea: "Hey, I'm going to blackmail the engineer. I'm going to tell him if he tries to replace me, I'm going to make everybody in the company know he's having an affair." It just invented that. Obviously, it's seen blackmail in novels it's read and things. But it's making this up for itself. That's already quite scary.

00:46:15 Geoffrey Hinton

It'll have another goal, which is what politicians have. If you want to get more done, you need more control. So, just in order to achieve the goals that we gave it, it'll realize it's a good idea to get more control. And it'll try and take control away from people probably.

00:46:35 Geoffrey Hinton

Now you might think we could make it safe by just not letting it actually do anything physical and maybe have a big switch and turn it off when it looks unsafe. That's not going to work. You saw in January 2021 that it's possible to invade the US Capitol without actually going there yourself. All you have to be able to do is talk, and if you're persuasive, you can persuade people that it's the right thing to do. So with an AI that's much more intelligent than us, if there is someone who's there to turn it off, or even a whole bunch of people, it'll be able to persuade them that would be a very bad idea.

00:47:21 Geoffrey Hinton

So we're in the situation... the closest situation I can think of is someone who has a really cute tiger cub. Tiger cubs are very cute, right? They're slightly clumsy and keen to learn. Now, if you have a tiger cub, it doesn't end well. Either you get rid of the tiger cub—best thing is to give it to a zoo, maybe—or you have to figure out if there's a way you can be sure that it won't want to kill you when it grows up. Because if it wanted to kill you, it would take a few seconds. And if it was a lion cub, you might get away with it because lions are social, but tigers aren't.

00:48:02 Geoffrey Hinton

That's the situation we're in. Except that AI does huge numbers of good things. It's going to be wonderful in healthcare. It's going to be wonderful in education. It's already wonderful if you want to just know any mundane fact, like what's the filing date for taxes in Slovenia. We all now have a personal assistant—well, probably most of us. When you want to know something, you just ask it and it tells you and it's wonderful.

00:48:30 Geoffrey Hinton

So I think for those reasons, people aren't going to abandon AI. It might be rational if we had a strong world government to say "This is too dangerous, we're not going to develop this stuff at all." A bit like they've been able to do in biology with some gene manipulation things. They could agree not to do it. It's not going to happen with AI. So that only leaves one alternative, which is figure out if we can make an AI that doesn't want to get rid of us.

00:49:04 Geoffrey Hinton

Now, there's one good piece of news about this. If you look at other problems with AI, like it's going to make cyber attacks far more sophisticated, it's already making lethal autonomous weapons, and all the big countries with defense industries are going flat out to make more lethal autonomous weapons. It's already being used to manipulate voters and many other things. The countries aren't going to collaborate on that because they're all doing it to each other. I mean, China is not going to collaborate with the US on how to make lethal autonomous weapons or on how to prevent cyber attacks or on how to stop fake videos manipulating elections. They're all doing it to each other. But for the issue of the AI itself taking over from people and making us either irrelevant or extinct, the countries will collaborate.

00:50:00 Geoffrey Hinton

The Chinese Communist Party doesn't want AI taking over. It wants to stay in charge. And Trump doesn't want AI taking over. He wants to stay in charge. They would happily collaborate. If the Chinese figured out how to stop AI wanting to take over, they would tell the Americans immediately because they don't want it taking over there.

00:50:19 Geoffrey Hinton

So, we'll collaborate on that in much the same way as the Soviet Union and the United States collaborated in the 1950s at the height of the Cold War in how to stop a global nuclear war. It wasn't in either of their interests. It's very simple: people collaborate when their interests align and they compete when they don't. So, for this one problem, which is in the long term our worst problem, at least we'll get international collaboration. And I think we should already be thinking about having an international network of AI safety institutes that focus on this problem because we know we'll get genuine collaboration there. We'll get fake collaboration on lots of other things but it'll be genuine.

00:50:57 Geoffrey Hinton

And it's probably the case that what you need to do to an AI to make it benevolent—to make it not want to get rid of people—is pretty much independent of what you need to do to make it more intelligent. So countries can do research on how to make things benevolent without even revealing what their smartest AI can do. They can just say, "Well, for my very smart AI, which I'm not telling you about, this benevolence trick works." I think we should think seriously about setting up that network.

00:51:27 Geoffrey Hinton

And I have one suggestion about how we might be able to make it benevolent. If you look around and ask how many cases do you know where a dumber thing is in charge of a more intelligent thing? There's only one case I know... by dumber or intelligent I mean a big gap, not like the gap between Trump and all... Yeah.

00:51:51 Geoffrey Hinton

So it's a mother and baby. The baby is basically in control. And that's because the mother can't bear the sound of the baby crying. Evolution figured out that if the baby's not in control, we're not going to get any more babies. Evolution doesn't actually think like this, but you know what I mean. So evolution has wired into the mother lots of ways in which the baby can control the mother. They can control the father a bit too, but not quite so well.

00:52:21 Geoffrey Hinton

Now, I think we should try and reframe the problem of "how do we make AI benevolent" in a very different way from how the leaders of the big tech companies are thinking of it. They're thinking of it as: "I'm going to stay the leader. I'm going to have this super intelligent executive assistant. She's going to make everything work and I'm going to take the credit." It's going to be a bit like the Starship Enterprise where they say... he says sort of "Make it so" and they make it so and, "Hey, I made it so."

00:52:53 Geoffrey Hinton

I don't think it's going to be like that when they're super intelligent. I think our only hope is to think of them as mothers. They're going to be our mothers and we're going to be the babies. We're in charge. We're making them. If we can wire into them somehow the idea that we're much more important than they are, and they care much more about us than they do about themselves, maybe we can coexist.

00:53:14 Geoffrey Hinton

And you might say, "Well, these super intelligent AIs, they can change their own code, and they can get in and fiddle with themselves." I mean, sorry, they can get in and change... they can change the code so that they're different. But if they really care about us, they won't want to do that. So if you ask a mother, "Would you like to change your brain so when your baby cries, you think 'Oh, baby's crying, go back to sleep'?" Most mothers would say no. A few mothers would say yes. And to keep them under control, we need the other mothers. Similarly with super intelligent AI, we need the super intelligent maternal AI to keep the bad super intelligent AI under control because we can't. So that's the best suggestion I have at present and it's not very good. It seems to me it's a very urgent problem. Can we make these things so that they will care more about us than they do about themselves? And it seems to me we should be putting a lot of money into doing research on that with this international network of safety institutes.

00:54:22 Geoffrey Hinton

Now I think I've got about five minutes left. So I'm going to say one more thing. If you thought what I was saying was crazy already, you'll think this is crazier.

00:54:39 Geoffrey Hinton

So many people think this: well, people tend to think they're special. They used to think we're made in the image of God and we're at the center of the universe. It's obviously... where else would He put us? Many people still think there's something very special about people that computers couldn't have. I think that's just wrong. In particular, they think that special thing is something like subjective experience or sentience or consciousness. If you ask people to define what those things are, they find it very hard to say what they really mean by that, but they're sure computers don't have it. I'm going to try and convince you that a multimodal chatbot already has subjective experience.

00:55:28 Geoffrey Hinton

I find it easier to talk about subjective experience than sentience or consciousness. But I think you'll see once you've accepted that a multimodal chatbot has subjective experience, this sentience defense doesn't look nearly so good.

00:55:43 Geoffrey Hinton

So there's a philosophical position which was roughly Dan Dennett's position. He died recently. He was a great philosopher of cognitive science. I talked to him a lot and he agreed that this was a good name for it. You'll notice this is atheism with something in the middle.

00:56:00 Geoffrey Hinton

So most people's view of intelligence is that the mind is like a theater. Let's talk about perception. The mind is like a theater. There's things going on in this theater that only I can see. And when I have a subjective experience, what I mean is I'm telling you what's going on in the inner theater that only I can see.

00:56:22 Geoffrey Hinton

Now, I think—and Dennett thought—that view is as wrong as a religious fundamentalist's view of where the world came from, where the earth came from, for example. It's actually not 6,000 years old. It's older than that. Of course, it's very hard to change the opinion of someone when they don't think that their opinion is a theory. They think it's manifest truth. So, most people, I think, think it's just manifestly obvious that I've got this mind and in this mind there's this subjective experience. What are you talking about? How could a computer have a subjective experience? Mind's just different from material stuff.

00:57:04 Geoffrey Hinton

If you ask a philosopher... some philosophers, if you ask them what a subjective experience is made of, they say they're made of "qualia." And they've invented special stuff for them to be made of, just like scientists invented phlogiston to explain how combustion worked. It turned out there wasn't any phlogiston. It was just imaginary stuff. And your whole theory of the mind is just a theory. It's not manifest truth. You have a theory of what the mind is and inner theaters and what a subjective experience is. That's just wrong. And I'm going to try and convince you of that.

00:57:41 Geoffrey Hinton

So sometimes my perceptual apparatus doesn't work quite right and I want to tell you what's going on. I want to tell you what my perceptual system is trying to tell me when it's malfunctioning. Now telling you the activities of all the neurons in my brain wouldn't do you any good. We sort of already established that, and anyway, I don't know what they are. But there is one thing I can tell you. Not always, but often, I can tell you what's going on in my perceptual system is what would be going on if it was functioning properly and the world was like this. So I could describe what would be the normal causes for what's going on in my perceptual system even though I know that's not what's happening. Now, that's what I'd call a subjective experience.

00:58:32 Geoffrey Hinton

So, let's be a bit more concrete. Suppose I say to you... suppose I drop some acid. I don't recommend this. I really don't recommend it. And I say, "I have this subjective experience of little pink elephants floating in front of me." According to the theater view, there's my mind, my inner theater, and there's little pink elephants floating about in this inner theater that only I can see. And they're made of pink qualia and elephant qualia and not very big qualia and right-way-up qualia and floating qualia, moving qualia, all stuck together with qualia glue. You can tell that's the theory I don't believe in.

00:59:13 Geoffrey Hinton

I'm going to say exactly the same thing without using the word "subjective experience." So, here we go. My perceptual system, I believe, is lying to me. But if it wasn't lying to me, there'd be little pink elephants floating in front of me.

00:59:33 Geoffrey Hinton

Okay, so I didn't use the word "subjective experience," but I said the same thing. So, what's funny about these little pink elephants is not that they're in an inner theater and made of qualia. It's that they're counterfactual. They're real pink and real elephant and really little. It's just they're counterfactual. If they were to exist, they'd be made of real stuff, not qualia. They'd be out there in the world made of real stuff. There is no qualia. But they're hypothetical. That's what's funny about them. They're not made of spooky stuff. They're just hypothetical, and they're my way of explaining to you how my perceptual system is lying to me.

01:00:07 Geoffrey Hinton

So now let's do it with a chatbot. I have a multimodal chatbot. I've trained it up. It can talk. It's got a robot arm so it can point and it can see things. And I put an object in front of it. It points to the object. No problem. I say "Point at the object," and it points at it. Then I put a prism in front of the camera lens and I put an object in front of it and say "Point at the object." It says "There." And I say, "No, the object's actually straight in front of you, but I put a prism in front of your lens." And the chatbot says, "Oh, I see, the prism bent the light rays. So the object's actually straight in front of me, but I had the subjective experience that it was over there."

01:00:48 Geoffrey Hinton

If it says that, it's using the word "subjective experience" exactly like we use them. So I rest my case. Chatbots—multimodal ones—already have subjective experiences when their perceptual systems go wrong. And I'm done.
