Demis Hassabis: Agents, AGI & The Next Big Scientific Breakthrough
Y Combinator

Disclaimer: The transcript on this page is for the YouTube video titled "Demis Hassabis: Agents, AGI & The Next Big Scientific Breakthrough" from "Y Combinator". All rights to the original content belong to their respective owners. This transcript is provided for educational, research, and informational purposes only. This website is not affiliated with or endorsed by the original content creators or platforms.
Watch the original video here: https://www.youtube.com/watch?v=JNyuX1zoOgU
Continual learning, long-term reasoning, some aspects of memory, these are still unsolved. I think all of these are going to be required for AGI. Depending on what your AGI timeline is, you know, mine's like 2030 or something like this, then if you start off on a deep tech journey today, you have to just consider AGI appearing in the middle of that journey.
It's not bad, necessarily, but you have to take that into account. You have to have an active system that can actively solve problems for you to get to AGI. So, agents are that path, and I think we're just getting going.
Demis Hassabis has had one of the most unusual careers in tech. He was a chess prodigy as a kid, then designed his first hit video game, Theme Park, at 17. He then went back to school, got a PhD in cognitive neuroscience, published foundational work on how memory and imagination work in the brain, and then in 2010 co-founded DeepMind with one mission: solve intelligence. And I think they've done it.
Since then, his lab has gone on to do things most people thought were decades away. AlphaGo beat a world champion at Go. AlphaFold cracked protein structure prediction, a 50-year grand challenge in biology, and they gave it away for free to every scientist on Earth. That work won him the Nobel Prize in chemistry last year.
Today, Demis leads Google DeepMind, where he's building Gemini and pushing toward the same goal he set when he was a teenager: artificial general intelligence. Please welcome Demis Hassabis.
So, you've been thinking about AGI longer than almost anyone. When you look at the current paradigm, large-scale pre-training, RLHF, chain of thought, how much of the final architecture for AGI do you think we already have, and what's fundamentally missing right now?
Well, first of all, thanks Garry for that great introduction, and it's great to be here. Thanks for welcoming me here. It's an amazing space, actually. I'll have to come back here often. Very inspiring that you all get to work in this space.
So, on the question: the components that you just mentioned, I'm pretty sure, will be part of the final architecture for AGI. They've come such a long way now, and we've proven out so many things about what they can do. I can't see a world in which we'll sort of realize in a couple of years that this was a dead end. That doesn't make sense to me. But there still might be one or two things missing on top of what we already know works.
So, continual learning, long-term reasoning, some aspects of memory, these are still unsolved. And how to get the systems to be more consistent across the board. I think all of these are going to be required for AGI.
Now, it might be that the existing techniques can just scale up to that with some incremental innovation. But it could be that there are still one or two big ideas left that need to be cracked. I don't think it's more than one or two, if they are out there. And, you know, my bet is it's about 50/50 whether that's the case. So, of course, at Google DeepMind we work on both of those things.
I guess, really, working with a bunch of agentic systems, the wildest thing to me is to what degree it's the same weights over and over. So, this idea of continual learning is so interesting, because right now we're sort of cobbling it together with duct tape, you know, these dream cycles at night and things like that.
Yeah, the dream cycles are pretty cool, and we used to think about this in terms of consolidation with episodic memory. That's actually what I studied for my PhD: how the hippocampus works and integrates new knowledge gracefully into the existing knowledge base. The brain does that amazingly well. It does it during sleep, especially things like REM sleep, replaying back episodes that are important so that you can learn from them.
In fact, our very first Atari program, DQN, one of the ways it was able to master Atari games was by doing experience replay. So, we sort of borrowed that from neuroscience and replayed successful trajectories many times, you know, that's way back in 2013 now in the dark ages of AI. It was a really important thing, and I agree with you, we're kind of using duct tape right now.
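The experience-replay mechanism he describes, storing past transitions and re-sampling them later during training, can be sketched in a few lines. This is a minimal illustration with hypothetical names, not DeepMind's actual DQN code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so the agent can re-learn from them later."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences fall off the end

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniformly re-sample past experience, breaking the correlation
        # between consecutive frames that destabilizes purely online learning.
        return random.sample(list(self.buffer), batch_size)

# Toy usage: fill the buffer with dummy transitions, then draw a training batch.
buf = ReplayBuffer(capacity=1000)
for t in range(50):
    buf.add(state=t, action=t % 4, reward=1.0, next_state=t + 1, done=False)
batch = buf.sample(8)
```

In a real DQN loop, each sampled batch would feed a gradient step on the Q-network; replaying important trajectories many times is the "borrowed from neuroscience" part he mentions.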
So, like shove it all in the context window.
This... but this seems a bit unsatisfying, right? And actually, even though we're working on machines, not biological brains, so you could potentially have a context window or memory of millions or tens of millions of tokens, and it can be perfect, there's still a cost to looking it up and finding the right thing that's actually relevant for the specific decision you've got to make right now.
And that cost is non-trivial, even if you can potentially store it all. I think there's actually a lot of room for innovation in areas like memory.
Yeah. I mean, the wild thing is that it feels like a million token context window is actually bigger than... I mean, it's plenty big, honestly. You can do so—
Well, it's plenty big for most things it should be useful for. I mean, the context window is sort of equivalent to working memory. Humans have, what, a few digits, you know, a dozen digits maybe, an average of seven. We've got million- or, you know, 10-million-token context windows, but the problem is that we're trying to store everything in them: things that aren't important, things that are wrong. It's pretty brute force currently, and that doesn't seem right.
And then the problem is if you're now trying to process live video, and you're just going to naively record all the tokens, then actually a million tokens isn't that much. It's only like 20 minutes. So, actually you need more if you want something that's going to understand what's going on in your life over maybe a month or two.
DeepMind has historically leaned into reinforcement learning and search, AlphaGo, AlphaZero, and MuZero. How much of that philosophy is actually embedded in how you're building Gemini today? Is RL still underrated?
Yeah, I think potentially it is. It sort of goes in waves. You know, we've worked on agents since the beginning of DeepMind. In fact, that's what we said we were working on. So, all of the Atari work, and AlphaGo most specifically, those are agent systems, and what we meant by that is systems that are able to accomplish goals on their own and make active decisions and make plans. And of course we were doing it in the domain of games to make it tractable, and then in increasingly complex games, things like StarCraft after AlphaGo, with AlphaStar. So, we basically did all the games that were out there.
And then of course, the question is can you generalize those models to be world models or models of language, not just models of simple games or even complex games? And that's what the last few years has been about. But really, you can think of a lot of the things we're doing today, all the leading models with thinking modes and chain of thought reasoning, as aspects of what was sort of pioneered with AlphaGo coming back now.
And I actually think there's a lot of work we did back then that is relevant today, and we're sort of relooking at some of those old ideas at scale, in a more general way, including things like Monte Carlo tree search and other ways of augmenting the reinforcement learning we're already doing today. And I think a lot of those ideas, from both AlphaGo and AlphaZero, are really, really relevant to where we are with today's foundation models, and a lot of that is what's going to drive the advances of the next few years.
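The Monte Carlo tree search he refers to picks which branch to explore next with an upper-confidence rule that trades off a move's average reward against how rarely it has been tried. A minimal sketch of that selection step, with simplified node statistics and hypothetical names; this is an illustration of the general UCT idea, not AlphaGo's implementation:

```python
import math

def uct_score(value_sum, visits, parent_visits, c=1.4):
    """Upper Confidence bound for Trees: balance exploitation and exploration."""
    if visits == 0:
        return float("inf")  # always try an unvisited move at least once
    exploit = value_sum / visits  # average reward observed so far
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """children: list of (value_sum, visit_count) pairs, one per candidate move."""
    scores = [uct_score(v, n, parent_visits) for v, n in children]
    return scores.index(max(scores))

# An unvisited move is tried first, even next to a strong, well-sampled one.
best = select_child([(9.0, 10), (4.0, 4), (0.0, 0)], parent_visits=14)
```

In full MCTS this selection step is repeated down the tree, a rollout or value-network estimate scores the leaf, and the statistics are backed up; only the selection rule is shown here.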
One question I would have... obviously today you need bigger and bigger models to be smarter and smarter, but we're also seeing distillation working, and the smaller models can be quite a bit faster. I mean, you guys have incredible Flash models, and you're finding that they're 95% as good as the frontier at like 1/10 the price. Is that right?
I think that's one of our core strengths. I mean, you have to build the biggest models to have the frontier capabilities. But one of our biggest strengths has been distilling and packing that power into smaller and smaller models very quickly. Obviously, we invented the kind of distillation process, people like Jeff and Oriol and others, and we're still world experts in that. And we also have a huge need to do it, because we've got to serve probably the biggest AI surfaces there are.
Obviously, there's Search with AI Overviews, then there's the Gemini app, and now increasingly every single product at Google, you know, Maps and YouTube and so on, has some aspect of Gemini or Gemini-related technology in it. And so, that's billions of users, more than a dozen billion-user products, and they have to be served extremely fast, extremely efficiently and cheaply, and with low latency. So, that gives us a really important incentive to make these Flash and even smaller models, Flash-Lite models, extremely efficient. And hopefully that ends up being really useful for many of the workloads that all of you run.
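Distillation, in its classic form, trains the small model to match the large model's softened output distribution rather than just the hard labels. A minimal sketch with toy logits; this illustrates the general technique, not Google's actual pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Softened probabilities: higher temperature spreads mass across classes."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the student's.

    Minimizing this pushes the small model toward the big model's full output
    distribution, including the information carried by the wrong-class
    probabilities, not just the argmax label.
    """
    teacher_p = softmax(teacher_logits, temperature)
    student_p = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_p, student_p))

# A student that mimics the teacher incurs lower loss than one that doesn't.
teacher = [4.0, 1.0, 0.5]       # hypothetical teacher logits for one input
good_student = [3.9, 1.1, 0.4]  # close to the teacher
bad_student = [0.5, 4.0, 1.0]   # disagrees with the teacher
loss_good = distillation_loss(good_student, teacher)
loss_bad = distillation_loss(bad_student, teacher)
```

In practice this term is usually averaged over a large corpus and often mixed with a standard label loss; the gradient signal from the soft targets is what lets a much smaller model recover most of the big model's behavior.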
I'm curious about how much smarter these smaller models can actually be. Like are there limits to the distillation process? Like could a 50B or 400B model be as smart as like a frontier model is today?
Yeah, I don't see any... I don't think we've hit any kind of, or at least none of us know yet whether we've hit any kind of, informational limit. I mean, maybe at some point there will be an information density that we can't get beyond. But for now, the assumption we make is that half a year or a year after one of our leading Pro or frontier models goes out, you'll have that capability in the really tiny, almost edge-sized models.
And you also see some of that goodness in our Gemma models, which hopefully you're all enjoying, and which I think pack really amazing power for their sizes. So, again, that uses a lot of these distillation techniques and the ideas of how to make things really efficient in these very small models. So, I don't really see any limit yet, in terms of some kind of theoretical limit. I think we're still pretty far off of that.
That's amazing. I mean, that is really good. You know, one of the weirder things we're seeing right now is that engineers can do like 500 to 1,000 times the amount of work they were doing 6 months ago. I mean, in this room there are people who are doing about 1,000x the work. Steve Yegge talks about this: it's like 1,000x the work that a Google engineer from the 2000s was doing.
I think it's very exciting. I mean, the small models have many uses. One is obviously cost, but the speed also matters: if you think about coding or other things, you can iterate a lot faster, especially if you're collaborating with the system. I think there's a lot of need for fast systems that maybe are not quite frontier, like you said, 90% or 95%, but that's plenty good enough, and you actually gain back more than the missing 10% in iteration speed.
And then the other big thing is running these things on the edge, again for efficiency reasons, but also for privacy and security reasons. Think about the different devices you might run these systems on that process very personal information, and you can also think about robotics, you know, robots in your house. I think you're going to want very efficient, very powerful local models, which maybe are orchestrated with some bigger frontier models in the cloud, but you only delegate to those in certain circumstances. And perhaps you process all of the audio-visual feed, let's say, locally, and that stays local. I could imagine that would be a very good sort of end state.
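The local-first orchestration pattern he sketches, answer on-device when the small model is confident and delegate only the query (never the raw sensor feed) to the cloud otherwise, might look like this in outline. All names and thresholds here are hypothetical:

```python
# Sketch of edge/cloud orchestration: raw personal data stays on-device,
# and a cloud frontier model is consulted only when the local model is unsure.

def answer(query, local_model, cloud_model, confidence_threshold=0.8):
    reply, confidence = local_model(query)  # runs on-device; data stays local
    if confidence >= confidence_threshold:
        return reply, "edge"
    # Delegate only the query text, never the raw audio/video feed.
    return cloud_model(query), "cloud"

# Toy stand-ins for the two models:
def tiny_edge_model(q):
    return ("It's 7pm.", 0.95) if "time" in q else ("not sure", 0.2)

def frontier_cloud_model(q):
    return "Delegated answer from the big model."

route1 = answer("what time is it", tiny_edge_model, frontier_cloud_model)
route2 = answer("plan my week", tiny_edge_model, frontier_cloud_model)
```

The design choice is that the threshold, not the user, decides when the request leaves the device, which is what makes the privacy story tractable for always-on assistants.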
Going back to context and memory: models are currently stateless, but what would the developer experience even be like for someone using a continual-learning model? You know, any idea how you'd steer it?
I think it's really interesting. I think not having continual learning is currently one of the things holding back agents from doing full tasks. They're really useful for aspects of tasks right now, and you can patch them together and do some really cool things, but they don't adapt well to the context that you're in. And I think that's the missing piece for them being really kind of fire-and-forget, where they'll figure it out themselves. They need to be able to learn about the specific context you're going to put them in. So, I think we have to crack that to get full general intelligence.
Where are we on reasoning? So, models can do really impressive chain of thought now, but they still fail on things a smart undergrad wouldn't. What specifically needs to change and what progress do you expect in reasoning?
There's a lot of innovation left in the thinking paradigms, I would say. Again, I think we're doing fairly simplistic things, fairly brute force. One could imagine, I think there's a lot of scope, for example, in monitoring the chain of thought, maybe interjecting midway through a thought process. I often get the impression with our systems and our competitor systems that they're almost overthinking. They're almost getting into sort of loops of things.
Like, one thing I sometimes like to do is play chess against Gemini. And, you know, all the leading foundation models are pretty poor at games, which is quite interesting. It's very cool to look at the thinking traces, because chess positions are well understood: I can tell quite quickly if it's going off on a tangent, and it's very checkable whether the thinking is useful or not. And what we see is that sometimes it will consider a move, it will realize it's a blunder, but it can't find anything better, so it kind of goes back to that move and plays it anyway.
So, you know, you just shouldn't be seeing that happening in a very precise reasoning system. So, there are still huge gaps, I think, but it may only be one or two tweaks that are required to fix those kinds of gaps, just to be clear. But I think it's pretty obvious they are there. And that's why you get this kind of jagged intelligence.
You know, on the one hand it can solve gold-medal problems in the IMO, which is super hard, but on the other hand, as we've all seen, it can still make basic elementary maths errors if you pose the question in a certain way, right? Or elementary reasoning errors. So, to me there's something about introspection on its own thought process that feels like it's maybe missing.
Agents are really big. Some would say they're hyped. I personally think they're just getting started. It's totally insane.
What does DeepMind's internal research tell you about where agent capabilities actually are right now versus, you know, sort of the hype out there?
I think we all... I agree with you. I think we're just at the beginning. You have to have an active system that can actively solve problems for you to get to AGI. That was always clear to us. So, agents are that path. And I think we're just getting going. I think all of us are getting used to how do we best work, and you're leading the way in a lot of this in your own personal experiments. I'm sure many of you are doing that.
I think the question is how you incorporate it into your workflow in a way that isn't just a nice-to-have, but actually starts to do fundamental things. My impression is that at the moment we're all experimenting, we're experimenting with a lot of things, but we're only in maybe the last couple of months starting to find the really valuable places. And the technology is probably only now getting good enough for that to be the case, right? Where it's not a kind of toy, a nice demonstration, but actually really adding value to your time and efficiency.
I often wonder... I see a lot of people setting off, you know, dozens of agents for like 40 hours, but I'm not sure I've seen the output of that quite justify that level of input going in yet. But I think it will come. So, I still think we're in the experimentation phase. We haven't seen an AAA game that tops the app store charts that was sort of live-coded yet, right?
I've seen, and I've programmed, and I'm sure we've all done, little nice demonstrations, and it's like: amazing, I can do a prototype of Theme Park in half an hour now, which took me 6 months back when I was 17. It's kind of mind-blowing. And I get this feeling that if I spent the whole summer working on it, you could make something really incredible, but it still needs craft and, you know, a sort of human soul in it, and taste. I think that's something you have to make sure you still bring to whatever you're building.
And I think it still shows like it's not quite there yet because why haven't we seen a kid making a hit game that sells 10 million copies, right? That should be possible given the effort that's gone in. So, something's still somehow missing. Maybe it's to do with the process or maybe it's to do with the tools. I'm not quite sure. You all will probably know better than me because I'm sure you're all experimenting on that, but I haven't seen the result yet which I would expect once this is really delivering that full value. Which I think will come in the next 6 to 12 months.
Some of it is like how much of it will be autonomous versus... I mean, I don't think we'd see autonomous first. We would actually probably see people in this room operating at 1,000x and then—
That's what you should see first. And then many of you, you know, there'll be like games companies or, you know, other types of companies that have built some kind of best-selling app, best-selling game using these tools. That's what you should see first. And then more of that will get automated.
I mean, some of it is like there's a human in there and then the human doesn't want to say that the agents did it yet.
Part of it might be, though... you wanted to discuss creativity. What I often say about that is: look at the things we've done, like AlphaGo. Obviously, very famously, you'll know about move 37 in game two. And for me, I was waiting for a moment like that to start the science projects like AlphaFold. We started AlphaFold like the day we got back from Seoul, which is 10 years ago now. So, I'm going to Korea after this to celebrate the 10-year anniversary of AlphaGo.
But it's not enough to come up with move 37. That's pretty cool, very useful, but can it invent Go? That's what I... I want a system that can invent Go if you give it a high-level description: a game you can learn the rules of in 5 minutes, but that takes many lifetimes to master; beautiful aesthetically, but playable in a few hours, in an afternoon. Maybe that would be the high-level description I'd give, and the thing I'd want back in return is Go. Right? And clearly today's systems, I think, can't do that. So the question is why? And I think there's something still missing there.
Well, someone in this room might make it.
Then the answer would be that there's nothing missing; it was just the way we were using the systems. And that might actually be the answer. It might be that today's systems are capable of that with a brilliant enough creative person using them, providing that impetus, the soul of the project, and being au fait enough with the tools to almost be at one with them. I could imagine that happening if you experimented with the tools all day and all night, like probably many of you are doing, and you combined that with proper deep creativity. Something more incredible could be done.
Switching gears to open source, or open weights, I mean. With the recent releases of Gemma, you're making highly capable models that are open and accessible and can actually run locally. What do you think that means? Will AI be something that is in the hands of the users instead of primarily in the cloud, and does that change who gets to build with these models?
We're huge proponents in general of open source and open science. You mentioned AlphaFold at the beginning; we put that all out there for free, and all of our science work, even still today, we publish in, you know, the big journals. We wanted to create world-leading models for their sizes, and that's hopefully what we've done with Gemma, and we're very committed to that path, and hopefully you all experiment and build and enjoy using Gemma. I think it's been like 40 million downloads now in just, you know, two and a half weeks. So we're really excited about that.
And I also think it's important for there to be Western stacks on open source, you know, obviously a lot of the Chinese models are excellent and they're currently world-leading in open source, and we think Gemma is very competitive for its sizes in all those respects.
And for us, I mean, there is a question of resources, talent, and compute. Like nobody has enough spare compute to just make two, you know, frontier models at maximum size, right, with different attributes. So that's pretty difficult. But also for now, what we've decided is that our edge models, the things we want to use for Android and glasses and robotics, it's best that they're open models because they're vulnerable anyway once you put them out on the surfaces. So they might as well be actually fully open. Right?
So we've sort of made a decision to kind of unify that at the kind of, we call it nano-size level. So that actually works for us strategically as well. And you know, we hope as many people as possible build on it, and of course, we'll be building on that, too.
Earlier, before we came on, I got to show you a demo of my version of Samantha from Her, which was harrowing, trying to demo something to you. And it worked, which is amazing. Gemini was built multimodal, and I've spent a lot of time with a bunch of the models, and the depth of the context and the tool use with speech directly to the model... there's nothing like it, bar none. It's actually the best one.
Yeah. I think that's still a slightly underappreciated aspect of the Gemini series, is we started it being multimodal from the start. That made it a little bit more difficult actually to begin with than just focusing on text for example. But I believe we're going to gain from that in the long run. And I think we're seeing that now for things like world model building, so stuff like Genie that we build on top of Gemini.
I think it's going to be really important for things like robotics. This is why we built Gemini Robotics, which many of you have probably played around with. I think robotics models are going to be built on multimodal foundation models, and we think we have a competitive advantage with Gemini being so strong at multimodal. We're using it increasingly in things like Waymo.
But also if you imagine devices and digital assistants that come with you into the real world, you know, maybe on your phone or glasses or some other device. It needs to understand the physical world around you and intuitive physics and the physical context you're in, and that's what our systems are extremely good at, and I think you found that's why you've enjoyed using it in your setup. We're planning to continue on that, and I think we're far and away the strongest models on those types of problems.
So the cost of inference is dropping fast. What becomes possible when inference is essentially free, and how does that change what your team is actually optimizing for?
Yeah, I'm not sure inference will ever be essentially free. I mean there's sort of Jevons paradox and other things about like, I think we'll just end up using... all of us will end up using whatever we can get our hands on, and you could imagine millions of agents, swarms of agents working together on things. So that's one way to use the inference, or you could imagine single agents or smaller groups of agents thinking in multiple directions and then ensembling that. So we're experimenting with all these things. Probably many of you are.
All of that will use up any inference I think that's available. I mean one day maybe it can be almost cost zero, certainly the energy if we solve fusion or you know, superconductors or optimal batteries or some set of those things which I think we will do with material science. Energy costs will be essentially zero, but there'll still be the physical creation of the chips and other things. There'll be some bottleneck at least for the next few decades I think. And so if that's the case, there'll still be rationing on the inference side. You still have to use it, I think, efficiently.
Yeah. Well, luckily the smaller models are getting smaller and smarter, which is fantastic. We got a lot of bio and biotech founders in the audience. I can see a few. AlphaFold 3 took us beyond proteins to a broad spectrum of biomolecules. How close are we to modeling full cellular systems, or is that still a fundamentally harder problem in a class of its own?
Well, Isomorphic Labs, which we spun out from DeepMind after we did AlphaFold 2, is going amazingly well. It's trying to build out more than just AlphaFold, which is just one piece of the drug discovery process, as many of you know; we're trying to do the adjacent biochemistry and chemistry to design the right compounds with the right properties and so on. We'll have some big announcements on that front very soon. I think that's going really well.
Eventually you want a whole virtual cell. So I've talked about this in many of my science talks about a full working simulation of a cell that you can perturb, and then the outputs of that would be close enough to experimental that it's useful. Right? You could skip out a lot of the search steps and generate lots of synthetic data to train other models that then would predict things about real cells. And I think we're about 10 years away probably from something like a virtual cell, like a full virtual cell.
You know, we're starting out... we're working on the DeepMind science side on a virtual nucleus. Cell nucleus first, because it's relatively self-contained. The trick with all of these things is can you pick a slice of the complexity? You know, eventually you want to model a human body, but can you model it down to the right level of detail and what slice can you take out of it that will be self-contained enough that you can kind of model and approximate the inputs and outputs into that self-contained system and then just focus on the self-contained system. So a nucleus is quite interesting from that perspective.
Then the other issue is that there's just not enough data yet. So you need data, and I've talked to various top scientists who work on electron microscopes and other imaging techniques. If we could image a live cell without killing it, that would be game-changing, obviously, because then you could convert it into a vision problem, which we would know how to solve. Right? But at the moment, at least, I'm not aware of any techniques that can give you nanometer resolution in a live, dynamic cell without destroying it, so that you can see all the interactions.
Right? You can take static images at that resolution obviously, really detailed now and that's quite exciting. But it's not enough to turn it just into a complex vision problem. So that's one way it could be solved. So it could be a hardware-driven, data-driven solution, or it could be that we build better learned simulators of these dynamical systems. So that's the more modeling way of solving it.
You've been looking at all kinds of science and not just bio. There's material science, drug discovery, climate modeling, mathematics. If you had to rank which scientific domain will transform the most dramatically the next 5 years, what's in your list?
They're all so exciting, and that, I mean, has been my main passion for me, and always the reason I've worked on AI for my whole career, 30-plus years now: to use AI as the ultimate tool. I always thought AI would be the ultimate tool for science, to advance scientific understanding and scientific discovery, and things like medicine, and just our understanding of the universe around us.
So actually, the original way we used to articulate our mission statement, which is still the way we think about it, had two steps to it. Step one was solve intelligence, i.e., build AGI, and step two was use it to solve everything else. We had to change that a bit over time, because people were like, "Do you really mean solve everything else?" And we did mean that, and I think people are sort of understanding what that means today.
But specifically I was meaning solve other what I call root node problems in science. So areas of science that would unlock whole new branches or avenues of discovery. And AlphaFold is the prototypical example of what we want to do. So over 3 million researchers around the world, pretty much every biology researcher in the world uses AlphaFold now. And I was told by some of my pharma executive friends that almost every drug discovered from now on will have used AlphaFold at some point in its drug discovery process. So that's something we're very proud of and it's the sort of impact that we hope to have with AI.
But, I do think it's just the beginning. I don't really see any area of science or engineering that this won't be able to be helpful with. And the ones you mentioned, I think we're almost at an AlphaFold one moment. So, we've got very promising results, but it's not quite solved the grand challenge yet in that domain. But, I think we're going to have a lot to talk about in the next couple of years on all those areas you mentioned, materials, which I think is very exciting, all the way to mathematics.
In science, I mean, it feels Promethean. It's like here is this capability—
And I think so. I mean, of course, along with that, including the parable of Prometheus, we have to also be careful with how we use that and what we use it for, and also the misuse that can happen with those same tools.
A lot of people in this room are trying to build companies applying AI to science. For them, what's the difference between a startup that actually advances the frontier in your view versus one that's just wrapping an API around a foundation model and calling it AI for science?
Well, look, one of the things I would recommend... I'm trying to think, and I think you mentioned this to me before, what I would do today myself if I were sitting in your place at Y Combinator, looking at things. One thing you have to do is obviously intercept where the AI tech is going. So, that's one hard part of it. But I do think there's huge scope for combining where AI is going with some other deep technology area. I just think that sweet spot, whether it's materials or medicine or other really hard areas of science, is where I'd look.
I think that those kinds of interdisciplinary teams, especially if it involves the world of atoms as well, there's not going to be a shortcut to that at least in the foreseeable future. Those areas are pretty safe from just getting swamped by whatever the next update is to the foundation models. So, I think if you're looking for things like that, that's one of the more defensible areas, I would say.
And I've always loved deep tech, so I'm biased towards deep tech things. I think nothing that's really long-lasting and worthwhile is easy, and so I've always been drawn to deep technologies. Obviously, AI was like that back in 2010 when we started out, right? "We know it doesn't work" is what I was told by investors, and even in academia it was considered a very niche subject, something we tried in the '90s and know doesn't work.
But, if you have belief and conviction in your idea why it's different this time or what special combination from your background that you have... ideally you're an expert in both those areas, both the machine learning and the other area you're applying it to, or you can create a founding team with that expertise, I think there's huge impact to be made there and huge value to be built there.
That's a really important message. I mean, it's easy to forget. Like basically, once you've done it, you've done it. But, before you've done it, people are arrayed against you. I mean, no one believes in it.
Oh, sure. Which is why I think you've also got to work in things that you're genuinely passionate about. Like for me, I would have worked on AI no matter what happened. I just decided from a very young age it was the thing that could be the most consequential thing I could think of. It's turned out that way, but it might not have. Maybe we would have been 50 years too early. And it was also the most interesting thing I could think of working on. And so, I would still be working on AI today even if we were still, you know, in a little garage somewhere and it still wasn't quite working. I would have found some way of continuing to work on it, maybe back in academia or something.
So, I mean, AlphaFold was like an example of a spike that you pursued and it worked. You know, what makes a scientific domain ripe for an AlphaFold style breakthrough? And is there a pattern, a certain objective function, like—
The way I... I should write this up at some point when I have five minutes spare, but the lesson I've learned from all the Alpha projects we've done, specifically AlphaGo and AlphaFold, is that the techniques we have, and the problems I like to look for, are great if the situation can be described as a massive combinatorial search space. The more massive, the better in some ways, so that no brute-force or special-case algorithm will solve it. And that's true of Go moves and of the possible configurations of a protein: both spaces are far larger than the number of atoms in the universe.

And then, you have a clear objective function. So, you can think of it as minimizing the free energy of the protein or winning the game of Go. You need to be able to specify your objective function clearly so you can hill-climb. And then, enough data and/or a simulator that can generate lots of in-distribution synthetic data for you.

If those things are true, then I think with today's methods you can go a long way towards finding the needle in the haystack you're looking for. And drug discovery, by the way, fits the same pattern. There is a compound out there that would treat this disease, and without side effects, if only one could find it. As long as the laws of physics allow it, the only question is how you find it in an efficient, tractable way. I think we showed for the first time, actually, with AlphaGo that these systems could find those kinds of needles in the haystack. In that case, you know, the perfect Go move.
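That recipe, a huge combinatorial search space plus a clear objective function you can hill-climb, can be sketched in a deliberately tiny toy form. Everything below (the bitstring search space and the alternating-bits score) is an invented illustration, not anything from AlphaGo or AlphaFold; it just shows the shape of searching a 2^n space with a cheap objective instead of brute force.

```python
import random

def score(bits):
    # Toy objective function: count adjacent pairs that differ.
    # A clear, cheap-to-evaluate score is the "hill" we climb;
    # the maximum (len(bits) - 1) is reached by an alternating string.
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def hill_climb(n=20, steps=1000, seed=0):
    rng = random.Random(seed)
    # Random start somewhere in a space of 2^n candidates --
    # far too many to enumerate for realistic n.
    best = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(steps):
        cand = best[:]
        cand[rng.randrange(n)] ^= 1   # local move: flip one bit
        if score(cand) >= score(best):
            best = cand               # accept non-decreasing moves
    return best, score(best)

best, s = hill_climb()
print(s)  # best score found; the maximum possible here is 19
```

Real systems replace the random local move with a learned proposal (a policy network, a structure module) and the toy score with the true objective, but the skeleton of "propose, evaluate, keep the better candidate" is the same.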
I guess to get a little meta, I mean, we're talking about humans using these methods to create AlphaFold, but then there's a meta level, which is humans using AI to explore the space of possible hypotheses. How close are we to AI systems that can do genuine scientific reasoning, not just pattern matching on data?
We're close. We're working on general systems like that. We have a system called Co-Scientist, and we have other algorithms, like AlphaFold, that can go a little bit beyond what base Gemini will do. And obviously, all the frontier labs are experimenting in this way. But I've yet to see anything so far, and we all tinker with the same things, you know, some math problems that are a little bit harder than IMO problems. I haven't seen anything yet that is a true, genuine, massive discovery.
That's my personal opinion. I think it's coming. I think it may be related to this earlier thing we discussed about creativity and actually going on beyond the bounds of what's known. So, clearly that's just not pattern matching at that point because there is no pattern to match to. And it's a bit more than extrapolation. It's some kind of analogical reasoning. And I don't think these systems have that, or at least we're not using them in the right way to do that.
So, the way I often put it in science is: can it come up with a hypothesis that's really interesting, not just solve one? And when I say "just", solving something like the Riemann hypothesis would obviously be amazing; that's one of the Millennium Prize problems, and maybe we're a couple of years out from doing that.
I'd like it to solve P = NP; that's my favorite one. But even harder than that would be to come up with a new set of Millennium Prize problems, ones that top mathematicians would regard as deep and meaningful and worthy of a lifetime of study and effort to solve. Right? I think that's another level harder, and I still don't think we know how to do that.
I don't think it's magical, though. I do think these systems will eventually be able to do that; maybe we're missing one or two things. And the way we would test it is what I sometimes call my Einstein test: can you train a system with a knowledge cutoff of 1901, and will it come up with what Einstein did in 1905, his annus mirabilis, including special relativity? We could run that test. Maybe we should just run it and keep checking whether it's possible. Once it is, then I think we're on the verge of these systems being able to invent something truly novel.
So, last question. For the people who are deeply technical in this room who want to work on something, you know, even close to the scale of what you have created... you know, it's one of the largest AI efforts in the world and you've been a pioneer for all these years. So, for that, I think everyone in this room thanks you and the folks at DeepMind very, very deeply from the bottom of our hearts. Thank you. What's the thing that you know now about building at the frontier that you wish you'd known at 25?
I think we covered some of it in terms of actually you work out that going after hard problems and deep problems is no more difficult in some ways than going after a shallower, simpler, more superficial problem. They're just differently difficult. There's different things that are hard about each of those things. But, I think given life's very short and you only have so much time and energy, you might as well put your life force into something that will really make a difference if you hadn't done it, if you hadn't been there to push it. So, I would just think of it through that lens.
And then, the other thing is... and we talked about deep tech, and I love interdisciplinary work, and I think that's going to be even more prevalent in the next few years in combinations of fields and finding the connections between those fields. And it's going to be even easier to do that with AI.
And then, the only other thing I would say is depending on what your AGI timeline is—mine's like 2030 or something like this—then if you start off on a deep tech journey today, usually you're talking about a 10-year journey for true deep tech, in my opinion. So then now you have to just consider AGI appearing in the middle of that journey. So, what does that mean? It's not bad, necessarily, but you have to take that into account. Right? Will it be able to leverage it? What will the AGI system do with it?
And it goes a little bit back to what you said earlier about AlphaFold and general AI systems. One thing I can see happening is Gemini, Claude, or one of these general systems making use of AlphaFold-like specialized systems as tools. I don't think we're going to have it all in one giant brain, because that would cause too much regression. If I put all the proteins into Gemini, that wouldn't make sense; we don't need Gemini to do protein folding. Going back to your point about information efficiency, it would definitely hurt its language skills or something like that, in a bad way. So much better, I think, is to have really good general-purpose tool-using models that may even train those specific tools, but the tools would live in a separate system.
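That architecture, a general model routing hard sub-problems out to specialized tools instead of absorbing everything into one brain, might be sketched as below. All the names here (`Tool`, `route`, the fake `fold` and `math` tools) are invented for illustration; this is not a real Gemini, Claude, or AlphaFold API, and the keyword-prefix routing stands in for what would really be a learned decision by the general model.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]  # each tool is a separate, specialized system

# Hypothetical specialized tools kept outside the general model.
TOOLS: Dict[str, Tool] = {
    "fold": Tool("fold", "predict a protein structure",
                 lambda seq: f"structure({seq[:8]}...)"),
    "math": Tool("math", "evaluate an arithmetic expression",
                 lambda expr: str(eval(expr, {"__builtins__": {}}))),
}

def route(query: str) -> str:
    """Stand-in for the general model deciding whether to delegate."""
    for prefix, tool in TOOLS.items():
        if query.startswith(prefix + ":"):
            return tool.run(query[len(prefix) + 1:])
    return "answered directly by the general model"

print(route("math: 2 + 3"))  # prints "5"
```

The design point is separation of concerns: the general model only needs to learn *when* to call a tool and how to use its answer, while each specialized system can be trained and updated on its own domain without degrading the generalist.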
So, I think that's kind of interesting to think through the implications of that and then what you might build today. Also, physical things, too, like what kinds of factories would you build? What sorts of finance systems, and so on. So, I just think you need to really take that seriously and imagine what that world would look like, and then build something that would be useful if that comes in halfway through.
Demis Hassabis everyone.