Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters (Podcast with Transcript)

Craig Deville

May 27, 2024 • 57 min read

In this podcast episode, Mark Zuckerberg, the founder of Facebook, now Meta, sits down with Dwarkesh Patel. They cover a wide range of topics, including the future of technology, the role of AI in modern society, and the challenges and opportunities that lie ahead. Zuckerberg shares insights into the development of Meta AI, the company's AI assistant, and the Llama 3 models that power it, along with their potential impact on various sectors. Patel brings his own perspective to the discussion as interviewer. This episode is worth a listen for anyone interested in technology, AI, and the future of digital communication.

Mark Zuckerberg:
That's not even a question for me, whether we're going to go take a swing at building the next thing. I'm just incapable of not doing that. There's a bunch of times when we wanted to launch features and then Apple's just like, nope, you're not launching that. I was like, that sucks. Are we set up for that with AI, where you're going to get a handful of companies that run these closed models that are going to be in control of the APIs and therefore going to be able to tell you what you can build?

Mark Zuckerberg:
Then when you start getting into building a data center that's like 300 megawatts or 500 megawatts or a gigawatt, just no one has built a single gigawatt data center yet. From wherever you sit, there's going to be some actor who you don't trust. If they're the ones who have like the super strong AI, I think that that's potentially a much bigger risk.

Dwarkesh Patel:
Mark, welcome to the podcast.

Mark Zuckerberg:
Hey, thanks for having me. Big fan of your podcast.

Dwarkesh Patel:
Oh, thank you. That's very nice of you to say. Okay, so let's start by talking about the releases that will go out when this interview goes out. Tell me about the models. Tell me about Meta AI.

Dwarkesh Patel:
What's new? What's exciting about them?

Mark Zuckerberg:
Yeah, sure. So, you know, I think the main thing that most people in the world are gonna see is the new version of Meta AI. And the most important thing about what we're doing is the upgrade to the model. We're rolling out Llama 3. We're doing it both as open source for the dev community and it is now gonna be powering Meta AI.

Mark Zuckerberg:
So, there's a lot that I'm sure we'll go into around Llama 3, but I think the bottom line on this is that with Llama 3, we now think that Meta AI is the most intelligent AI assistant that people can use that's freely available. We're also integrating Google and Bing for real-time knowledge. We're going to make it a lot more prominent across our apps. So, you know, basically, at the top of WhatsApp and Instagram and Facebook and Messenger, you'll just be able to use the search box right there to ask any question. And there's a bunch of new creation features that we added that I think are pretty cool, that I think people will enjoy.

Mark Zuckerberg:
And I think animations is a good one. You can basically just take any image and animate it. But I think one that people are going to find pretty wild is that it now generates high quality images so quickly. I don't know if you've gotten a chance to play with this, but it actually generates it as you're typing and updates it in real time. So you're like typing your query and it's kind of like honing in, and, you know, it's like, okay, here, you know, show me a picture of a cow, okay, in a field with mountains in the background, it's just like everything is populous.

Mark Zuckerberg:
Eating macadamia nuts, drinking beer and like, and just like it's updating the image in real time. It's pretty wild. I think people are going to enjoy that. So yeah, so that I think is, that's what most people are going to see in the world, right? We're rolling that out, you know, not everywhere, but we're starting in a handful of countries and we'll do more over the coming weeks and months.

Mark Zuckerberg:
So that, I think, is going to be a pretty big deal. And I'm really excited to get that in people's hands. It's a big step forward for Meta AI. But I think, if you wanna get under the hood a bit, the Llama 3 stuff is obviously the most technically interesting. So we're basically, for the first version, we're training 3 versions, an 8 billion and a 70 billion, which we're releasing today, and a 405 billion dense model, which is still training.

Mark Zuckerberg:
So we're not releasing that today. But the 8 and 70, I'm pretty excited about how they turned out. I mean, they're leading for their scale. We'll release a blog post with all the benchmarks so people can check it out themselves. And obviously it's open source so people get a chance to play with it.

Mark Zuckerberg:
We have a roadmap of new releases coming that are gonna bring multimodality, more multilinguality, bigger context windows to those as well. And then, you know, hopefully sometime later in the year, we'll get to roll out the 405, which I think is, you know, it's still training, but for where it is right now in training, it is already at around 85 MMLU. And we expect that it's going to have leading scores on a bunch of the benchmarks. So I'm pretty excited about all of that. I mean, the 70 billion is great, too.

Mark Zuckerberg:
I mean, we're releasing that today. It's around 82 MMLU and has leading scores on math and reasoning. So I think just getting this in people's hands is gonna be pretty wild.

Dwarkesh Patel:
Oh, interesting. Yeah, that's the first time hearing it in a benchmark. That's super impressive.

Mark Zuckerberg:
Yeah, and the 8 billion is nearly as powerful as the biggest version of Llama 2 that we released. So it's like the smallest Llama 3 is basically as powerful as the biggest Llama 2.

Dwarkesh Patel:
Okay. So before we dig into these models, I actually want to go back in time. 2022 is, I'm assuming, when you started acquiring these H100s, or you can tell me when. Your stock price is getting hammered. People are like, what's happening with all this CapEx? People aren't buying the metaverse. And presumably you're spending that CapEx to get these H100s. Back then, how did you know to get the H100s? How did you know you'd need the GPUs?

Mark Zuckerberg:
I think it was because we were working on Reels. So, you know, we got into this situation where, you know, we always want to have enough capacity to build something that we can't quite see on the horizon yet. And we got into this position with Reels where we needed more GPUs to train the models, right? It was this big evolution for our services where instead of just ranking content from people who you follow, or your friends and whatever pages you follow, we made this big push to basically start recommending what we call unconnected content. Basically, content from people or pages that you're not following.

Mark Zuckerberg:
So now kind of the corpus of content candidates that we could potentially show you expanded from, you know, on the order of thousands to on the order of hundreds of millions. So it needs completely different infrastructure. And we started working on doing that, and we were constrained on basically the infrastructure that we had, so we couldn't catch up to what TikTok was doing as quickly as we would have wanted to. So I basically looked at that and I was like, hey, we have to make sure that we're never in this situation again. So let's order enough GPUs to do what we need to do on Reels and ranking content and feed, but let's also double that, right?

Mark Zuckerberg:
Cause again, like our normal principle is there's gonna be something on the horizon that we can't see yet.

Dwarkesh Patel:
Did you know it would be AI?

Mark Zuckerberg:
Well, we thought it was going to be something that had to do with training large models, right? I mean, but at the time, I thought it was probably going to be more something that had to do with content. But I don't know. I mean, it's almost just the pattern matching in running the company: there's always another thing, right? So I'm not even sure I had it at that time. I was so deep in just trying to get the recommendations working for Reels and other content.

Mark Zuckerberg:
Because I mean, that's just such a big unlock for Instagram and Facebook, to now be able to show people content that's interesting to them from people that they're not even following. But yeah, that ended up being a very good decision in retrospect. And it came from being behind. So it wasn't like, oh, I was so far ahead.

Mark Zuckerberg:
Actually, most of the times I think where we kind of make some decision that ends up seeming good is because we messed something up before and just didn't want to repeat the mistake.

Dwarkesh Patel:
This is a total detour, but I actually wanna ask about this while we're on this. We'll get back to AI in a second. So you didn't sell for 1 billion, but presumably there's some amount you would have sold for, right? Did you write down in your head, I think the actual valuation of Facebook at the time is this and they're not actually getting the valuation right? If you were offered $5 trillion, of course you would have sold.

Dwarkesh Patel:
So, like, how did you think about that choice?

Mark Zuckerberg:
Yeah, I don't know. I mean, look, I think some of these things are just personal. I don't know that at the time I was sophisticated enough to do that analysis. But I had all these people around me who were making all these arguments for how like a billion dollars was, it's like, here's the revenue that we need to make and here's how big we need to be. And like, it's clearly so many years in the future. Like, it was very far ahead of where we were at the time. And I don't know, I didn't really have the financial sophistication to really even engage with that kind of debate. I just, I think I sort of deep down believed in what we were doing.

Mark Zuckerberg:
And I did some analysis. I was like, okay, well, what would I go do if I wasn't doing this? It's like, well, I really like building things. And I like helping people communicate. And I like understanding what's going on with people and the dynamics between people.

Mark Zuckerberg:
So I think if I sold this company, I'd just go build another company like this. And I kind of like the one I have. So I mean, you know, why, right? But I don't know. I think a lot of the biggest bets that people make are often just based on conviction and values.

Mark Zuckerberg:
It's actually usually very hard to do the analyses trying to connect the dots forward.

Dwarkesh Patel:
Yeah. So you've had Facebook AI Research for a long time. Now it's become seemingly central to your company. At what point did making AGI, or however you consider that mission, become a key priority of what Meta is doing?

Mark Zuckerberg:
Yeah. I mean, it's been a big deal for a while. So we started FAIR about 10 years ago. And the idea was that along the way to general intelligence or AI, like full AI, whatever you want to call it, there are going to be all these different innovations and that's going to just improve everything that we do.

Mark Zuckerberg:
So we didn't kind of conceive it as a product. It was more kind of a research group. And over the last 10 years, it has created a lot of different things that have basically improved all of our products and advanced the field and allowed other people in the field to create things that have improved our products too. So I think that that's been great. But there's obviously been a big change in the last few years when, you know, ChatGPT comes out, the diffusion models around image creation come out.

Mark Zuckerberg:
And like, I mean, this is some pretty wild stuff, right? That I think is like pretty clearly going to affect how people interact with like every app that's out there. So at that point, we started a second group, the GenAI group, with the goal of basically bringing that stuff into our products. So building leading foundation models that would sort of power all these different products. And initially when we started doing that, the theory at first was, hey, a lot of the stuff that we're doing is pretty social, right?

Mark Zuckerberg:
So, you know, it's helping people interact with creators, helping people interact with businesses so the businesses can sell things or do customer support or basic assistant functionality for, whether it's for our apps or the smart glasses or VR, like all these different things. So initially it wasn't completely clear that you were gonna need kind of full AGI to be able to support those use cases. But then through working on them, I think it's actually become clear that you do, right? In all these subtle ways. So for example, for Llama 2, when we were working on it, we didn't prioritize coding.

Mark Zuckerberg:
And the reason why we didn't prioritize coding is because people aren't gonna ask Meta AI a lot of coding questions in WhatsApp. Now they will. Well, I don't know. I'm not sure that WhatsApp is like the UI where people are going to be asking a lot of coding questions. So we're like, all right, look, in terms of the things that, you know, or Facebook or Instagram, or, you know, those different services, maybe the website, meta.ai, that we're launching, I think.

Mark Zuckerberg:
But the thing that, I think, has been a somewhat surprising result over the last 18 months is that it turns out that coding is important for a lot of domains, not just coding. So even if people aren't asking coding questions to the models, training the models on coding helps them just be more rigorous and answer the question and kind of help reason across a lot of different types of domains. Okay, so that's one example where it's like, all right, so for Llama 3, we like really focused on training it with a lot of coding because it's like, all right, that's going to make it better on all these things, even if people aren't primarily asking coding questions. Reasoning, I think, is another example. It's like, okay, yeah, maybe you want to chat with a creator or, you know, you're a business and you're trying to interact with a customer.

Mark Zuckerberg:
You know, that interaction is not just like, okay, the person sends you a message and you just reply, right? It's like a multi-step interaction where you're trying to think through, how do I accomplish the person's goals? And, you know, a lot of times when a customer comes, they don't necessarily know exactly what they're looking for or how to ask their questions. So it's not really the job of the AI to just respond to the question. It's like, you need to kind of think about it more holistically.

Mark Zuckerberg:
It really becomes a reasoning problem, right? So if someone else solves reasoning or makes good advances on reasoning, and we're sitting here with a basic chatbot, then our product is lame compared to what other people are building. At the end of the day, we basically realized we've got to solve general intelligence. We just upped the ante and the investment to make sure that we could do that.

Dwarkesh Patel:
So the version of Llama that's going to solve all these use cases for users, is that the version that will be powerful enough to replace a programmer you might have in this building?

Mark Zuckerberg:
I just think that all this stuff is going to be progressive over time.

Dwarkesh Patel:
But in the end case, Llama 10.

Mark Zuckerberg:
I mean, I think that there's a lot baked into that question. I'm not sure that we're replacing people as much as giving people tools to do more stuff.

Dwarkesh Patel:
Is the programmer in this building 10x more productive after a lot of time?

Mark Zuckerberg:
I would hope more, but no. I mean, look, I don't believe that there's a single threshold of intelligence for humanity, because people have different skills. At some point, I think that AI is probably going to surpass people at most of those things, depending on how powerful the models are. But I think it's progressive. And I don't think AGI is one thing.

Mark Zuckerberg:
I think you're basically adding different capabilities. So multimodality is a key one that we're focused on now, initially with photos and images and text, but eventually with videos. And then because we're so focused on the metaverse, kind of 3D type stuff is important. One modality that I'm pretty focused on, that I haven't seen as many other people in the industry focus on, is sort of like emotional understanding. Like I mean, so much of the human brain is just dedicated to understanding people and kind of like understanding your expressions and emotions, and that's like its own whole modality, right?

Mark Zuckerberg:
I mean, you could say, okay, maybe it's just video or image, but it's like clearly a very specialized version of those two. So there's all these different capabilities that I think you want to basically train the models to focus on, as well as getting a lot better at reasoning, getting a lot better at memory, which I think is kind of its own whole thing. I mean, I don't think we're going to be primarily shoving context or things into a query context window in the future to ask more complicated questions. I think that there'll be kind of different stores of memory or different custom models that are maybe more personalized to people. But I don't know, I think that these are all just different capabilities.

Mark Zuckerberg:
And then obviously making them big and small. We care about both because, you know, if you're running something like Meta AI, that's pretty server-based. But we also want it running on smart glasses and, you know, there's not a lot of space in smart glasses. So you want to have something that's very efficient for that.

Dwarkesh Patel:
What is the use case that if you're doing tens of billions of dollars worth of inference or even eventually hundreds of billions of dollars worth of inference, using intelligence in an industrial scale, what is the use case? Is it simulations? Is it the AIs that will be in the metaverse? What will we be using the data centers for?

Mark Zuckerberg:
Our bet is that this is basically going to change all of the products. Right. So I think that there's going to be a kind of Meta AI general assistant product. And I think that that will shift from something that feels more like a chatbot, where it's like you just ask a question and it kind of formulates an answer, to things where you're increasingly giving it more complicated tasks and it goes away and does them. So that's going to take a lot of inference, and it's going to take a lot of compute in other ways too.

Mark Zuckerberg:
Then I think that there's a big part of what we're going to do that is like interacting with other agents for other people. So whether it's businesses or creators, I guess a big part of my theory on this is that there's not just going to be like one singular AI that you interact with, because I think every business is going to want an AI that represents their interests. They're not going to want to primarily interact with you through an AI that is going to sell their competitors' products. I think creators are going to be a big one. There are about 200 million creators on our platforms.

Mark Zuckerberg:
They all basically have the pattern where they want to engage their community, but they're limited by hours in the day and their community generally wants to engage them, but they're limited by hours in the day. So if you could create something where an AI could basically, that creator can basically own the AI and train it in the way that they want and can engage their community, I think that that's going to be super powerful too. So I think that there's going to be a ton of engagement across all these things. But these are just the consumer use cases. I mean, I think when you think about stuff like, I mean, you know, I run like our foundation, right?

Mark Zuckerberg:
The Chan Zuckerberg Initiative, with my wife, and, you know, we're doing a bunch of stuff on science. And there's obviously a lot of AI work that I think is gonna advance science and healthcare and all these things too. So I think that this is going to end up affecting basically every area of the products and the economy.

Dwarkesh Patel:
The thing you mentioned about an AI that can just go out and do something for you, that's multi-step. Is that a bigger model? Or is it that with Llama 4, there'll be a version that's still 70B, but you'll just train it on the right data and that will be super powerful? How do you, like, what does the progression look like? Is it scaling?

Dwarkesh Patel:
Is it just same size, but different banks like you were talking about?

Mark Zuckerberg:
I don't know that we know the answer to that. So I think one thing that seems to be a pattern is that you have the Llama model, and then you build some kind of other application-specific code around it. Right. So some of it is the fine-tuning for the use case, but some of it is just like logic for, okay, how Meta AI should integrate, like it should work with tools like Google or Bing to bring in real-time knowledge. I mean, that's not part of the base Llama model.

Mark Zuckerberg:
That's like part of it. Okay. So for Llama 2, we had some of that and it was a little more kind of hand-engineered. And then part of our goal for Llama 3 was to bring more of that into the model itself. But for Llama 3, as we start getting into more of these agent-like behaviors, I think some of that is going to be more hand-engineered.

Mark Zuckerberg:
And then I think our goal for Llama 4 will be to bring more of that into the model. So I think at each point, like at each step along the way, you kind of have a sense of what's going to be possible on the horizon. You start messing with it and hacking around it. And then I think that that helps you hone your intuition for what you want to try to train into the next version of the model itself. Interesting.

Mark Zuckerberg:
Which makes it more general because obviously anything that you're hand coding is, you know, you can unlock some use cases, but it's just inherently brittle and non-general.

Dwarkesh Patel:
Hey everybody, real quick. I want to tell you about a tool that I wish more applications used. So obviously you've noticed every single company is trying to add an AI chatbot to their website. But as a user, I usually find them really annoying because they give these long, generic, often useless answers. Command Bar is a user assistant that you can just embed into your website or application.

Dwarkesh Patel:
And it feels like you're talking to a friendly human support agent who is browsing with you and for you. And it's much more personalized than a regular chatbot. It can actually look up users' history and respond differently based on that. It can use APIs to perform actions. It can even proactively nudge users to explore new features.

Dwarkesh Patel:
One thing that I think is really cool is that instead of just outputting text, Command Bar can kind of just say, here, let me show you, and start browsing alongside the user. Anyways, they're in a bunch of great products already. You can learn more about them at commandbar.com. Thanks to them for sponsoring this episode. And now back to Mark.

Dwarkesh Patel:
When you say into the model itself, you train it on the thing that you want in the model itself. What do you mean by into the model itself?

Mark Zuckerberg:
Well, I mean, I think like the example that I gave for Llama 2, where, you know, for Llama 2, the tool use was very specific. Whereas Llama 3 has much better tool use. So we don't have to hand code all the stuff to have it use Google to go do a search. It just kind of can do that. Similarly for coding and running code and just a bunch of stuff like that.

Mark Zuckerberg:
But I think once you get that capability, then you get a peek of, okay, well, what can we start doing next? Okay, well, I don't necessarily want to wait until Llama 4 is around to start building those capabilities, so let's start hacking around it. And so you do a bunch of hand coding and that makes the products better for the interim, but then that also helps show the way of what we want to try to build into the next version of the model.

Dwarkesh Patel:
What is the community fine-tune of Llama 3 you're most excited by? Maybe not the one that will be most useful to you, but the one you'll just enjoy playing with the most. They like fine-tune it on antiquity and you'll just be talking to Virgil or something. What are you excited about?

Mark Zuckerberg:
I don't know. I think the nature of the stuff is it's like you get surprised, right? So I think any specific thing that I sort of thought would be valuable, we'd probably be building. But I think you'll get distilled versions. I think you'll get kind of smaller versions.

Mark Zuckerberg:
I mean, one thing is that 8 billion, I don't think, is quite small enough for a bunch of use cases. Right? I think like over time, I'd love to get a billion parameter model or a 2 billion parameter model or even like a, I don't know, maybe like a 500 million parameter model and see what you can do with that. Because I mean, if with 8 billion parameters we're basically nearly as powerful as the largest Llama 2 model, then with a billion parameters, you should be able to do something that's interesting and faster, good for classification or a lot of basic things that people do, like understanding the intent of a user query before feeding it to the most powerful model, to hone what the prompt should be. So I don't know.

Mark Zuckerberg:
I think that's one thing that maybe the community can help fill in. But I mean, we're also thinking about getting around to distilling some of these ourselves. But right now, the GPUs are training the 405.

Dwarkesh Patel:
So you have all these GPUs. I think you said 350,000 by the end of the year.

Mark Zuckerberg:
That's the whole fleet. I mean, we built two, I think it's like 22,000 or 24,000 GPU clusters that are kind of the single clusters that we have for training the big models. I mean, obviously that's across a lot of the stuff that we do. A lot of our stuff goes towards training, like Reels models and like Facebook News Feed and Instagram feed. And then inference is a huge thing for us because we serve a ton of people, right?

Mark Zuckerberg:
So our ratio of inference compute required to training is probably much higher than most other companies that are doing this stuff just because of the sheer volume of the community that we're serving.

Dwarkesh Patel:
Yeah. Yeah. Yeah, that was really interesting in the material they shared with me before, that you trained it on more data than is compute optimal just for training, because the inference is such a big deal for you guys and also for the community, that it makes sense to just have this thing and have trillions of tokens in there.

Mark Zuckerberg:
Yeah. Yeah. Although, and one of the interesting things about it, that we saw even with the 70 billion, is we thought it would get more saturated at, you know, it's like we trained it on around 15 trillion tokens. Yeah. I guess our prediction going in was that it was going to asymptote more, but even by the end it was still learning, right?

Mark Zuckerberg:
It's like we probably could have fed it more tokens and it would have gotten somewhat better. But I mean, at some point, you're running a company, you need to do these meta reasoning questions of like, all right, do I wanna spend our GPUs on like training this 70 billion model further? Or do we wanna kind of get on with it so we can start testing hypotheses for Llama 4? So we needed to make that call. And I think we got it.

Mark Zuckerberg:
I think we got to a reasonable balance for this version of the 70 billion. There will be others in the future, like the 70 billion multimodal one, that'll come over the next period. But yeah, I mean, that was fascinating, that the architectures at this point can just take so much data.
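As a rough illustration of what "more data than is compute optimal" means here, the figures quoted in this exchange can be turned into a tokens-per-parameter ratio. This is only a back-of-the-envelope sketch: the reference ratio of about 20 tokens per parameter is an assumption taken from the commonly cited "Chinchilla" rule of thumb and is not a number from the interview, and the function names are hypothetical.

```python
# Back-of-the-envelope: training tokens seen per model parameter.
# ASSUMPTION: ~20 tokens/parameter is the commonly cited "Chinchilla"
# rule of thumb for compute-optimal training; it is not from the interview.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20

def tokens_per_param(training_tokens: float, params: float) -> float:
    """Return how many training tokens were used per model parameter."""
    return training_tokens / params

def overtraining_factor(training_tokens: float, params: float) -> float:
    """Return the ratio relative to the assumed rule-of-thumb value."""
    return tokens_per_param(training_tokens, params) / RULE_OF_THUMB_TOKENS_PER_PARAM

TRAINING_TOKENS = 15e12  # "around 15 trillion tokens", as quoted above
PARAMS_70B = 70e9        # the 70 billion parameter model

ratio = tokens_per_param(TRAINING_TOKENS, PARAMS_70B)
factor = overtraining_factor(TRAINING_TOKENS, PARAMS_70B)
print(f"70B model: ~{ratio:.0f} tokens per parameter")
print(f"That is ~{factor:.1f}x the assumed rule-of-thumb ratio")
```

Under these assumptions the 70 billion model sees roughly ten times more tokens per parameter than the rule of thumb suggests, which fits the point that inference cost, rather than training cost alone, shaped the decision.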

Dwarkesh Patel:
Yeah, that's really interesting. So what does this imply about future models? You mentioned that the Llama 3 8B is better than the Llama 2 70B?

Mark Zuckerberg:
No, it's nearly as good.

Dwarkesh Patel:
Okay, I don't want to overstep. But does that mean like the Llama 4...

Mark Zuckerberg:
In the same order of magnitude.

Dwarkesh Patel:
Does that mean like the Llama 4 70B will be as good as the Llama 3 405B? Like, what does the future of this look like?

Mark Zuckerberg:
This is one of the great questions, right? I think no one knows. It's one of the trickiest things in the world to plan around: when you have an exponential curve, how long does it keep going for? I think it's likely enough that it will keep going that it is worth investing the tens of billions or 100 billion plus in building the infrastructure, and to assume that if that kind of keeps going, you're gonna get some really amazing things that are just gonna make amazing products. But I don't think anyone in the industry can really tell you that it will continue scaling at that rate for sure, right? In general, in history, you hit bottlenecks at certain points, and now there's so much energy on this that maybe those bottlenecks get knocked over pretty quickly.

Mark Zuckerberg:
But I don't know. I think that's an interesting question.

Dwarkesh Patel:
What does the world look like where there aren't these bottlenecks? Suppose progress just continues at this pace, which seems plausible. Like zooming out and forgetting about Llamas.

Mark Zuckerberg:
There are going to be different bottlenecks.

Dwarkesh Patel:
Right. So if not training, then like, oh yeah, go ahead.

Mark Zuckerberg:
Well, I think at some point, over the last few years, I think there was this issue of GPU production. Yeah. Right. So even companies that had the models, sorry, that had the money to pay for the GPUs couldn't necessarily get as many as they wanted because there were all these supply constraints.

Mark Zuckerberg:
Now I think that's sort of getting less. So now I think you're seeing a bunch of companies think about, wow, we should just like really invest a lot of money in building out these things. And I think that will go for some period of time. I think there is a capital question of like, okay, at what point does it stop being worth it to put the capital in? But I actually think before we hit that, you're gonna run into energy constraints, right?

Mark Zuckerberg:
Because I just, I mean, I don't think anyone's built a gigawatt single training cluster yet, right? And then you run into these things that just end up being slower in the world. Like getting energy permitted is like a very heavily regulated government function. Right. So you're going from on the one hand software, which is somewhat regulated.

Mark Zuckerberg:
I'd argue that it is more regulated than I think a lot of people in the tech community feel, although it's obviously different. If you're starting a small company, maybe you feel that less. If you're a big company, you know, we just interact with different governments and regulators, and we have lots of rules that we need to follow and make sure we do a good job with around the world. But I think that there's no doubt that energy, if you're talking about building large new power plants or large buildouts and then building transmission lines that cross other private or public land, is just a heavily regulated thing.

Mark Zuckerberg:
So you're talking about many years of lead time. So if we wanted to stand up just some massive facility to power that, I think that that's a very long-term project. I think people will do it, but I don't think that this is something that can be quite as magical as just like, okay, you get a level of AI and you get a bunch of capital and you put it in and then, all of a sudden, the models are just going to kind of... I think you do hit different bottlenecks along the way.

Dwarkesh Patel:
Yeah. Is there something, a project, maybe AI-related, maybe not, that even a company like Meta doesn't have the resources for? Like if your R&D budget or your CapEx budget was 10x what it is now, then you could pursue it. Like it's in the back of your mind, but Meta today, you can't even issue stock or bonds for it, it's like just 10x bigger than your budget.

Mark Zuckerberg:
Well, I think energy is one piece. Right? I think we would probably build out bigger clusters than we currently can, if we could get the energy to do it.

Dwarkesh Patel:
So I think that that's fundamentally money bottlenecked in the limit, like if you had a trillion dollars?

Mark Zuckerberg:
I think it's time. Well, if you look at it in terms of, but it depends on how far the exponential curves go. I think a number of companies are working on, you know, right now I think a lot of data centers are on the order of 50 megawatts or 100 megawatts, or like a big one might be 150 megawatts. Okay. So you take a whole data center and you fill it up with just all the stuff that you need to do for training and you build the biggest cluster you can.

Mark Zuckerberg:
I think that's kind of, I think a bunch of companies are running at stuff like that. But then when you start getting into building a data center that's like 300 megawatts or 500 megawatts or a gigawatt, I mean, just no one has built a single gigawatt data center yet. So I think it will happen. I mean, this is only a matter of time, but it's not going to be like next year. I think that some of these things will take, I don't know, some number of years to build out.

Mark Zuckerberg:
And then the question is, okay, well, if you, I mean, just to, I guess, put this in perspective, I think a gigawatt, it's like around the size of like a meaningful nuclear power plant only going towards training a model.

Dwarkesh Patel:
Didn't Amazon do this? They have a 950 megawatt thing.

Mark Zuckerberg:
Yeah, I'm not exactly sure what they did. You'd have to ask them.

Dwarkesh Patel:
But it doesn't have to be in the same place, right? If distributed training works, it can be distributed.

Mark Zuckerberg:
That I think is a big question, basically how that's going to work. And I do think in the future, it seems quite possible that more of what we call training for these big models is actually more along the lines of inference generating synthetic data to then go feed into the model. So I don't know what that ratio is going to be, but I consider the generation of synthetic data to be more inference than training today. But obviously, if you're doing it in order to train a model, it's part of the broader training process. So I don't know.

Mark Zuckerberg:
That's an open question, as to kind of what the balance of that is and how that plays out.

Dwarkesh Patel:
If that's the case, would that potentially also be the case with Llama 3? And maybe like Llama 4 onwards, where you put this out and if somebody has a ton of compute, then using the models that you've put out you can just keep making these things arbitrarily smarter. Like some Kuwait or UAE or some random country has a ton of compute and they can just actually just use Llama 4 to just make something much smarter.

Mark Zuckerberg:
I do think that there are going to be dynamics like that. But I also think that there is a fundamental limitation on the network architecture, or the model architecture. So I think a 70 billion parameter model that we trained with the Llama 3 architecture can get better. It can keep going. Like I was saying, we felt like if we kept on feeding it more data or rotated the high value tokens through again, then it would continue getting better.

Mark Zuckerberg:
And we've seen a bunch of other people around the world, different companies, basically take the Llama 2 70 billion base, take that model architecture and then build a new model. It's still the case that when you make a generational improvement to the Llama 3 70 billion or the Llama 3 405, there's nothing open source, anything like that today. I think that it's a big step function, and what people are going to be able to build on top of that I don't think can go infinitely from there. I think there can be some optimization in that until you get to the next step function.

Dwarkesh Patel:
Yeah. OK, so let's zoom out a little bit from specific models and even the many years lead times you would need to get energy approvals and so on. Big picture, these next couple of decades. What's happening with AI? Does it feel like another technology like metaverse or social, or does it feel like a fundamentally different thing in the course of human history?

Mark Zuckerberg:
I think it's going to be pretty fundamental. I think it's going to be more like the creation of computing in the first place. Right? So you'll get all these new apps in the same way that when you got the web or you got mobile phones, people basically rethought all these experiences and a lot of things that weren't possible before now became possible. That will happen, but I think it's a much lower level innovation.

Mark Zuckerberg:
It's gonna be more like going from people didn't have computers to people have computers is my sense. But it's also, it's, I don't know, it's very hard to reason about exactly how this goes. I tend to think that, you know, in like the cosmic scale, obviously it'll happen quickly over a couple of decades or something. But I do think that there is some set of people who are afraid of like, you know, it really just kind of spins and goes from being like somewhat intelligent to extremely intelligent overnight. I just think that there's all these physical constraints that make it so that that's unlikely to happen. I just don't really see that playing out.

Mark Zuckerberg:
I think we'll have time to acclimate a bit, but it will really change the way that we work and give people all these creative tools to do different things that they... Yeah, I think it's going to really enable people to do the things that they want a lot more, is my view.

Dwarkesh Patel:
Okay, so maybe not overnight, but is it your view that on a cosmic scale, if you think humans evolved and then AI happened and then they went out through the galaxy? Or maybe it takes many decades, maybe it takes a century, but is that the grand scheme of what's happening right now in history?

Mark Zuckerberg:
Sorry, in what sense?

Dwarkesh Patel:
I mean, in the sense that there are other technologies like computers and even like fire, but like the AI happening is as significant as like humans evolving in the first place?

Mark Zuckerberg:
I think that's tricky. I think people like to, I mean, the history of humanity, I think, has been people basically thinking that certain aspects of humanity are like really unique in different ways. And then coming to grips with the fact that that's not true, but humanity is actually still super special, right? So it's like we thought that the Earth was the center of the universe and it's like, it's not, but like humans are still pretty awesome, right? And pretty unique.

Mark Zuckerberg:
I think that another bias that people tend to have is thinking that intelligence is somehow kind of fundamentally connected to life. And it's not actually clear that it is, right? I mean, I don't know that we have a clear enough definition of consciousness or life to kind of fully interrogate this, but I know there's all the science fiction about, okay, you create intelligence and now it like starts taking on all these human-like behaviors and things like that. But I actually think that the current incarnation of all this stuff at least kind of feels like it's going in a direction where intelligence can be pretty separated from consciousness and agency and things like that, that I think just makes it a super valuable tool. So, I don't know.

Mark Zuckerberg:
I mean, obviously, it's very difficult to predict what direction the stuff goes in over time, which is why I don't think anyone should be dogmatic about how they plan to develop it or what they plan to do. I think you want to kind of look at each release. You know, it's like we're obviously very pro open source. Yeah. But I haven't committed that we're going to release every single thing that we do.

Mark Zuckerberg:
But it's basically, I'm just generally very inclined to thinking that open sourcing it is going to be good for the community and also good for us, right? Because we'll benefit from the innovations. But if at some point there's some qualitative change in what the thing is capable of, and we feel like it's just not responsible to open source it, then we won't. But, so I don't know. It's all very difficult to predict.

Dwarkesh Patel:
Yeah. What is a kind of qualitative change, like a specific thing where you're training Llama 5 or Llama 4, and you've seen this and you're like, you know what, I'm not sure about open sourcing it?

Mark Zuckerberg:
I think that it's a little hard to answer that in the abstract because there are negative behaviors that any product can exhibit that, as long as you can mitigate them, it's okay. Right? So, I mean, there's bad things about social media that we work to mitigate, right? There's bad things about Llama 2 that we spend a lot of time trying to make sure that it's not like, you know, helping people commit violent acts or things like that. Right.

Mark Zuckerberg:
I mean, that doesn't mean that it's like a kind of autonomous or intelligent agent. It just means that it's learned a lot about the world and it can answer a set of questions that we think it would be unhelpful for it to answer. So I don't know. I think the question isn't really what behaviors would it show. It's what things would we not be able to mitigate after it shows that.

Mark Zuckerberg:
And I don't know. I think that there's so many ways in which something can be good or bad that it's hard to actually enumerate them all up front. If you even look at what we've had to deal with in social media and the different types of harms, we've basically gotten to, there's like 18 or 19 categories of harmful things that people do. And we've basically built AI systems to try to go identify what those things are that people are doing and try to make sure that that doesn't happen on our network as much as possible. So yeah, I think over time you'll be able to break down this into more of a taxonomy too.

Mark Zuckerberg:
And I think this is a thing that we spend time researching too because we want to make sure that we understand that.

Dwarkesh Patel:
So one of the things I asked Mark is what industrial scale use of LLMs would look like. You see this in previous technological revolutions where at first they're thinking in a very small scale way about what's enabled. And I think that's what chatbots might be for LLMs. And I think the large scale use case might look something like what V7 Go is. And by the way, it's made by V7 Labs who's sponsoring this episode.

Dwarkesh Patel:
So it's like a spreadsheet. You put in raw information like documents, images, whatever, and they become rows and the columns are populated by an LLM of your choice. And in fact, I used it to prepare for Mark. So I fed in a bunch of blog posts and papers from Meta's AI research. And as you can see, if you're on YouTube, it summarizes and extracts exactly the information I want as columns.

Dwarkesh Patel:
And obviously mine is a small use case, but you can imagine, for example, a company like FedEx has to process half a million documents a day. Obviously a chatbot can't do that. A spreadsheet can because this is just like a fire hose of intelligence in there, right? Anyways, you can learn more about them at v7labs.com slash go or the link in the description. Back to Mark.

Dwarkesh Patel:
Yeah, like it seems to me it would be a good idea. I would be disappointed in a future where AI systems aren't broadly deployed and everybody doesn't have access to them. At the same time, I want to better understand the mitigations, because if the mitigation is the fine-tuning, well, the whole thing about open weights is that you can then remove the fine-tuning, which is often superficial on top of these capabilities. If it's like talking on Slack with a biology researcher, and again, I think models are very far from this. Right now they're like Google search, but I can show them my petri dish and they can explain like, here's why your smallpox sample didn't grow. Here's what to change.

Dwarkesh Patel:
How do you mitigate that? Because somebody can just like fine tune that in there, right?

Mark Zuckerberg:
Yeah, I mean, that's true. I think a lot of people will basically use the off-the-shelf model. And some people who have basically bad faith are going to try to strip out all the bad stuff. So I do think that's an issue. The flip side of this is that, and this is one of the reasons why I'm kind of philosophically so pro open source, is I do think that a concentration of AI in the future has the potential to be as dangerous as kind of it being widespread.

Mark Zuckerberg:
So I think a lot of people, they think about the questions of, okay, well, if we can do this stuff, is it bad for it to be out in the wild, like just kind of widely available? I think another version of this is like, okay, well, it's probably also pretty bad for one institution to have an AI that is way more powerful than everyone else's AI. Right. So I guess one security analogy that I think of is, it doesn't take AI to basically, okay, there's security holes in so many different things.

Mark Zuckerberg:
And if you could travel back in time a year or two years, right? It's like, that's not AI. It's like, let's say you just have like one year or two years more knowledge of the security holes. You can pretty much hack into like any system, right? So it's not that far-fetched to believe that a very intelligent AI would probably be able to identify some holes and basically be like a human who could potentially go back in time a year or two and compromise all these systems.

Mark Zuckerberg:
Okay, so how have we dealt with that as a society? Well, one big part is open source software that makes it so that when improvements are made to the software, it doesn't just kind of get stuck in one company's products, but it can kind of be broadly deployed to a lot of different systems, whether it's banks or hospitals or government stuff. And like, just everyone can kind of, like as the software gets hardened, which happens because more people can see it and more people can bang on it. And there are standards on how this stuff works. The world can get upgraded together pretty quickly.

Mark Zuckerberg:
And I think that a world where AI is very widely deployed in a way where it's gotten hardened progressively over time is one where all the different systems will be in check in a way that seems like it is fundamentally more healthy to me than one where this is more concentrated. So there are risks on all sides, but I think that that's one risk that I don't hear people talking about quite as much. I think like there's sort of the risk of like, okay, well, what if the AI system does something bad? I am more like, you know, I stay up at night more worrying, well, what if like, some actor that, whatever.

Mark Zuckerberg:
It's like from wherever you sit, there's gonna be some actor who you don't trust. If they're the ones who have like the super strong AI, whether it's some like, other government that is sort of like an opponent of our country or some company that you don't trust or whatever it is. Like, I think that that's potentially a much bigger risk.

Dwarkesh Patel:
As in they could like overthrow our government because they have a weapon that nobody else has?

Mark Zuckerberg:
Or just cause a lot of mayhem. I think the intuition is that this stuff ends up being pretty important and valuable for both economic and security and other things. I don't know. I just think, yeah, if someone who you don't trust or is an adversary of you gets something that is more powerful, then I think that that could be an issue. And I think probably the best way to mitigate that is to have good open source AI that basically becomes the standard and in a lot of ways can become the leader.

Mark Zuckerberg:
And in that way, it just ensures that it's a much more even and balanced playing field.

Dwarkesh Patel:
Yeah. That seems plausible to me. And if that works out, that would be the future I prefer. I guess I want to understand, like, mechanistically, if somebody was gonna cause mayhem with AI systems, how the fact that there are other open-source systems in the world prevents that. Like, the specific example of, like, somebody coming with a bioweapon, is it just that we'll do a bunch of like R&D in the rest of the world to like figure out vaccines really fast?

Dwarkesh Patel:
Like what's happening?

Mark Zuckerberg:
If you take like the computer security one that I was talking about, I think someone with a weaker AI trying to hack into a system that is like protected by a stronger AI will succeed less. Right. So I think that that's, I mean, that's like in terms of software,

Dwarkesh Patel:
how do you know everything in the world is like that? Like, what if bio weapons aren't like that?

Mark Zuckerberg:
No, I mean, I don't know that everything in the world is like that. I guess bioweapons are one of the areas where I think the people who are most worried about this stuff are focused. And I think that makes a lot of sense to think about. I think that there are certain mitigations, you can try to not train certain knowledge into the model, right? There's different things, but yeah, I mean, at some level, if you get a sufficiently bad actor and you don't have other AI that can sort of balance them and understand what's going on and what the threats are, then that could be a risk.

Mark Zuckerberg:
So I think that's one of the things that we need to watch out for.

Dwarkesh Patel:
Is there something you could see in the deployment of these systems where you observe, like you're training Llama 4 and it's like, lied to you because it thought you weren't noticing or something. And you're like, whoa, what's going on here? This is probably not likely with a Llama 4-type system, but is there something you can imagine like that where you'd be really concerned about deceptiveness and if billions of copies of things are out in the wild?

Mark Zuckerberg:
Yeah, I mean, I think that that's not necessarily, I mean, right now we see a lot of hallucinations, right? So I think it's more that. I think it's an interesting question how you would tell the difference between a hallucination and deception. But yeah, I mean, look, I think there's a lot of risks and things to think about. The flip side of all this is that I try to, in running our company, at least, balance what I think of as these longer term theoretical risks with what I actually think are quite real risks that exist today. So like when you talk about deception, the form of that that I worry about most is people using this to generate misinformation and then like pump that through, whether it's our networks or others.

Mark Zuckerberg:
So the way that we've basically combated a lot of this type of harmful content is by building AI systems that are smarter than the adversarial ones. And this kind of informs part of my theory on this, right? If you look at like the different types of harm that people do, or try to do through social networks, there are ones that are not very adversarial. So for example, like hate speech, I would say is not super adversarial in the sense that like, people aren't getting better at being racist, right? That's one where I think the AIs are generally just getting way more sophisticated faster than people are at those issues.

Mark Zuckerberg:
And we have issues both ways. It's like, people do bad things, whether they're trying to incite violence or something. But we also have a lot of false positives, right? So where we basically censor stuff that we shouldn't and I think understandably make a lot of people annoyed. So I think having an AI that just gets increasingly precise on that, that's going to be good over time.

Mark Zuckerberg:
But let me give you another example, which is like nation states trying to interfere in elections. That's an example where they absolutely have cutting-edge technology and absolutely get better each year. So we block some technique, they learn what we did, they come at us with a different technique, right? It's not like a person trying to, you know, I don't know, say mean things, right? It's like, they're basically, they have a goal, they're sophisticated, they have a lot of technology.

Mark Zuckerberg:
In those cases, I still think about the ability to kind of have our AI systems grow in sophistication at a faster rate than theirs do. It's an arms race, but I think we're at least currently winning that arms race. So I don't know. But this is like a lot of the stuff that I spend time thinking about, is like, okay, yes, it is possible that whether it's Llama 4 or Llama 5 or Llama 6, yeah, we need to think about like what behaviors we're observing. And it's not just us.

Mark Zuckerberg:
I think part of the reason why you make this open source is that there are a lot of other people who study this too. So yeah, we want to see what other people are observing, what we're observing, what we can mitigate, and then we'll make our assessment on whether we can make it open source. But I think for the foreseeable future, I'm optimistic we will be able to. And in the near term, I don't want to take our eye off the ball of what are actual bad things that people are trying to use the models for today, even if they're not existential, but they're like pretty bad kind of day-to-day harms that we're familiar with in running our services. That's actually a lot of what we have to, I think, spend our time on as well.

Dwarkesh Patel:
Yeah. Actually, I found the synthetic data thing really curious. I'm actually interested in why you don't think, like with current models, it makes sense why there might be an asymptote with just doing the synthetic data again and again. If they get smarter and you use the kind of techniques you talk about in the paper or the blog post that's coming out on the day this will be released, where it goes to the thought chain that is the most correct, why wouldn't this like lead to a loop that, of course it wouldn't be overnight, but over many months or years of training potentially with a smarter model, it gets smarter, makes better output, gets smarter and so forth?

Mark Zuckerberg:
Well, I think it could within the parameters of whatever the model architecture is. It's just that like at some level, I don't know, I think like today's 8 billion parameter models, I just don't think you're going to be able to get to be as good as the state-of-the-art, multi-hundred billion parameter models that are incorporating new research into the architecture itself.

Dwarkesh Patel:
But those will be open source as well, right?

Mark Zuckerberg:
Well, yeah, but I think that that's subject to all the questions that we just talked about. But yes, I mean, we would hope that that'll be the case. But I think that at each point, I don't know, it's like when you're building software, there's like a ton of stuff that you can do with software, but then at some level, you're constrained by the chips that it's running on. Right. So there are always going to be different physical constraints.

Mark Zuckerberg:
And it's like how big the models are is going to be constrained by how much energy you can get and use for inference. So I guess I'm simultaneously very optimistic that this stuff will continue to improve quickly and also a little more measured than I think some people are about... I just don't think the runaway case is a particularly likely one.

Dwarkesh Patel:
I think it makes sense to keep your options open. Like, there's so much we don't know. There's a case in which, like, it's really important to keep the balance of power so that nobody becomes, like, a totalitarian dictator. There's a case in which, like, you don't want to open source the architecture because, like, China can use it to catch up to America's AIs, and there is an intelligence explosion, and they win that. A lot of things seem possible. Just keeping your options open and considering all of them seems reasonable.

Dwarkesh Patel:
Yeah. Let's talk about some other things. Go for it. Okay. Metaverse, what time period in human history would you be most interested in going into?

Dwarkesh Patel:
100,000 BCE to now. You just want to see what it was like.

Mark Zuckerberg:
Well, it has to be the past.

Dwarkesh Patel:
Huh? It has to be the past? Oh, yeah, it has to be the past.

Mark Zuckerberg:
I don't know. I mean, I have the periods of time that I'm interested in. I mean, I'm really interested in American history and classical history. And I'm really interested in the history of science too.

Mark Zuckerberg:
So I actually think seeing and trying to understand more about how some of the big advances came about. I mean, all we have are somewhat limited writings about some of that stuff. I'm not sure the metaverse is going to let you do that because it's going to be hard to go back in time for things that we don't have records of. But I'm actually not sure that going back in time is going to be that important of a thing for them. I mean, I think it's going to be cool for history classes and stuff, but that's probably not the use case that I'm most excited about for the metaverse overall.

Mark Zuckerberg:
The main thing is just the ability to feel present with people no matter where you are. I think that's going to be killer. In the AI conversation that we're having, so much of it is about physical constraints that kind of underlie all of this, right? And I think one lesson of technology is you want to move things from the physical constraint realm into software as much as possible because software is so much easier to build and evolve. And like you can democratize it more because like not everyone is going to have a data center, but like a lot of people can kind of write code and take open source code and modify it.

Mark Zuckerberg:
The metaverse version of this is I think enabling realistic digital presence is going to be just an absolutely huge difference for making it so that people don't feel like they have to physically be together for as many things. Now, I mean, I think that there are going to be things that are better about being physically together. So these things aren't binary. It's not going to be like, okay, now you don't need to do that anymore. But overall, I mean, I think that it's just gonna be really powerful for socializing, for feeling connected with people, for working, for, I don't know, parts of industry, for medicine, for like so many things.

Dwarkesh Patel:
I want to go back to something you said at the beginning of the conversation where you didn't sell the company for a billion dollars, and like with the metaverse, you knew you were going to do this even though the market was hammering you for it. And then I'm actually curious, like what is the source of that edge? And you said, like, oh, values, I have this intuition, but, like, everybody says that, right? Like, if you had to say something that's specific to you, how would you express what that is?

Dwarkesh Patel:
Like, why were you so convinced about the metaverse?

Mark Zuckerberg:
Well, I think that those are different questions. So what are the things that kind of power me? I think we've talked about a bunch of things. So I just really like building things. I specifically like building things around how people communicate and sort of understanding how people express themselves and how people work.

Mark Zuckerberg:
Right. When I was in college, I studied computer science and psychology. I think a lot of other people in the industry studied computer science, right? So it's always been sort of the intersection of those two things for me. But I think it's also sort of this like really deep drive.

Mark Zuckerberg:
I don't know how to explain it, but I just feel like, constitutionally, I'm doing something wrong if I'm not building something new. Right. And so I think that there's like, you know, even when we're putting together the business case for investing like a hundred billion dollars in AI or some huge amount in the metaverse, it's like, yeah, we have plans that I think make it pretty clear that if our stuff works, it'll be a good investment. But like, you can't know for certain from the outset. And there's all these arguments that people have, you know, whether it's like, you know, with advisors or different folks, it's like, well, how could you, like, how are you confident enough to do this?

Mark Zuckerberg:
And it's like, well, the day I stop trying to build new things, I'm just done. I'm gonna go build new things somewhere else, right? It's like, I'm fundamentally incapable of running something, or in my own life, and like, not trying to build new things that I think are interesting. It's like, that's not even a question for me, right? It's like whether we're going to go take a swing at building the next thing.

Mark Zuckerberg:
I'm just incapable of not doing that. And I don't know. I'm kind of like this in all the different aspects of my life. Right. It's like our family built this ranch in Kauai and like, I just like worked to design all these buildings.

Mark Zuckerberg:
And then we started raising cattle and I'm like, all right, well, I want to make like the best cattle in the world. Right. So it's like, how do we architect this so that way we can figure this out and build all the stuff that we need to try to do that? So I don't know. That's me.

Mark Zuckerberg:
What was the other part of the question?

Dwarkesh Patel:
Look, Meta is just a really amazing tech company, right? They have all these great software engineers and even they work with Stripe to handle payments. And I think that's just a really notable fact that Stripe's ability to engineer these checkout experiences is so good that big companies like Ford, Zoom, Meta, even OpenAI, they work with Stripe to handle payments. Because just think about how many different possibilities you have to handle. If you're in a different country, you'll pay a different way.

Dwarkesh Patel:
And if you're buying a certain kind of item, that might affect how you decide to pay. And Stripe is able to test these fine grained optimizations across tens of billions of transactions a day to figure out what will convert people. And obviously conversion means more revenue for you. And look, I'm not a big company like Meta or anything, but I've been using Stripe since long before they were advertisers. Stripe Atlas was just the easiest way for me to set up an LLC.

Dwarkesh Patel:
And they have these payments and invoicing features that make it super convenient for me to get money from advertisers. And obviously without that, it would have been much harder for me to earn money from the podcast. And so it's been great for me. Go to stripe.com to learn more. Thanks to them for sponsoring the episode.

Dwarkesh Patel:
Now back to Mark. I'm not sure, but I'm actually curious about something else, which is so the 19 year old Mark reads a bunch of like antiquity and classics, high school, college. What important lesson did you learn from it? Not just interesting things you found, but like there aren't that many tokens you've consumed by the time you're 19. A bunch of them were about the classics.

Dwarkesh Patel:
Clearly that was important in some way.

Mark Zuckerberg:
Not that many tokens you've consumed. I don't know, that's a good question. I mean, one of the things that I thought was really fascinating is, so when Augustus first became emperor, he was trying to establish peace. And there was no real conception of peace at the time. Like the people's understanding of peace was, it is the temporary time between when your enemies will inevitably attack you again, so you get like a short rest.

Mark Zuckerberg:
And he had this view, which is like, look, we want to change the economy from being so mercenary and kind of militaristic to, like, actually this positive-sum thing. It was a very novel idea at the time. I don't know. I think that there's something that's just really fundamental about that. It's like in terms of the bounds on what people can conceive at the time of what are rational ways to work.

Mark Zuckerberg:
And I'm going back to like, and this applies to both the metaverse and the AI stuff, but a lot of investors and just different people can't wrap their head around why we would open source this. And it's like, I don't understand. It's like, open source, that must just be the temporary time between which you're making things proprietary. Right. But I actually think it's this very profound thing in tech that actually creates a lot of winners.

Mark Zuckerberg:
Right. And so I don't know, I don't want to strain the analogy too much, but I do think that there are a lot of times where there are models for building things that people often can't even wrap their head around, how that would be a valuable thing for people to go do, or a reasonable state of the world. I mean, I think that there's more reasonable things than people think.

Dwarkesh Patel:
That's super fascinating. Can I give you my answer to what I was thinking you might've gotten from it? This is probably totally off, but just how young some of these people are who have very important roles in the empire.

Dwarkesh Patel:
Like, Caesar Augustus, like, by the time he's 19, he's actually one of the most prominent people in Roman politics. And he's, like, leading battles and forming the Second Triumvirate. I wonder if you're like, the 19-year-old is like, I can actually do this, because Caesar Augustus did this.

Mark Zuckerberg:
I think that's an interesting example, both from a lot of history and American history. Yeah. One of my favorite quotes is this Picasso quote that all children are artists and the challenge is how do you remain an artist when you grow up? Basically, I think because when you're younger, I think it's just easier to have kind of wild ideas. And you're not, you know, you have no, there are all these analogies to the innovator's dilemma that exist in your life, as well as your company or whatever you've built.

Mark Zuckerberg:
Right. So, when you're kind of earlier on your trajectory, it's easier to pivot and take in new ideas without disrupting other commitments that you've made to different things. And so I don't know. I think that's an interesting part of running a company, is how do you stay dynamic?

Dwarkesh Patel:
Going back to the investors and open source, the $10 billion model: suppose it's totally safe, you've done these evaluations, and unlike in this case, the evaluators can also fine-tune the model, which hopefully will be the case in future models. Would you open source that, the $10 billion model?

Mark Zuckerberg:
Well, I mean, as long as it's helping us, then yeah.

Dwarkesh Patel:
But would you? Like, the $10 billion of R&D, and then now it's like open source for anybody?

Mark Zuckerberg:
Well, I think this is a question which we'll have to evaluate as time goes on too. But we have a long history of open sourcing software, right? We don't tend to open source our product, right? So we don't take like the code for Instagram and make it open source, but we take like a lot of the low level infrastructure and we make that open source. Right.

Mark Zuckerberg:
Probably the biggest one in our history was Open Compute Project, where we took the designs for kind of all of our servers, network switches, and data centers and made them open source, and it ended up being super helpful because, I mean, a lot of people can design servers, but now the industry standardized on our design, which meant that the supply chains basically all got built out around our design. The volumes went up, so it got cheaper for everyone and saved us billions of dollars. So awesome, right? Okay, so there's multiple ways where open source I think could be helpful for us. One is if people figure out how to run the models more cheaply, well, we're gonna be spending tens or like a hundred billion dollars or more over time on all this stuff.

Mark Zuckerberg:
So if we can do that 10% more effectively, we're saving billions or tens of billions of dollars. Okay, that's probably worth a lot by itself. Especially if there's other competitive models out there. It's not like our thing would be giving away some kind of crazy advantage.

Dwarkesh Patel:
So is your view that the training will be commodified?

Mark Zuckerberg:
I think there's a bunch of ways that this could play out. That's one. So commodity kind of implies that it's going to get very cheap because there's lots of options. The other direction that this could go in is qualitative improvements. So you mentioned fine-tuning, right?

Mark Zuckerberg:
It's like right now it's pretty limited what you can do with fine-tuning the other major models out there. And there are some options, but generally not for the biggest models. So I think being able to do that and be able to kind of do different app specific things or use case specific things or build them into specific tool chains, I think will not only enable kind of more efficient development, it could enable qualitatively different things. Here's one analogy on this. So one thing that I think generally sucks about the mobile ecosystem is that you have these two gatekeeper companies, Apple and Google, that can tell you what you're allowed to build.

Mark Zuckerberg:
And there are lots of times in our history, so there's the economic version of that, which is like, all right, we build something and they're just like, I'm going to take a bunch of your money. But then there's the qualitative version, which is actually what kind of upsets me more, which is there's a bunch of times when we've launched or wanted to launch features and then Apple's just like, nope, you're not launching that. It's like, that sucks, right? And so the question is, are we kind of set up for a world like that with AI, where you're going to get a handful of companies that run these closed models that are going to be in control of the APIs and therefore going to be able to tell you what you can build? Well, for one, I can say for us, it is worth it to go build a model ourselves to make sure that we're not in that position.

Mark Zuckerberg:
Like I don't want any of those other companies telling us what we can build. But from an open source perspective, I think a lot of developers don't want those companies telling them what they can build either. So the question is, what is the ecosystem that gets built out around that? What are interesting new things? How much does that improve our products?

Mark Zuckerberg:
I think that there's a lot of cases where if this ends up being like our databases or caching systems or architecture, we'll get valuable contributions from the community that'll make our stuff better. And then our app specific work that we do will still be so differentiated that it won't really matter. It's like, we'll be able to do what we do, we'll benefit, and all the systems, ours and the community's, will be better because it's open source. There is one world where maybe it's not that. I mean, maybe the model just ends up being more of the product itself.

Mark Zuckerberg:
In that case, then I think it's a trickier economic calculation about whether you open source that because then you are kind of commoditizing yourself a lot. But from what I can see so far, it doesn't seem like we're in that zone.

Dwarkesh Patel:
Do you expect to earn significant revenue from licensing your model to the cloud providers so they have to pay you a fee to actually serve the model?

Mark Zuckerberg:
We want to have an arrangement like that, but I don't know how significant it'll be. And we have this, this is basically our license for Llama. In a lot of ways, it's like a very permissive open source license, except that we have a limit for the largest companies using it. And this is why we put that limit in, is we're not trying to prevent them from using it. We just want them to come talk to us because if they're gonna just basically take what we built and resell it and make money off of it, then it's like, okay, well, if you're like, you know, Microsoft Azure or Amazon, then yeah, if you're going to be reselling the model, then we should have some revenue share on that.

Mark Zuckerberg:
So just come talk to us before you go do that. And that's how that's played out. So for Llama 2, I mean, we basically just have deals with all these major cloud companies and Llama 2 is available as a hosted service on all those clouds. And I assume that as we release bigger and bigger models, that'll become a bigger thing. It's not the main thing that we're doing, but I just think if others are...

Mark Zuckerberg:
If those companies are going to be selling our models, it makes sense that we should share the upside of that somehow.

Dwarkesh Patel:
Yeah. With regards to the other open source dangers, I think you have a genuine legitimate point about the balance of power stuff and potentially like the harms you can get rid of because we have better alignment techniques or something. I wish there was some sort of framework that Meta had, like other labs have this where they say like, if we see this concrete thing, then that's a no go on the open source or like even potential deployment. Just like writing it down so that the company is ready for it, people have expectations around it, and so forth.

Mark Zuckerberg:
Yeah. No, I think that that's a fair point on the existential risk side. Right now, we focus more on the types of risks that we see today, which are more of these content risks. So, you know, we have lines on, we don't want the model to be basically doing things that are helping people commit violence or fraud, or, you know, just harming people in different ways. So in practice for today's models, and I would guess the next generation, and maybe even the generation after that, I think while it is somewhat more maybe intellectually interesting to talk about the existential risks, I actually think the real harms that need more energy being mitigated are things that are going to have someone take a model and do something to hurt a person with today's parameters and the types of more mundane harms that we see today, like people committing fraud against each other, things like that.

Mark Zuckerberg:
So I just don't want to shortchange that. I think we have a responsibility to make sure we do a good job on that.

Dwarkesh Patel:
Yeah, Meta's a big company, you can handle both.

Mark Zuckerberg:
Yeah.

Dwarkesh Patel:
Okay, so as far as the open source goes, I'm actually curious if you think the impact of the open source from PyTorch, React, Open Compute, these things, has been bigger for the world than even the social media aspects of Meta. Because I've talked to people who use these services and think it's plausible. Because a big part of the internet runs on these things.

Mark Zuckerberg:
It's an interesting question. I think almost half the world uses our...

Dwarkesh Patel:
Yeah, that's a true point.

Mark Zuckerberg:
So I think it's hard to beat that. But now I think open source is really powerful as a new way of building things. And yeah, I mean, it's possible. I mean, it's maybe one of these things where, I don't know, like Bell Labs, right? Where they, you know, it's like they were working on the transistor because they wanted to enable long distance calling.

Mark Zuckerberg:
And they did. And it ended up being really profitable for them that they were able to enable long distance calling. And if you ask them 5 to 10 years out from that, what was the most useful thing that they invented? It's like, okay, well, we enabled long distance calling.

Mark Zuckerberg:
And now all these people are long distance calling. But if you ask 100 years later, maybe it's a different question. So I think that that's true of a lot of the things that we're building, right? Reality Labs, some of the AI stuff, some of the open source stuff. I think it's like the specific products evolve and to some degree come and go.

Mark Zuckerberg:
But I think the advances for humanity persist. And that's a cool part of what we all get to do.

Dwarkesh Patel:
By when will the Llama models be trained on your own custom silicon?

Mark Zuckerberg:
Soon, not Llama 4. The approach that we took is first, we basically built custom silicon that could handle inference for our ranking and recommendation type stuff. So Reels, News Feed, ads, and that was consuming a lot of GPUs. But when we were able to move that to our own silicon, we now were able to use the more expensive Nvidia GPUs only for training. So at some point, we will hopefully have silicon ourselves that we can be using for probably first training some of the simpler things, then eventually training these like really large models.

Mark Zuckerberg:
But in the meantime, I'd say the program is going quite well, and we're just rolling it out methodically and have a long-term roadmap for it.

Dwarkesh Patel:
Final question. This is totally out of left field, but if you were made CEO of Google+, could you have made it work?

Mark Zuckerberg:
Google+, oof. Well, I don't know. I don't know. That's a very difficult, very difficult counterfactual.

Dwarkesh Patel:
Okay, then the real final question will be when Gemini was launched, was there any chance that somebody in the office uttered Carthago delenda est?

Mark Zuckerberg:
No, I think we're tamer now.

Dwarkesh Patel:
Cool, cool. I was in my...

Mark Zuckerberg:
Yeah, I don't know, it's a good question. I don't know. The problem is there was no CEO of Google+. It was just like a division within a company. I think it's like, and you asked before about what are the kind of scarcest commodities, but you asked about it in terms of dollars.

Mark Zuckerberg:
And I actually think for most companies of this scale, at least, it's focus, right? It's like when you're a startup, maybe you're more constrained on capital. You know, you just are working on one idea and you might not have all the resources. I think you cross some threshold at some point where the nature of what you're doing, you're building multiple things and you're creating more value across them, but you become more constrained on what you can direct to go well. And like, there's always the cases where something just random and awesome happens in the organization.

Mark Zuckerberg:
I don't even know about it. And that's great. But I think in general, the organization's capacity is largely limited by what the CEO and the management team are able to oversee and manage. I think that that's just been a big focus for us. It's like, all right, as I guess Ben Horowitz says, keep the main thing the main thing.

Mark Zuckerberg:
Right. And, and try to kind of stay focused on your key priorities.

Dwarkesh Patel:
Yeah. All right. Awesome. That was excellent, Mark. Thanks so much. That was a lot of fun.

Mark Zuckerberg:
Yeah, really fun. Thanks for having me.

Dwarkesh Patel:
Yep, absolutely. Hey everybody. I hope you enjoyed that episode with Mark. As you can see, I'm now doing ads. So if you're interested in advertising on the podcast, go to the link in the description.

Dwarkesh Patel:
Otherwise, as you know, the most helpful thing you can do is just share the podcast with people who you think might enjoy it. You know, your friends, group chats, Twitter, I guess Threads. Yeah, hope you enjoyed and I'll see you on the next one.